(uORFs), secondary structure and the sequence context around the translation initiation site. A common start site is defined within the Kozak consensus sequence, (GCC)GCCACCAUGG in vertebrates. The biological impact of the bracketed (GCC) motif is unknown. The Kozak consensus sequence varies; for example, either G or A is observed three nucleotides upstream of the AUG (position -3). Bases between positions -3 and +4 of the Kozak sequence have the most significant impact on translational efficiency. Hence, the sequence (A/G)NNAUGG is defined as a strong Kozak signal in the CCDS project.
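The strong Kozak definition above can be illustrated with a minimal Python sketch. It assumes the only criterion is the (A/G)NNAUGG pattern quoted in the text (a purine at -3 and a G at +4); the function name `has_strong_kozak` and its arguments are hypothetical and are not part of any CCDS tool.

```python
import re

# Strong Kozak signal as quoted in the text: (A/G)NNAUGG,
# i.e. A or G at position -3 and G at position +4 (A of AUG is +1).
STRONG_KOZAK = re.compile(r"[AG][ACGU]{2}AUGG")


def has_strong_kozak(mrna: str, aug_index: int) -> bool:
    """Return True if the AUG starting at aug_index (0-based) lies in an
    (A/G)NNAUGG context. DNA input is accepted and converted to RNA."""
    seq = mrna.upper().replace("T", "U")
    # Need three upstream bases and an AUG at the given position.
    if aug_index < 3 or seq[aug_index:aug_index + 3] != "AUG":
        return False
    # Window covers positions -3 .. +4 (seven nucleotides).
    window = seq[aug_index - 3:aug_index + 4]
    return STRONG_KOZAK.fullmatch(window) is not None


if __name__ == "__main__":
    # Vertebrate consensus from the text, without the bracketed motif.
    print(has_strong_kozak("GCCACCAUGG", 6))
```

A start codon too close to either end of the sequence is reported as not strong, because the full -3 to +4 window cannot be inspected.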
, which provides FTP download links and a query interface to acquire information about CCDS sequences and locations. CCDS reports can be obtained by using the query interface, which is located at the top of the CCDS data set page. Users can select various types of identifiers such as CCDS ID, gene ID, gene symbol, nucleotide ID and protein ID to search for specific CCDS information. The CCDS reports (Figure 1) are presented in a table format, providing links to specific resources, such as a history report,
1428:, Searle S, Farrell CM, Loveland JE, Ruef BJ, Hart E, Suner MM, Landrum MJ, Aken B, Ayling S, Baertsch R, Fernandez-Banet J, Cherry JL, Curwen V, Dicuccio M, Kellis M, Lee J, Lin MF, Schuster M, Shkeda A, Amid C, Brown G, Dukhanina O, Frankish A, Hart J, Maidak BL, Mudge J, Murphy MR, Murphy T, Rajan J, Rajput B, Riddick LD, Snow C, Steward C, Webb D, Weber JA, Wilming L, Wu W, Birney E, Haussler D, Hubbard T, Ostell J, Durbin R, Lipman D (2009).
corresponding protein molecules remain unknown. However, the definition of a read-through gene in the CCDS data set is that the individual partner genes must be distinct, and the read-through transcripts must share ≥ 1 exon (or ≥ 2 splice sites except in the case of a shared terminal exon) with each of the distinct shorter loci. Transcripts are not considered to be read-through transcripts in the following circumstances:
. The chromosome location table includes the genomic coordinates for each exon of the specific coding sequence. This table also links to several genome browsers, which allow users to visualise the structure of the coding region. The exact nucleotide and protein sequences of the specific coding sequence are displayed in the CCDS sequence data section.
The CCDS database operates an internal website that serves multiple purposes including curator communication, collaborator voting, providing special reports and tracking the status of CCDS representations. When a collaborating CCDS group member identifies a CCDS ID that may need review, a voting process is employed to decide on the final outcome.
Harrow, J.; Frankish, A.; Gonzalez, J. M.; Tapanari, E.; Diekhans, M.; Kokocinski, F.; Aken, B. L.; Barrell, D.; Zadissa, A.; Searle, S.; Barnes, I.; Bignell, A.; Boychenko, V.; Hunt, T.; Kay, M.; Mukherjee, G.; Rajan, J.; Despacio-Reyes, G.; Saunders, G.; Steward, C.; Harte, R.; Lin, M.; Howald, C.;
The CCDS database is unique in that the review process must be carried out by multiple collaborators, and agreement must be reached before any changes can be made. This is made possible with a collaborator coordination system that includes a work process flow and forums for analysis and discussion.
The CCDS set will become more complete as the independent curation groups agree on cases where they initially differ, as additional experimental validation of weakly supported genes occurs, and as automatic annotation methods continue to improve. Communication among the CCDS collaborating groups is
and HAVANA annotation guidelines and thus, new annotations provided by both groups are more likely to be concordant and result in addition of a CCDS ID. These standards address specific problem areas, are not a comprehensive set of annotation guidelines, and do not restrict the annotation policies
is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies. The CCDS project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures
must be annotated except when there is experimental evidence that an internal start site is used to initiate translation. Additionally, other types of new data, such as ribosome profiling data, can be used to identify start codons. The CCDS data set records one translation initiation site per CCDS
Farrell, CM; O'Leary, NA; Harte, RA; Loveland, JE; Wilming, LG; Wallin, C; Diekhans, M; Barrell, D; Searle, SM; Aken, B; Hiatt, SM; Frankish, A; Suner, MM; Rajput, B; Steward, CA; Brown, GR; Bennett, R; Murphy, M; Wu, W; Kay, MP; Hart, J; Rajan, J; Weber, J; Snow, C; Riddick, LD; Hunt, T; Webb, D;
Coordinated manual curation is supported by a restricted-access website and a discussion e-mail list. CCDS curation guidelines were established to address specific conflicts that were observed at a higher frequency. Establishment of CCDS curation guidelines has helped to make the CCDS curation
Biological and biomedical research has come to rely on accurate and consistent annotation of genes and their products on genome assemblies. Reference annotations of genomes are available from various sources, each with their own independent goals and policies, which results in some annotation
In order to ensure that CDSs are of high quality, multiple quality assurance (QA) tests are performed (Table 1). All tests are performed following the annotation comparison step of each CCDS build and are independent of individual annotation group QA tests performed prior to the annotation
or co-transcribed genes. Read-through transcripts are defined as transcripts combining at least part of one exon from each of two or more distinct known (partner) genes which lie on the same chromosome in the same orientation. The biological function of read-through transcripts and their
Tanzer, A.; Derrien, T.; Chrast, J.; Walters, N.; Balasubramanian, S.; Pei, B.; Tress, M.; Rodriguez, J. M.; Ezkurdia, I.; van Baren, J.; Brent, M.; Haussler, D.; Kellis, M.; Valencia, A.; Reymond, A.; Gerstein, M.; Guigo, R.; Hubbard, T. J. (5 September 2012).
"Consensus" is defined as protein-coding regions that agree at the start codon, stop codon, and splice junctions, and for which the prediction meets quality assurance benchmarks. A combination of manual and automated genome annotations provided by
that have the same CCDS ID. It is also anticipated that as more complete and high-quality genome sequence data become available for other organisms, annotations from these organisms may be in scope for CCDS representation.
The CCDS data set size has continued to increase with both the computational genome annotation updates, which integrate new data sets submitted to the
International Nucleotide Sequence Database Collaboration
assemblies by the participating annotation groups. The CCDS gene sets that have been arrived at by consensus of the different partners now consist of over 18,000 human and over 20,000 mouse genes (see
Thomas, M; Tamez, P; Rangwala, SH; McGarvey, KM; Pujar, S; Shkeda, A; Mudge, JM; Gonzalez, JM; Gilbert, JG; Trevanion, SJ; Baertsch, R; Harrow, JL; Hubbard, T; Ostell, JM; Haussler, D; Pruitt, KD (2014).
Prakash, Tulika; Sharma, Vineet K.; Adati, Naoki; Ozawa, Ritsuko; Kumar, Naveen; Nishida, Yuichiro; Fujikake, Takayoshi; Takeda, Tadayuki; Taylor, Todd D.; Michalak, Pawel (12 October 2010).
process more efficient by reducing the number of conflicting votes and time spent in discussion to reach a consensus agreement. A link to the CCDS curation guidelines can be found
. Once these quality problems are identified, the CCDS collaborators report the issues to the Genome Reference Consortium, which investigates and makes the necessary corrections.
ongoing and will resolve differences and identify refinements between CCDS update cycles. Human updates are expected to occur roughly every 6 months and mouse releases yearly.
are another challenge for the CCDS data set. The scanning mechanism for translation initiation suggests that small ribosomal subunits (40S) bind at the 5’ end of a nascent
According to the scanning mechanism, the small ribosomal subunit can initiate translation from the first reached start codon. There are exceptions to the scanning model:
transcript and scan for the first AUG start codon. It is possible that a uAUG is recognised first, and the corresponding uORF is then translated. The translated u
gene annotation project and it is used as a standard for high-quality coding exon definition in various research fields, including clinical studies, large-scale
Annotations that fail QA tests undergo a round of manual checking that may improve results or reach a decision to reject annotation matches based on QA failure.
Harte, RA; Farrell, CM; Loveland, JE; Suner, MM; Wilming, L; Aken, B; Barrell, D; Frankish, A; Wallin, C; Searle, S; Diekhans, M; Harrow, J; Pruitt, KD (2012).
sequences become another challenge. Quality problems occur when the reference genome is misassembled; a misassembled genome may contain premature
1796:"The canonical UPF1-dependent nonsense-mediated mRNA decay is inhibited in transcripts carrying a short open reading frame independent of sequence context"
Conflicting opinions are addressed by consulting with scientific experts or other annotation curation groups such as the HUGO Gene
Nomenclature Committee
The CCDS project was established to identify a gold standard set of protein-coding gene annotations that are identically annotated on the human and mouse
776:), and on ongoing curation activities that supplement or improve upon that annotation. Table 2 summarises the key statistics for each CCDS build where
of any collaborating group. Examples include standardized curation guidelines for selection of the initiation codon and interpretation of upstream
candidate. The CCDS collaborators use a conservative method, based on the EJC model, to screen mRNA transcripts. Any transcripts determined to be
when transcripts are translated from genes that have nested structures relative to each other. In this instance, the CCDS collaborators and the
(EJC) model. In this model, if the stop codon is >50 nt upstream of the last exon-exon junction, the transcript is assumed to be a
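The 50-nucleotide rule of the EJC model can be sketched in Python as follows. This is only an illustration under stated assumptions: the transcript is described by its exon lengths in 5' to 3' order, positions are in spliced-transcript coordinates, and the function name `is_nmd_candidate` and its parameters are hypothetical rather than taken from the CCDS pipeline.

```python
from typing import Sequence


def is_nmd_candidate(exon_lengths: Sequence[int],
                     stop_codon_end: int,
                     threshold_nt: int = 50) -> bool:
    """Apply the >50 nt rule from the EJC model.

    exon_lengths   -- exon lengths in transcript (5' to 3') order
    stop_codon_end -- number of transcript nucleotides up to and including
                      the last base of the stop codon
    threshold_nt   -- distance cut-off (50 nt in the text)
    """
    # A single-exon transcript has no exon-exon junction to measure against.
    if len(exon_lengths) < 2:
        return False
    # The last junction lies where the final exon begins.
    last_junction = sum(exon_lengths[:-1])
    # Positive distance means the stop codon is upstream of the junction.
    return last_junction - stop_codon_end > threshold_nt


if __name__ == "__main__":
    # Three exons; the last junction is at transcript position 300.
    print(is_nmd_candidate([100, 200, 150], stop_codon_end=240))
```

A stop codon located in the last exon gives a negative distance and is never flagged, which matches the rule as stated in the text.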
AUG initiation codons located within transcript leaders are known as upstream AUGs (uAUGs). Sometimes, uAUGs are associated with u
673:
As the CCDS data set is built to represent genomic annotations of human and mouse, the quality problems with the human and mouse
409:. If a conflict cannot be resolved, then collaborators agree to withdraw the CCDS ID until more information becomes available.
Long-term goals include the addition of attributes that indicate where transcript annotation is also identical (including the
398:. Curation occurs continuously, and any of the collaborating centers can flag a CCDS ID as a potential update or withdrawal.
752:
1430:"The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes"
2073:
1733:
(which incorporates manual HAVANA annotations) are compared to identify annotations with matching genomic coordinates.
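The comparison step can be pictured with a small Python sketch. It assumes each annotation is reduced to its chromosome, strand and ordered coding intervals, so that two annotations match only when the start codon, stop codon and every splice junction coincide. The data layout, the function names and the accessions in the usage example are hypothetical and do not describe the actual CCDS build software.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[int, int]
CdsKey = Tuple[str, str, Tuple[Interval, ...]]


def cds_key(chrom: str, strand: str, coding_exons: Iterable[Interval]) -> CdsKey:
    """Build a hashable key from the coding intervals of one annotation."""
    return (chrom, strand, tuple(sorted(coding_exons)))


def matching_cds(set_a: Dict[str, CdsKey],
                 set_b: Dict[str, CdsKey]) -> List[Tuple[str, str]]:
    """Return (accession_a, accession_b) pairs with identical CDS coordinates."""
    # Index the second set by coordinates so lookups are constant time.
    by_key: Dict[CdsKey, List[str]] = {}
    for accession, key in set_b.items():
        by_key.setdefault(key, []).append(accession)
    pairs = []
    for accession, key in set_a.items():
        for other in by_key.get(key, []):
            pairs.append((accession, other))
    return sorted(pairs)


if __name__ == "__main__":
    # Hypothetical accessions and coordinates, for illustration only.
    refseq = {"NM_A": cds_key("chrX", "-", [(100, 200), (300, 450)])}
    ensembl = {"ENST_B": cds_key("chrX", "-", [(100, 200), (300, 450)]),
               "ENST_C": cds_key("chrX", "-", [(100, 200), (310, 450)])}
    print(matching_cds(refseq, ensembl))
```

Keying on the full tuple of coding intervals means a single shifted splice site, as in the second hypothetical transcript, is enough to prevent a match.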
2083:
780:
are all those that were not under review or pending an update or withdrawal at the time of the current release date.
747:
projects and exon array design. Due to the consensus annotation of CCDS exons by the independent annotation groups,
601:
transcript before it reaches the protein-coding regions. Currently, no studies have reported the global impact of u
507:
when the initiation site is not surrounded by a strong Kozak signal, which results in leaky scanning. Thereby, the
1734:"Upstream open reading frames cause widespread reduction of protein expression and are polymorphic among humans"
2004:
Parla, Jennifer S; Iossifov, Ivan; Grabill, Ian; Spector, Mona S; Kramer, Melissa; McCombie, W Richard (2011).
projects in particular have regarded CCDS coding exons as reliable targets for downstream studies (e.g., for
is translated, the truncated protein may cause disease. Different mechanisms have been proposed to explain
444:
1794:
Silva, AL; Pereira, FJC; Morgado, A; Kong, J; Martins, R; Faustino, P; Liebhaber, SA; Romao, L (2006).
535:
ID. Any alternative start sites may be used for translation and will be stated in a CCDS public note.
Table 1: Examples of the types of CCDS QA tests performed prior to acceptance of CCDS candidates
705:
or re-query the CCDS data set. The sequence identifiers table presents transcript information in
Checks if the protein encoded by the NCBI RefSeq is the same length as the EBI/WTSI protein
159:
126:
that they are consistently represented by the
National Center for Biotechnology Information
100:
714:
306:
Checks for transcripts or proteins that are unusually short, typically <100 amino acids
1338:
The complete set of release statistics can be found at the official CCDS website on their
42:
725:
Figure 1. The CCDS data set screenshot showing the report for Itm2a protein (CCDS 30349).
466:
there is experimental evidence suggesting that a functional protein is produced from the
before it can be translated into protein. This is important because if the defective
432:
424:
69:
1845:"Expression of Conjoined Genes: Another Mechanism for Gene Regulation in Eukaryotes"
1377:
455:
candidates are excluded from the CCDS data set except in the following situations:
Curation policies established for the CCDS data set have been integrated into the
1687:
490:
group and the HAVANA project have subsequently revised their annotation policies.
1869:
1494:"Tracking and coordinating an international curation effort for the CCDS project"
549:
are found in approximately 50% of human and mouse transcripts. The existence of u
740:
666:
have agreed that the read-through transcript be represented as a separate locus.
478:
candidate transcripts were considered to be protein coding transcripts by both
314:
Checks for genes that are not conserved and/or are not in a HomoloGene cluster
1676:"Genome-wide Annotation and Quantitation of Translation by Ribosome Profiling"
candidates; however, the locus was previously known to be a protein-coding region;
1761:
1627:"Pushing the limits of the scanning mechanism for initiation of translation"
274:
Checks for transcripts that may be subject to nonsense-mediated decay (NMD)
1555:"Current status and new features of the Consensus Coding Sequence database"
1527:
1463:
346:
Checks for >99% overall identity between the NCBI and EBI/WTSI proteins
1972:
1922:
1570:
1445:
720:
330:
Checks for the presence of an internal stop codon in the genomic sequence
1605:
Alberts, B; Johnson, A; Lewis, J; Raff, M; Roberts, K; Walter, P (2002).
594:
590:
519:
508:
496:
Multiple factors contribute to translation initiation, such as upstream
1957:"GENCODE: The reference human genome annotation for The ENCODE Project"
1811:
1392:
1372:
736:
710:
242:
131:
1674:
Ingolia, NT; Brar, GA; Rouskin, S; McGeachy, AM; Weissman, JS (2014).
511:
skips this AUG and initiates translation from a downstream start site;
1387:
702:
698:
487:
479:
386:
219:
138:. The integrity of the CCDS dataset is maintained through stringent
760:
748:
744:
719:
322:
Checks for a start or stop codon in the reference genome sequence
486:
candidate transcripts were represented in the CCDS data set. The
697:
The CCDS project is available from the NCBI CCDS data set page
1905:; Ostell, J.; Pruitt, K. D.; Tatusova, T. (28 November 2010).
608:
The current CCDS annotation guidelines allow the inclusion of
298:
Checks for genes that are predicted to be pseudogenes by UCSC
459:
all transcripts at one particular locus are assessed to be
589:
inhibit translation of the downstream gene by trapping a
530:
According to the CCDS annotation guidelines, the longest
225:
Human and
Vertebrate Analysis and Annotation (HAVANA) at
101:
https://www.ncbi.nlm.nih.gov/projects/CCDS/CcdsBrowse.cgi
773:
616:
if they meet the following two biological requirements:
394:
and transcripts that are predicted to be candidates for
167:). The CCDS dataset is increasingly representing more
32:
Convergence towards a standard set of gene annotations
784:
Table 2. Summary statistics for past CCDS releases.
565:candidate, although studies have shown that some u
1354:) and to indicate splice variants with different
1907:"Entrez Gene: gene-centered information at NCBI"
755:detection), and these exons have been used as
183:National Center for Biotechnology Information
139:
1732:Calvo, SE; Pagliarni, DJ; Mootha, VK (2009).
413:Curation challenges and annotation guidelines
48:National Center for Biotechnology Information
8:
1424:, Harrow J, Harte RA, Wallin C, Diekhans M,
735:The CCDS dataset is an integral part of the
18:
646:Read-through transcripts are also known as
522:to re-initiate translation at a downstream
494:Multiple in-frame translation start sites:
17:
2079:Genetic engineering in the United Kingdom
2031:
2021:
2006:"A comparative analysis of exome capture"
179:Participating annotation groups include:
164:
2089:Science and technology in Cambridgeshire
782:
354:Checks if the GeneID is no longer valid
256:
123:Consensus Coding Sequence (CCDS) Project
290:Checks for non-canonical splice sites
1607:Molecular Biology of the Cell 5th edn
671:Quality of reference genome sequence:
624:transcript has a strong Kozak signal;
335:NCBI:Ensembl protein length different
7:
143:
56:University of California, Santa Cruz
659:but do not share same splice sites;
655:when transcripts are produced from
593:initiation complex and causing the
585:. It also has been suggested that u
759:targets in commercially available
319:CDS start or stop not in alignment
214:Manual annotation is provided by:
189:European Bioinformatics Institute
14:
343:NCBI:Ensembl low percent identity
282:Checks for low coding propensity
201:HUGO Gene Nomenclature Committee
52:European Bioinformatics Institute
311:Ortholog not found/not conserved
195:Wellcome Trust Sanger Institute
482:and HAVANA, and thereby, these
60:Wellcome Trust Sanger Institute
573:. The average size limit for u
418:Nonsense-mediated decay (NMD):
171:events with each new release.
1:
2094:South Cambridgeshire District
1688:10.1002/0471142727.mb0418s103
1682:. Chapter 4: 4.18.1–4.18.19.
1643:10.1016/S0378-1119(02)01056-9
635:or overlaps with the primary
605:on translational regulation.
539:Upstream open reading frames:
405:and Mouse Genome Informatics
1870:10.1371/journal.pone.0013284
1741:Proc. Natl. Acad. Sci. U.S.A
1609:. New York: Garland Science.
2115:
631:transcript is either ≥ 35
287:Non-consensus splice sites
233:Defining the CCDS gene set
1340:Releases & Statistics
753:single nucleotide variant
644:Read-through transcripts:
249:Quality assurance testing
207:Mouse Genome Informatics
150:Motivation and background
140:quality assurance testing
2023:10.1186/gb-2011-12-9-r97
1383:Mouse Genome Informatics
685:, or likely polymorphic
612:transcripts containing u
1762:10.1073/pnas.0810916106
1680:Curr. Protoc. Mol. Biol
1510:10.1093/database/bas008
597:to dissociate from the
396:nonsense-mediated decay
79:Pruitt KD, et al (2009)
727:
427:surveillance process.
1973:10.1101/gr.135350.111
1917:(Database): D52–D57.
1446:10.1101/gr.080531.108
804:Current release date
798:Public CCDS ID count
723:
470:candidate transcript.
445:exon junction complex
431:eliminates defective
423:is the most powerful
2074:Biological databases
767:CCDS release history
731:Current applications
581:is approximately 35
295:Predicted pseudogene
266:Purpose of the test
218:Reference Sequence (
169:alternative splicing
165:CCDS release history
1923:10.1093/nar/gkq1237
1861:2010PLoSO...513284P
1753:2009PNAS..106.7507C
1571:10.1093/nar/gkt1059
785:
693:Access to CCDS data
498:open reading frames
259:
175:Contributing groups
136:UCSC Genome Browser
20:
2084:Genetics databases
1812:10.1261/rna.201406
783:
728:
683:frame-shift indels
637:open reading frame
257:
1911:Nucleic Acids Res
1625:Kozak, M (2002).
1565:(D1): D865–D872.
1559:Nucleic Acids Res
1336:
1335:
657:overlapping genes
577:that will escape
358:
357:
351:Gene discontinued
119:
118:
1967:(9): 1760–1774.
1346:Future prospects
786:
675:reference genome
443:; one being the
260:
160:reference genome
76:Primary citation
21:
1806:(12): 2160–70.
1793:
1792:
1788:
1747:(18): 7507–12.
778:Public CCDS IDs
769:
733:
695:
648:conjoined genes
514:when a shorter
415:
375:
373:Manual curation
366:
251:
235:
177:
152:
144:manual curation
115:CCDS Release 24
58:
54:
50:
43:Research center
12:
11:
5:
2112:
2110:
2102:
2101:
2099:Wellcome Trust
2096:
2091:
2086:
2081:
2076:
2066:
2065:
2062:
2061:
2059:CCDS home page
2054:
2053:External links
2051:
2048:
2047:
1996:
1946:
1894:
1855:(10): e13284.
1835:
1786:
1721:
1696:
1666:
1612:
1594:
1533:
1469:
1440:(7): 1316–23.
1222:July 30, 2015
801:Gene ID count
799:
796:
795:Assembly name
518:can allow the
364:Review process
271:Subject to NMD
1697:9780471142720
1637:(1–2): 1–34.
1332:Oct 26, 2022
1331:
1328:
1325:
1322:
1320:
1317:
1314:
1313:
1310:Oct 24, 2019
1309:
1306:
1303:
1300:
1298:
1295:
1292:
1291:
1288:Jun 14, 2018
1200:May 12, 2015
1199:
1196:
1193:
1190:
1188:
1185:
1182:
1181:
1178:Sep 10, 2014
1177:
1174:
1171:
1168:
1166:
1163:
1160:
1159:
1156:Sep 10, 2014
1155:
1152:
1149:
1146:
1144:
1141:
1138:
1137:
1133:
1130:
1127:
1124:
1122:
1119:
1116:
1115:
1112:Nov 29, 2013
1111:
1108:
1105:
1102:
1100:
1097:
1094:
1093:
1089:
1086:
1083:
1080:
1078:
1075:
1072:
1071:
1068:Oct 24, 2013
1067:
1064:
1061:
1058:
1056:
1053:
1050:
1049:
1046:Apr 29, 2013
1045:
1042:
1039:
1036:
1034:
1031:
1028:
1027:
1023:
1020:
1017:
1014:
1012:
1009:
1006:
1005:
1002:Oct 25, 2012
1001:
998:
995:
992:
990:
987:
984:
983:
979:
976:
973:
970:
968:
965:
962:
961:
958:Aug 14, 2012
957:
954:
951:
948:
946:
943:
940:
939:
936:Apr 20, 2011
935:
932:
929:
926:
924:
921:
918:
917:
913:
910:
907:
904:
902:
899:
896:
895:
892:Jan 24, 2011
891:
888:
885:
882:
880:
877:
874:
873:
869:
866:
863:
860:
858:
855:
852:
851:
848:Nov 28, 2007
847:
844:
841:
838:
836:
833:
830:
829:
826:Mar 14, 2007
825:
822:
819:
816:
814:
811:
808:
807:
787:
781:
779:
775:
766:
764:
762:
758:
757:coding region
327:Internal stop
142:and on-going
141:
137:
133:
129:
124:
114:
110:
107:Miscellaneous
105:
102:
99:
95:
90:
86:
82:
78:
74:
71:
70:Kim D. Pruitt
1378:Human Genome
1361:
1349:
1337:
1319:Homo sapiens
1318:
1297:Mus musculus
1296:
1275:Homo sapiens
1274:
1266:Dec 8, 2016
1253:Mus musculus
1252:
1244:Sep 8, 2016
1231:Homo sapiens
1230:
1209:Mus musculus
1208:
1187:Homo sapiens
1186:
1165:Homo sapiens
1164:
1143:Mus musculus
1142:
1134:Aug 7, 2014
1121:Homo sapiens
1120:
1099:Homo sapiens
1098:
1090:Apr 7, 2014
1077:Mus musculus
1076:
1055:Homo sapiens
1054:
1033:Homo sapiens
1032:
1024:Aug 5, 2013
1011:Mus musculus
1010:
989:Homo sapiens
988:
980:Sep 6, 2011
967:Homo sapiens
966:
945:Mus musculus
944:
923:Homo sapiens
922:
914:Sep 2, 2009
901:Homo sapiens
900:
879:Mus musculus
878:
870:May 1, 2008
857:Homo sapiens
856:
835:Mus musculus
834:
813:Homo sapiens
812:
777:
770:
734:
724:
696:
670:
669:
643:
642:
607:
538:
537:
529:
502:
493:
492:
474:Previously,
473:
417:
416:
400:
384:
376:
367:
359:
254:comparison.
252:
236:
213:
178:
157:
153:
122:
120:
84:Release date
19:CCDS Project
15:
2010:Genome Biol
1903:Maglott, D.
1323:GRCh38.p14
1279:GRCh38.p12
703:Entrez Gene
687:pseudogenes
679:stop codons
633:amino acids
583:amino acids
561:could be a
279:Low quality
155:variation.
29:Description
2068:Categories
2016:(9): R97.
1961:Genome Res
1504:: bas008.
1434:Genome Res
1426:Maglott DR
1399:References
1301:GRCm38.p6
1125:GRCh37.p13
1103:GRCh37.p13
1059:GRCh37.p10
741:epigenomic
569:can avoid
1422:Pruitt KD
1257:GRCm38.p4
1235:GRCh38.p7
1213:GRCm38.p3
1191:GRCh38.p2
1147:GRCm38.p2
1081:GRCm38.p1
1037:GRCh37.p9
993:GRCh37.p5
971:GRCh37.p2
743:studies,
303:Too short
222:) at NCBI
2042:21958622
1991:22955987
1941:21115458
1889:20967262
1849:PLOS ONE
1830:17077274
1781:19372376
1716:23821443
1661:12459250
1589:24217909
1528:22434842
1498:Database
1464:19498102
1367:See also
792:Species
789:Release
595:ribosome
591:ribosome
520:ribosome
509:ribosome
263:QA test
2033:3308060
1982:3431492
1932:3013746
1880:2953495
1857:Bibcode
1821:1664719
1772:2669787
1749:Bibcode
1707:3775365
1652:7126118
1580:3965069
1519:3308164
1455:2704439
1393:Ensembl
1373:GENCODE
1329:19,107
1326:35,608
1307:20,486
1304:27,219
1285:19,033
1282:33,397
949:MGSCv37
886:17,082
883:MGSCv37
839:MGSCv36
737:GENCODE
711:Ensembl
243:Ensembl
132:Ensembl
112:Version
97:Website
66:Authors
37:Contact
24:Content
2040:
2030:
1989:
1979:
1939:
1929:
1887:
1877:
1828:
1818:
1779:
1769:
1714:
1704:
1694:
1659:
1649:
1587:
1577:
1526:
1516:
1462:
1452:
1388:RefSeq
1342:page.
1263:20,354
1260:25,757
1241:18,892
1238:32,524
1219:20,215
1216:24,834
1197:18,826
1194:31,371
1175:18,800
1172:30,461
1169:GRCh38
1153:20,079
1150:23,835
1131:18,681
1128:28,897
1109:18,673
1106:28,649
1087:19,990
1084:23,010
1065:18,607
1062:27,655
1043:18,535
1040:27,377
1021:19,945
1018:22,934
1015:GRCm38
999:18,474
996:26,254
977:18,407
974:25,354
955:19,507
952:21,874
933:18,174
930:22,912
927:GRCh37
911:17,053
908:19,393
905:NCBI36
889:16,888
867:15,805
864:17,494
861:NCBI36
845:13,012
842:13,218
823:12,950
820:13,740
817:NCBI35
774:(INSDC
763:kits.
699:(here)
488:RefSeq
480:RefSeq
403:(HGNC)
387:RefSeq
239:(NCBI)
220:RefSeq
203:(HGNC)
197:(WTSI)
185:(NCBI)
134:, and
128:(NCBI)
92:Access
1737:(PDF)
761:exome
749:exome
745:exome
715:Blink
407:(MGI)
209:(MGI)
191:(EBI)
2038:PMID
1987:PMID
1937:PMID
1885:PMID
1826:PMID
1777:PMID
1712:PMID
1692:ISBN
1657:PMID
1631:Gene
1585:PMID
1524:PMID
1502:2012
1460:PMID
1356:UTRs
1352:UTRs
713:and
707:VEGA
664:HGNC
629:mRNA
627:the
622:mRNA
620:the
614:ORFs
610:mRNA
603:ORFs
599:mRNA
587:ORFs
575:ORFs
567:ORFs
555:mRNA
551:ORFs
547:ORFs
543:ORFs
437:mRNA
433:mRNA
425:mRNA
392:ORFs
380:here
241:and
227:WTSI
121:The
87:2009
2028:PMC
2018:doi
1977:PMC
1969:doi
1927:PMC
1919:doi
1875:PMC
1865:doi
1816:PMC
1808:doi
1800:RNA
1767:PMC
1757:doi
1745:106
1702:PMC
1684:doi
1647:PMC
1639:doi
1635:299
1575:PMC
1567:doi
1514:PMC
1506:doi
1450:PMC
1442:doi
1315:24
1293:23
1271:22
579:NMD
571:NMD
563:NMD
559:ORF
545:. u
532:ORF
524:ORF
516:ORF
484:NMD
476:NMD
468:NMD
461:NMD
453:NMD
449:NMD
441:NMD
429:NMD
421:NMD
2070::
2036:.
2026:.
2014:12
2012:.
2008:.
1985:.
1975:.
1965:22
1963:.
1959:.
1935:.
1925:.
1915:39
1913:.
1909:.
1883:.
1873:.
1863:.
1851:.
1847:.
1824:.
1814:.
1804:12
1802:.
1798:.
1775:.
1765:.
1755:.
1743:.
1739:.
1724:^
1710:.
1700:.
1690:.
1678:.
1655:.
1645:.
1633:.
1629:.
1615:^
1597:^
1583:.
1573:.
1563:42
1561:.
1557:.
1536:^
1522:.
1512:.
1500:.
1496:.
1472:^
1458:.
1448:.
1438:19
1436:.
1432:.
1407:^
1249:21
1227:20
1205:19
1183:18
1161:17
1139:16
1117:15
1095:14
1073:13
1051:12
1029:11
1007:10
709:,
681:,
382:.
146:.
130:,
2044:.
2020::
1993:.
1971::
1943:.
1921::
1891:.
1867::
1859::
1853:5
1832:.
1810::
1783:.
1759::
1751::
1718:.
1686::
1663:.
1641::
1591:.
1569::
1530:.
1508::
1466:.
1444::
985:9
963:8
941:7
919:6
897:5
875:4
853:3
831:2
809:1
639:.
526:.