Knowledge (XXG)

Contig

This article is about contigs in DNA sequencing. For the Contig defragmentation program, see Contig (defragmentation utility).

A contig (from contiguous) is a set of overlapping DNA segments that together represent a consensus region of DNA. In bottom-up sequencing projects, a contig refers to overlapping sequence data (reads); in top-down sequencing projects, contig refers to the overlapping clones that form a physical map of the genome that is used to guide sequencing and assembly. Contigs can thus refer both to overlapping DNA sequences and to overlapping physical segments (fragments) contained in clones, depending on the context.

Original definition of contig

In 1980, Staden wrote:

"In order to make it easier to talk about our data gained by the shotgun method of sequencing we have invented the word 'contig'. A contig is a set of gel readings that are related to one another by overlap of their sequences. All gel readings belong to one and only one contig, and each contig contains at least one gel reading. The gel readings in a contig can be summed to form a contiguous consensus sequence and the length of this sequence is the length of the contig."

Sequence contigs

A sequence contig is a continuous (not contiguous) sequence resulting from the reassembly of the small DNA fragments generated by bottom-up sequencing strategies. This meaning of contig is consistent with the original definition by Rodger Staden (1979). The bottom-up DNA sequencing strategy involves shearing genomic DNA into many small fragments ("bottom"), sequencing these fragments, and reassembling them back into contigs and eventually the entire genome ("up"). Because current technology allows for the direct sequencing of only relatively short DNA fragments (300–1000 nucleotides), genomic DNA must be fragmented into small pieces prior to sequencing.

In bottom-up sequencing projects, amplified DNA is sheared randomly into fragments appropriately sized for sequencing. The subsequent sequence reads, which are the data that contain the sequences of the small fragments, are put into a database. The assembly software then searches this database for pairs of overlapping reads. Assembling the reads from such a pair (including, of course, only one copy of the identical sequence) produces a longer contiguous read (contig) of sequenced DNA. By repeating this process many times, at first with the initial short pairs of reads but then using increasingly longer pairs that are the result of previous assembly, the DNA sequence of an entire chromosome can be determined.

Figure: Overlapping reads from paired-end sequencing form contigs; contigs and gaps of known length form scaffolds.

Today, it is common to use paired-end sequencing technology, where both ends of consistently sized longer DNA fragments are sequenced. Here, a contig still refers to any contiguous stretch of sequence data created by read overlap. Because the fragments are of known length, the distance between the two end reads from each fragment is known. This gives additional information about the orientation of contigs constructed from these reads and allows for their assembly into scaffolds in a process called scaffolding.

Scaffolds consist of overlapping contigs separated by gaps of known length. The new constraints placed on the orientation of the contigs allow for the placement of highly repeated sequences in the genome. If one end read has a repetitive sequence, its placement is known as long as its mate pair is located within a contig. The remaining gaps between the contigs in the scaffolds can then be sequenced by a variety of methods, including PCR amplification followed by sequencing (for smaller gaps) and BAC cloning methods followed by sequencing (for larger gaps).

BAC contigs

Contig can also refer to the overlapping clones that form a physical map of a chromosome when the top-down or hierarchical sequencing strategy is used. In this sequencing method, a low-resolution map is made prior to sequencing in order to provide a framework to guide the later assembly of the sequence reads of the genome. This map identifies the relative positions and overlap of the clones used for sequencing. Sets of overlapping clones that form a contiguous stretch of DNA are called contigs; the minimum set of clones that forms a contig covering the entire chromosome makes up the tiling path that is used for sequencing. Once a tiling path has been selected, its component BACs are sheared into smaller fragments and sequenced. Contigs therefore provide the framework for hierarchical sequencing.

The assembly of a contig map involves several steps. First, DNA is sheared into larger (50–200 kb) pieces, which are cloned into BACs or PACs to form a BAC library. Since these clones should cover the entire genome or chromosome, it is theoretically possible to assemble a contig of BACs that covers the entire chromosome. Reality, however, is not always ideal. Gaps often remain, and a scaffold, consisting of contigs and gaps, that covers the map region is often the first result. The gaps between contigs can be closed by the methods outlined below.

Construction of BAC contigs

BAC contigs are constructed by aligning BAC regions of known overlap via a variety of methods. One common strategy is to use sequence-tagged site (STS) content mapping to detect unique DNA sites in common between BACs. The degree of overlap is roughly estimated by the number of STS markers in common between two clones, with more markers in common signifying a greater overlap.

Because this strategy provides only a very rough estimate of overlap, restriction digest fragment analysis, which provides a more precise measurement of clone overlap, is often used. In this strategy, clones are treated with one or two restriction enzymes and the resulting fragments are separated by gel electrophoresis. If two clones overlap, they will likely have restriction sites in common, and will thus share several fragments. Because the number of fragments in common and the lengths of these fragments are known (the length is judged by comparison to a size standard), the degree of overlap can be deduced to a high degree of precision.

Gaps between contigs

Gaps often remain after initial BAC contig construction. These gaps occur if the Bacterial Artificial Chromosome (BAC) library screened has low complexity, meaning it does not contain a high number of STS or restriction sites, or if certain regions were less stable in cloning hosts and thus underrepresented in the library. If gaps between contigs remain after STS landmark mapping and restriction fingerprinting have been performed, the sequencing of contig ends can be used to close these gaps. This end-sequencing strategy essentially creates a novel STS with which to screen the other contigs. Alternatively, the end sequence of a contig can be used as a primer for primer walking across the gap.

See also

Staden Package

References

1. Gregory, S. Contig Assembly. Encyclopedia of Life Sciences, 2005.
2. Gibson, Greg; Muse, Spencer V. (2009). A Primer of Genome Science (3rd ed.). Sinauer Associates. p. 84. ISBN 978-0-878-93236-8.
3. Dear, P. H. Genome Mapping. Encyclopedia of Life Sciences, 2005. doi:10.1038/npg.els.0005353.
4. Staden, R. (1980). "A new computer method for the storage and manipulation of DNA gel reading data". Nucleic Acids Research. 8 (16): 3673–3694. doi:10.1093/nar/8.16.3673. PMC 324183. PMID 7433103.
5. Staden, R. (1979). "A strategy of DNA sequencing employing computer programs". Nucleic Acids Research. 6 (7): 2601–2610. doi:10.1093/nar/6.7.2601. PMC 327874. PMID 461197.
6. Dunham, I. Genome Sequencing. Encyclopedia of Life Sciences, 2005.
7. Fullwood MJ, Wei C, Liu ET, et al. (2009). "Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses". Genome Research. 19 (4): 521–532. doi:10.1101/gr.074906.107. PMC 3807531. PMID 19339662.

External links

Staden package of sequence assembly: Definitions and background information
Definition of the term and historical perspective
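
As a rough illustration of how assembly software merges overlapping reads into sequence contigs, the following Python sketch repeatedly joins the pair of sequences with the longest suffix-to-prefix overlap, keeping only one copy of the shared sequence. The function names `overlap_length` and `greedy_assemble` and the `min_overlap` parameter are assumptions made for this example. The sketch also assumes error-free reads from a single strand, which real assemblers cannot rely on, and it is not the algorithm of any particular assembly package.

```python
def overlap_length(left: str, right: str, min_overlap: int) -> int:
    """Return the length of the longest suffix of `left` that equals a
    prefix of `right`, or 0 if it is shorter than `min_overlap`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Greedily merge the best-overlapping pair until no overlaps remain.

    Simplifying assumptions (illustrative only): reads are error-free,
    all come from the same strand, and repeats are not handled specially.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        # Keep only one copy of the shared sequence when joining the pair.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTTG", "GGGGAA"]))
```

With these four toy reads, the first three merge into the single contig ATGCGTACCTTG, while GGGGAA shares no overlap of at least three bases and remains a contig of its own, which matches Staden's point that every reading belongs to exactly one contig.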

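
The restriction-fingerprint comparison used to build BAC contigs can be sketched in the same spirit. The Python example below counts fragments of matching size between two clones and reports the shared fraction as a crude overlap score. The names `shared_fragments` and `overlap_fraction`, the 2% size tolerance, and the fragment sizes are all hypothetical values chosen for illustration. Actual fingerprinting software scores overlaps statistically rather than with a fixed cutoff.

```python
def shared_fragments(sizes_a: list[int], sizes_b: list[int],
                     tolerance: float = 0.02) -> int:
    """Count fragments whose sizes agree within a relative tolerance.

    Each fragment of clone B is matched at most once. The tolerance stands
    in for gel measurement error and is an assumed value.
    """
    remaining = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        for k, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[k]  # do not reuse a matched fragment
                break
    return shared


def overlap_fraction(sizes_a: list[int], sizes_b: list[int],
                     tolerance: float = 0.02) -> float:
    """Shared fragments divided by the fragment count of the smaller digest."""
    smaller = min(len(sizes_a), len(sizes_b))
    if smaller == 0:
        return 0.0
    return shared_fragments(sizes_a, sizes_b, tolerance) / smaller


if __name__ == "__main__":
    clone_a = [1200, 3400, 5100, 800]        # fragment sizes in base pairs
    clone_b = [3410, 5080, 2300, 950, 640]
    print(shared_fragments(clone_a, clone_b), overlap_fraction(clone_a, clone_b))
```

In this made-up example two fragments match (3400 with 3410 and 5100 with 5080), giving an overlap fraction of 0.5. A higher fraction suggests a larger overlap between the clones.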
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
