Bayesian tool for methylation analysis

Bayesian tool for methylation analysis, also known as BATMAN, is a statistical tool for analysing methylated DNA immunoprecipitation (MeDIP) profiles. It can be applied to large datasets generated using either oligonucleotide arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq), providing a quantitative estimate of the absolute methylation state in a region of interest.

[Figure: Batman workflow]

Theory

MeDIP (methylated DNA immunoprecipitation) is an experimental technique used to assess DNA methylation levels by using an antibody to isolate methylated DNA sequences. The isolated fragments of DNA are either hybridized to a microarray chip (MeDIP-chip) or sequenced by next-generation sequencing (MeDIP-seq). While this tells you which areas of the genome are methylated, it does not give absolute methylation levels.

Imagine two different genomic regions, A and B. Region A has six CpGs (DNA methylation in mammalian somatic cells generally occurs at CpG dinucleotides), three of which are methylated. Region B has three CpGs, all of which are methylated. As the antibody simply recognizes methylated DNA, it will bind both these regions equally, and subsequent steps will therefore show equal signals for the two regions. This does not give the full picture of methylation in these regions: in region A only half the CpGs are methylated, whereas in region B all the CpGs are methylated.

Therefore, to get the full picture of methylation for a given region, you have to normalize the signal from the MeDIP experiment to the number of CpGs in the region, and this is what the Batman algorithm does. Analysing the MeDIP signal of the above example would give Batman scores of 0.5 for region A (i.e. the region is 50% methylated) and 1 for region B (i.e. the region is 100% methylated). In this way Batman converts the signals from MeDIP experiments to absolute methylation levels.

Development of Batman

The core principle of the Batman algorithm is to model the effects of varying density of CpG dinucleotides, and the effect this has on MeDIP enrichment of DNA fragments. The basic assumptions of Batman:

Almost all DNA methylation in mammals happens at CpG dinucleotides.
Most CpG-poor regions are constitutively methylated while most CpG-rich regions (CpG islands) are constitutively unmethylated.
There are no fragment biases in the MeDIP experiment (the approximate range of DNA fragment sizes is 400–700 bp).
The errors on the microarray are normally distributed with precision $\nu$.
Only methylated CpGs contribute to the observed signal.
CpG methylation state is generally highly correlated over hundreds of bases, so CpGs grouped together in 50- or 100-bp windows would have the same methylation state.

Basic parameters in Batman:

$C_{cp}$: the coupling factor between probe $p$ and CpG dinucleotide $c$, defined as the fraction of DNA molecules hybridizing to probe $p$ that contain CpG $c$.
$C_{\text{tot}}$: the total CpG influence parameter, defined as the sum of coupling factors for any given probe, which provides a measure of local CpG density.
$m_{c}$: the methylation status at position $c$, which represents the fraction of chromosomes in the sample on which it is methylated. $m_{c}$ is treated as a continuous variable, since the majority of samples used in MeDIP studies contain multiple cell types.

Based on these assumptions, the signal from the MeDIP channel of the MeDIP-chip or MeDIP-seq experiment depends on the degree of enrichment of DNA fragments overlapping that probe, which in turn depends on the amount of antibody binding, and thus on the number of methylated CpGs on those fragments. In the Batman model, the complete dataset from a MeDIP-chip experiment, $A$, can be represented by a statistical model in the form of the following probability distribution:

$$f(A \mid m) = \prod_{p} \phi\left(A_{p} \mid A_{\text{base}} + r \sum_{c} C_{cp} m_{c},\ \nu^{-1}\right),$$

where $\phi(x \mid \mu, \sigma^{2})$ is a Gaussian probability density function with mean $\mu$ and variance $\sigma^{2}$; here the variance is $\nu^{-1}$, the inverse of the precision. Standard Bayesian techniques can be used to infer $f(m \mid A)$, that is, the distribution of likely methylation states given one or more sets of MeDIP-chip/MeDIP-seq outputs. To solve this inference problem, Batman uses nested sampling (http://www.inference.phy.cam.ac.uk/bayesys/) to generate 100 independent samples from $f(m \mid A)$ for each tiled region of the genome, then summarizes the most likely methylation state in 100-bp windows by fitting beta distributions to these samples. The modes of the most likely beta distributions are used as the final methylation calls.

Limitations

It may be useful to take the following points into account when considering using Batman:

Batman is not a stand-alone piece of software; it is an algorithm run from the command prompt. As such it is not especially user-friendly and is a computationally technical process.
Because it is non-commercial, there is very little support when using Batman beyond what is in the manual.
It is quite time-consuming (it can take several days to analyse one chromosome). (Note: In one government lab, running Batman on a set of 100 Agilent Human DNA Methylation Arrays (about 250,000 probes per array) took less than an hour to complete in Agilent's Genomic Workbench software. The computer used had a 2 GHz processor, 24 GB of RAM and 64-bit Windows 7.)
Copy number variation (CNV) has to be accounted for. For example, the score for a region with a CNV value of 1.6 in a cancer (a loss of 0.4 compared to normal) would have to be multiplied by 1.25 (= 2/1.6) to compensate for the loss.
One of the basic assumptions of Batman is that all DNA methylation occurs at CpG dinucleotides. While this is generally the case for vertebrate somatic cells, there are situations where there is widespread non-CpG methylation, such as in plant cells and embryonic stem cells.
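The copy-number correction described under Limitations is simple arithmetic, and a short sketch may make it concrete. The helper name `cnv_corrected_score` and the default of two normal copies are assumptions made for illustration; they are not part of Batman itself.

```python
def cnv_corrected_score(score: float, copy_number: float, normal_copies: float = 2.0) -> float:
    """Rescale a methylation score by normal_copies / copy_number.

    Hypothetical helper: the name and the default of 2 normal copies
    are illustrative assumptions, not part of Batman.
    """
    if copy_number <= 0:
        raise ValueError("copy_number must be positive")
    return score * (normal_copies / copy_number)


# Example from the text: a CNV value of 1.6 gives a factor of 2 / 1.6 = 1.25.
factor = cnv_corrected_score(1.0, 1.6)
print(round(factor, 2))
```

Passing a score of 1.0 returns the correction factor itself; any other score is scaled by the same factor.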

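The probability model in the Development of Batman section can be evaluated directly for a toy dataset. The sketch below computes the logarithm of f(A | m) as a sum of Gaussian log densities, one per probe. It assumes the per-probe mean is A_base + r Σ_c C_cp m_c with variance 1/ν; A_base and r are not defined in the text, so treating them as a baseline signal and a linear response factor is an assumption. The function names and all numerical values are invented for illustration.

```python
import math


def gaussian_log_pdf(x: float, mean: float, variance: float) -> float:
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * variance) + (x - mean) ** 2 / variance)


def batman_log_likelihood(signal, coupling, methylation, a_base, r, nu):
    """Log of f(A | m) under the assumptions stated in the lead-in.

    signal[p]      observed signal A_p for probe p
    coupling[p][c] coupling factor C_cp between probe p and CpG c
    methylation[c] methylation state m_c in [0, 1]
    a_base, r      assumed baseline and response factor
    nu             precision, so the variance is 1 / nu
    """
    variance = 1.0 / nu
    total = 0.0
    for a_p, row in zip(signal, coupling):
        expected = a_base + r * sum(c_cp * m_c for c_cp, m_c in zip(row, methylation))
        total += gaussian_log_pdf(a_p, expected, variance)
    return total


# Toy data (invented): two probes, three CpGs.
coupling = [[0.5, 0.5, 0.0],
            [0.0, 0.5, 1.0]]
signal = [1.0, 0.5]  # exactly what m = [1, 1, 0] predicts with a_base = 0, r = 1

ll_matching = batman_log_likelihood(signal, coupling, [1.0, 1.0, 0.0], a_base=0.0, r=1.0, nu=4.0)
ll_unmethylated = batman_log_likelihood(signal, coupling, [0.0, 0.0, 0.0], a_base=0.0, r=1.0, nu=4.0)
print(ll_matching > ll_unmethylated)
```

The methylation vector that reproduces the observed signal scores a higher log likelihood than the fully unmethylated one. Working in log space avoids underflow when the product runs over many probes.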

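The final step described in Development of Batman, fitting a beta distribution to posterior samples and reporting its mode, can be sketched as follows. The text does not say how Batman performs the fit, so this example substitutes a method-of-moments fit, which is a simplification and not Batman's own procedure. The function names and the sample values are invented for illustration.

```python
from statistics import mean, variance


def fit_beta_moments(samples):
    """Method-of-moments estimate of Beta(alpha, beta) parameters.

    This is a stand-in technique chosen for brevity, not Batman's fitting method.
    """
    m = mean(samples)
    v = variance(samples)  # sample variance, needs at least two points
    if not 0.0 < m < 1.0 or v <= 0.0 or v >= m * (1.0 - m):
        raise ValueError("samples are not compatible with a beta distribution")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


def beta_mode(alpha: float, beta: float) -> float:
    """Mode of Beta(alpha, beta); only defined here for alpha > 1 and beta > 1."""
    if alpha <= 1.0 or beta <= 1.0:
        raise ValueError("an interior mode requires alpha > 1 and beta > 1")
    return (alpha - 1.0) / (alpha + beta - 2.0)


# Invented posterior samples of the methylation state in one window.
samples = [0.4, 0.5, 0.6, 0.5, 0.45, 0.55]
alpha, beta = fit_beta_moments(samples)
call = beta_mode(alpha, beta)
print(round(call, 3))
```

For these symmetric samples the fitted parameters are equal, so the mode, and hence the methylation call for the window, lies at the centre of the samples.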
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.