Overdispersion

In statistics, overdispersion is the presence of greater variability (statistical dispersion) in a data set than would be expected based on a given statistical model.

A common task in applied statistics is choosing a parametric model to fit a given set of empirical observations. This necessitates an assessment of the fit of the chosen model. It is usually possible to choose the model parameters in such a way that the theoretical population mean of the model is approximately equal to the sample mean. However, especially for simple models with few parameters, theoretical predictions may not match empirical observations for higher moments. When the observed variance is higher than the variance of a theoretical model, overdispersion has occurred. Conversely, underdispersion means that there was less variation in the data than predicted. Overdispersion is a very common feature in applied data analysis because, in practice, populations are frequently heterogeneous (non-uniform), contrary to the assumptions implicit within widely used simple parametric models.

Examples

Poisson

Overdispersion is often encountered when fitting very simple parametric models, such as those based on the Poisson distribution. The Poisson distribution has one free parameter and does not allow for the variance to be adjusted independently of the mean (its variance equals its mean). The choice of a distribution from the Poisson family is often dictated by the nature of the empirical data. For example, Poisson regression analysis is commonly used to model count data. If overdispersion is a feature, an alternative model with additional free parameters may provide a better fit. In the case of count data, a Poisson mixture model like the negative binomial distribution can be proposed instead, in which the mean of the Poisson distribution can itself be thought of as a random variable drawn, in this case, from the gamma distribution, thereby introducing an additional free parameter (note that the resulting negative binomial distribution is completely characterized by two parameters).

Binomial

As a more concrete example, it has been observed that the number of boys born to families does not conform faithfully to a binomial distribution, as might be expected. Instead, the sex ratios of families seem to skew toward either boys or girls (see, for example, the Trivers–Willard hypothesis for one possible explanation). That is, there are more all-boy families, more all-girl families, and fewer families close to the population 51:49 boy-to-girl mean ratio than expected from a binomial distribution, and the resulting empirical variance is larger than specified by a binomial model.

In this case, the beta-binomial model is a popular and analytically tractable alternative to the binomial distribution, since it provides a better fit to the observed data. To capture the heterogeneity of the families, one can think of the probability parameter of the binomial model (say, the probability of being a boy) as itself a random variable (i.e. a random effects model) drawn for each family from a beta distribution as the mixing distribution. The resulting compound distribution (beta-binomial) has an additional free parameter.

Another common model for overdispersion, used when some of the observations are not Bernoulli, arises from introducing a normal random variable into a logistic model. Software is widely available for fitting this type of multilevel model. In this case, if the variance of the normal variable is zero, the model reduces to the standard (undispersed) logistic regression. This model has an additional free parameter, namely the variance of the normal variable.

With respect to binomial random variables, the concept of overdispersion makes sense only if n > 1. It is meaningless for Bernoulli random variables (n = 1), whose variance is fixed by their mean.

Normal distribution

As the normal distribution (Gaussian) has variance as a parameter, any data with finite variance (including any finite data) can be modeled with a normal distribution with the exact variance: the normal distribution is a two-parameter model, with mean and variance. Thus, in the absence of an underlying model, there is no notion of data being overdispersed relative to the normal model, though the fit may be poor in other respects (such as the higher moments: skew, kurtosis, etc.). However, in the case that the data is modeled by a normal distribution with an expected variance, it can be over- or under-dispersed relative to that prediction.

For example, in a statistical survey, the margin of error (determined by sample size) predicts the sampling error and hence the dispersion of results on repeated surveys. If one performs a meta-analysis of repeated surveys of a fixed population (say with a given sample size, so that the margin of error is the same), one expects the results to follow a normal distribution whose standard deviation is set by the margin of error. However, in the presence of study heterogeneity, where studies have different sampling bias, the distribution is instead a compound distribution and will be overdispersed relative to the predicted distribution. For example, given repeated opinion polls that all have a margin of error of 3%, if they are conducted by different polling organizations, one expects the results to vary more than a 3% margin of error alone would predict, due to pollster bias from different methodologies.

Differences in terminology among disciplines

Over- and underdispersion are terms which have been adopted in branches of the biological sciences. In parasitology, the term 'overdispersion' is generally used as defined here, meaning a distribution with a higher than expected variance.

In some areas of ecology, however, meanings have been transposed, so that overdispersion is actually taken to mean more even (lower variance) than expected. This confusion has caused some ecologists to suggest that the terms 'aggregated' or 'contagious' would be better used in ecology for 'overdispersed'. Such preferences are creeping into parasitology too. Generally this suggestion has not been heeded, and confusion persists in the literature.

Furthermore, in demography, overdispersion is often evident in the analysis of death count data, but demographers prefer the term 'unobserved heterogeneity'.
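The Poisson example can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings: the mean of 4, the gamma shape of 2, the sample size, and the helper names `sample_poisson` and `dispersion_index` are all hypothetical choices, not part of any particular analysis. It compares the variance-to-mean ratio of plain Poisson counts with that of counts whose Poisson mean is itself drawn from a gamma distribution.

```python
import math
import random
import statistics


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate by Knuth's multiplication method (suitable for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def dispersion_index(counts):
    """Variance-to-mean ratio: near 1 for Poisson data, above 1 for overdispersed data."""
    return statistics.variance(counts) / statistics.mean(counts)


rng = random.Random(0)   # fixed seed so the run is repeatable
mean_count = 4.0         # hypothetical common mean
shape = 2.0              # hypothetical gamma shape; smaller values mean more heterogeneity
n = 20_000               # hypothetical number of observations

# Plain Poisson: every observation shares the same mean.
poisson_counts = [sample_poisson(rng, mean_count) for _ in range(n)]

# Gamma-Poisson mixture: each observation has its own mean, drawn from a gamma
# distribution with expectation `mean_count`; the counts are then negative binomial.
mixed_counts = [
    sample_poisson(rng, rng.gammavariate(shape, mean_count / shape))
    for _ in range(n)
]

poisson_ratio = dispersion_index(poisson_counts)
mixed_ratio = dispersion_index(mixed_counts)
print(f"Poisson dispersion index:       {poisson_ratio:.2f}")
print(f"Gamma-Poisson dispersion index: {mixed_ratio:.2f}")
```

Both samples have roughly the same mean, but the mixture has variance mean + mean²/shape in theory, so with these assumed values its ratio should be near 3 rather than 1.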
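The family sex-ratio example can be sketched in the same spirit. The simulation below is a minimal illustration, not a fit to real birth data: the family size of 6, the beta parameters 5.1 and 4.9 (chosen only so that their mean is 0.51), and the helper names `boys_in_family` and `extreme_share` are assumptions made for the example. It contrasts families that share one probability of a boy with families that each draw their own probability from a beta distribution.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable
children = 6             # hypothetical family size (n > 1)
p_boy = 0.51             # population probability of a boy, matching the 51:49 ratio
families = 20_000        # hypothetical number of simulated families
a, b = 5.1, 4.9          # hypothetical beta parameters with mean a / (a + b) = 0.51


def boys_in_family(p):
    """Number of boys among `children` independent births, each a boy with probability p."""
    return sum(rng.random() < p for _ in range(children))


def extreme_share(counts):
    """Fraction of families that are all-girl or all-boy."""
    return sum(c in (0, children) for c in counts) / len(counts)


# Binomial model: every family shares the same probability.
binomial_counts = [boys_in_family(p_boy) for _ in range(families)]

# Beta-binomial model: each family draws its own probability from a beta distribution.
beta_binomial_counts = [
    boys_in_family(rng.betavariate(a, b)) for _ in range(families)
]

theoretical_binomial_var = children * p_boy * (1 - p_boy)
binomial_var = statistics.variance(binomial_counts)
beta_binomial_var = statistics.variance(beta_binomial_counts)

print(f"binomial variance (theory):  {theoretical_binomial_var:.3f}")
print(f"binomial variance (sample):  {binomial_var:.3f}")
print(f"beta-binomial variance:      {beta_binomial_var:.3f}")
print(f"all-boy or all-girl share:   {extreme_share(binomial_counts):.3f} "
      f"vs {extreme_share(beta_binomial_counts):.3f}")
```

Under these assumptions the beta-binomial variance exceeds the binomial variance by the factor (a + b + n)/(a + b + 1), and single-sex families are noticeably more common, which is the pattern the binomial model fails to reproduce.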
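The opinion-poll example can also be simulated. This is a rough sketch under stated assumptions: the true support of 50%, the 1,000 respondents per poll, and in particular the normally distributed pollster bias with a standard deviation of 2 percentage points are hypothetical modelling choices, as is the helper name `run_poll`. It compares repeated polls from a single methodology with polls that each carry their own methodological shift.

```python
import random
import statistics

rng = random.Random(2)    # fixed seed so the run is repeatable
true_support = 0.50       # hypothetical true share in the population
sample_size = 1_000       # hypothetical respondents per poll
polls = 1_000             # hypothetical number of repeated polls
house_effect_sd = 0.02    # hypothetical spread of pollster bias


def run_poll(p):
    """Share of `sample_size` simulated respondents who answer yes with probability p."""
    return sum(rng.random() < p for _ in range(sample_size)) / sample_size


# One methodology: only sampling error differs from poll to poll.
same_pollster = [run_poll(true_support) for _ in range(polls)]

# Different organizations: each poll is shifted by its own methodological bias.
mixed_pollsters = [
    run_poll(true_support + rng.gauss(0.0, house_effect_sd))
    for _ in range(polls)
]

# Standard deviation predicted by sampling error alone.
sampling_sd = (true_support * (1 - true_support) / sample_size) ** 0.5
same_sd = statistics.stdev(same_pollster)
mixed_sd = statistics.stdev(mixed_pollsters)

print(f"predicted by sampling error: {sampling_sd:.4f}")
print(f"single methodology:          {same_sd:.4f}")
print(f"mixed methodologies:         {mixed_sd:.4f}")
```

The single-methodology polls should scatter about as much as sampling error predicts, while the mixed polls scatter more, because the bias variance adds to the sampling variance.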

See also

Index of dispersion
Compound probability distribution
Quasi-likelihood

References

Stansfield, William D.; Carlton, Matthew A. (February 2009). "The most widely publicized gender problem in human genetics". Human Biology. 81 (1): 3–11. doi:10.3378/027.081.0101.
Lindsey, J. K.; Altham, P. M. E. (1998). "Analysis of the Human Sex Ratio by using Overdispersion Models". Journal of the Royal Statistical Society, Series C. 47 (1): 149–157. doi:10.1111/1467-9876.00103.
Greig-Smith, P. (1983). Quantitative Plant Ecology (Third ed.). University of California Press. ISBN 0-632-00142-9.
Poulin, R. (2006). Evolutionary Ecology of Parasites. Princeton University Press. ISBN 9780691120850.

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.