Homogeneity and heterogeneity (statistics)

Descriptions of properties of datasets

For broader coverage of this topic, see Homogeneity and heterogeneity.

In statistics, homogeneity and its opposite, heterogeneity, arise in describing the properties of a dataset, or several datasets. They relate to the validity of the often convenient assumption that the statistical properties of any one part of an overall dataset are the same as those of any other part. In meta-analysis, which combines the data from several studies, homogeneity measures the differences or similarities between the several studies (see also Study heterogeneity).

Homogeneity can be studied to several degrees of complexity. For example, considerations of homoscedasticity examine how much the variability of data-values changes throughout a dataset. However, questions of homogeneity apply to all aspects of the statistical distributions, including the location parameter. Thus, a more detailed study would examine changes to the whole of the marginal distribution. An intermediate-level study might move from looking at the variability to studying changes in the skewness. In addition to these, questions of homogeneity apply also to the joint distributions.

The concept of homogeneity can be applied in many different ways and, for certain types of statistical analysis, it is used to look for further properties that might need to be treated as varying within a dataset once some initial types of non-homogeneity have been dealt with.

Of variance

This section is an excerpt from Homoscedasticity and heteroscedasticity.

Figure: Plot with random data showing homoscedasticity: at each value of x, the y-value of the dots has about the same variance.

Figure: Plot with random data showing heteroscedasticity: the variance of the y-values of the dots increases with increasing values of x.

In statistics, a sequence of random variables is homoscedastic (/ˌhoʊmoʊskəˈdæstɪk/) if all its random variables have the same finite variance; this is also known as homogeneity of variance. The complementary notion is called heteroscedasticity, also known as heterogeneity of variance. The spellings homoskedasticity and heteroskedasticity are also frequently used. “Skedasticity” comes from the Ancient Greek word “skedánnymi”, meaning “to scatter”. Assuming a variable is homoscedastic when in reality it is heteroscedastic (/ˌhɛtəroʊskəˈdæstɪk/) results in unbiased but inefficient point estimates and in biased estimates of standard errors, and may result in overestimating the goodness of fit as measured by the Pearson coefficient.

The existence of heteroscedasticity is a major concern in regression analysis and the analysis of variance, as it invalidates statistical tests of significance that assume that the modelling errors all have the same variance. While the ordinary least squares estimator is still unbiased in the presence of heteroscedasticity, it is inefficient and inference based on the assumption of homoskedasticity is misleading. In that case, generalized least squares (GLS) was frequently used in the past. Nowadays, standard practice in econometrics is to include heteroskedasticity-consistent standard errors instead of using GLS, as GLS can exhibit strong bias in small samples if the actual skedastic function is unknown.

Because heteroscedasticity concerns expectations of the second moment of the errors, its presence is referred to as misspecification of the second order.

The econometrician Robert Engle was awarded the 2003 Nobel Memorial Prize for Economics for his studies on regression analysis in the presence of heteroscedasticity, which led to his formulation of the autoregressive conditional heteroscedasticity (ARCH) modeling technique.

Examples

Regression

Differences in the typical values across the dataset might initially be dealt with by constructing a regression model using certain explanatory variables to relate variations in the typical value to known quantities. There should then be a later stage of analysis to examine whether the errors in the predictions from the regression behave in the same way across the dataset. Thus the question becomes one of the homogeneity of the distribution of the residuals, as the explanatory variables change. See regression analysis.

Time series

The initial stages in the analysis of a time series may involve plotting values against time to examine homogeneity of the series in various ways: stability across time as opposed to a trend; stability of local fluctuations over time.

Combining information across sites

In hydrology, data-series across a number of sites, composed of annual values of the within-year maximum river flow, are analysed. A common model is that the distributions of these values are the same for all sites apart from a simple scaling factor, so that the location and scale are linked in a simple way. There can then be questions of examining the homogeneity across sites of the distribution of the scaled values.

Combining information sources

In meteorology, weather datasets are acquired over many years of record and, as part of this, measurements at certain stations may cease occasionally while, at around the same time, measurements may start at nearby locations. There are then questions as to whether, if the records are combined to form a single longer set of records, those records can be considered homogeneous over time. An example of homogeneity testing of wind speed and direction data can be found in Romanić et al., 2015.

Homogeneity within populations

Simple population surveys may start from the idea that responses will be homogeneous across the whole of a population. Assessing the homogeneity of the population would involve looking to see whether the responses of certain identifiable subpopulations differ from those of others. For example, car-owners may differ from non-car-owners, or there may be differences between different age-groups.

Tests

A test for homogeneity, in the sense of exact equivalence of statistical distributions, can be based on an E-statistic. A location test tests the simpler hypothesis that distributions have the same location parameter.

See also

Consistency (statistics)
Reliability (statistics)

References

For the Greek etymology of the term, see McCulloch, J. Huston (1985). "On Heteros*edasticity". Econometrica. 53 (2): 483. JSTOR 1911250.
White, Halbert (1980). "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity". Econometrica. 48 (4): 817–838. CiteSeerX 10.1.1.11.7646. doi:10.2307/1912934. JSTOR 1912934.
Gujarati, D. N.; Porter, D. C. (2009). Basic Econometrics (Fifth ed.). Boston: McGraw-Hill Irwin. p. 400. ISBN 9780073375779.
Goldberger, Arthur S. (1964). Econometric Theory. New York: John Wiley & Sons. pp. 238–243. ISBN 9780471311010.
Johnston, J. (1972). Econometric Methods. New York: McGraw-Hill. pp. 214–221.
Angrist, Joshua D.; Pischke, Jörn-Steffen (2009-12-31). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. doi:10.1515/9781400829828. ISBN 978-1-4008-2982-8.
Long, J. Scott; Trivedi, Pravin K. (1993). "Some Specification Tests for the Linear Regression Model". In Bollen, Kenneth A.; Long, J. Scott (eds.). Testing Structural Equation Models. London: Sage. pp. 66–110. ISBN 978-0-8039-4506-7.
Engle, Robert F. (July 1982). "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation". Econometrica. 50 (4): 987–1007. doi:10.2307/1912773. ISSN 0012-9682. JSTOR 1912773.
Romanić, D.; Ćurić, M.; Jovičić, I.; Lompar, M. (2015). Long-term trends of the ‘Koshava’ wind during the period 1949–2010. International Journal of Climatology 35(2): 288–302. doi:10.1002/joc.3981.

Further reading

Hall, M.J. (2003). The interpretation of non-homogeneous hydrometeorological time series: a case study. Meteorological Applications, 10, 61–67. doi:10.1017/S1350482703005061
Krus, D.J., & Blackman, H.S. (1988). Test reliability and homogeneity from perspective of the ordinal test theory. Applied Measurement in Education, 1, 79–88.
Loevinger, J. (1948). The technic of homogeneous tests compared with some aspects of scale analysis and factor analysis. Psychological Bulletin, 45, 507–529.
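
As a rough illustration of why heteroscedasticity matters for inference in regression, the following Python sketch simulates a simple linear regression and compares the classical standard error of the ordinary least squares slope with a heteroskedasticity-consistent (White, HC0) one. The function names, the simulated model (y = 1 + 2x + noise), the noise scales and the sample sizes are all illustrative assumptions and are not taken from the cited sources.

```python
import math
import random


def slope_with_standard_errors(x, y):
    """Fit y = a + b*x by OLS; return (b, classical_se, hc0_se) for the slope b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical formula: assumes every error has the same variance.
    s2 = sum(e * e for e in residuals) / (n - 2)
    classical_se = math.sqrt(s2 / sxx)

    # White (HC0) formula: each squared residual is weighted by its own
    # squared leverage term, so no common variance is assumed.
    meat = sum((xi - x_bar) ** 2 * e * e for xi, e in zip(x, residuals))
    hc0_se = math.sqrt(meat) / sxx
    return slope, classical_se, hc0_se


def simulate_regression_data(n, heteroscedastic, seed=0):
    """Hypothetical data generator: y = 1 + 2x + noise, x uniform on [0, 10]."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = []
    for xi in x:
        # Assumed noise model: spread grows with x in the heteroscedastic case.
        sd = 0.5 * xi if heteroscedastic else 2.0
        y.append(1.0 + 2.0 * xi + rng.gauss(0.0, sd))
    return x, y


if __name__ == "__main__":
    for flag in (False, True):
        xs, ys = simulate_regression_data(5000, heteroscedastic=flag, seed=42)
        b, se_classical, se_hc0 = slope_with_standard_errors(xs, ys)
        print(f"heteroscedastic={flag}: slope={b:.3f}, "
              f"classical SE={se_classical:.4f}, HC0 SE={se_hc0:.4f}")
```

In this particular design the two standard errors should be close when the noise is homoscedastic, while the classical value tends to understate the uncertainty when the noise spread grows with x. The slope estimate itself stays close to the assumed true value in both cases, in line with the statement that the estimator remains unbiased.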

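The two kinds of homogeneity test mentioned in this article can be sketched with permutation tests. The Python example below assumes that the E-statistic refers to the two-sample energy statistic, 2E|X−Y| − E|X−X′| − E|Y−Y′|, and uses the absolute difference of sample means as one possible location statistic. The function names, the number of permutations and the simulated samples are illustrative assumptions, not a reference implementation.

```python
import random


def energy_distance(x, y):
    """Sample version of the two-sample energy statistic (assumed E-statistic)."""
    n, m = len(x), len(y)
    between = sum(abs(a - b) for a in x for b in y) / (n * m)
    within_x = sum(abs(a - b) for a in x for b in x) / (n * n)
    within_y = sum(abs(a - b) for a in y for b in y) / (m * m)
    return 2.0 * between - within_x - within_y


def mean_difference(x, y):
    """A simple location statistic: absolute difference of the sample means."""
    return abs(sum(x) / len(x) - sum(y) / len(y))


def permutation_p_value(x, y, statistic, n_permutations=200, seed=0):
    """Permutation p-value for the hypothesis that x and y are exchangeable."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    pooled = list(x) + list(y)
    n = len(x)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        if statistic(pooled[:n], pooled[n:]) >= observed:
            at_least_as_extreme += 1
    # The +1 terms count the observed labelling itself, so p is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical samples: same location, different spread.
    a = [rng.gauss(0.0, 1.0) for _ in range(40)]
    b = [rng.gauss(0.0, 3.0) for _ in range(40)]
    print("location test p-value:", permutation_p_value(a, b, mean_difference))
    print("energy test p-value:  ", permutation_p_value(a, b, energy_distance))
```

The design choice here is to share one permutation routine between both statistics, which makes the contrast explicit: the mean-difference statistic is only sensitive to a shift in location, whereas the energy statistic responds to any difference between the two distributions, such as a change in spread.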

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.