
Statistical model specification


In statistics, model specification is part of the process of building a statistical model: specification consists of selecting an appropriate functional form for the model and choosing which variables to include. For example, given personal income $y$ together with years of schooling $s$ and on-the-job experience $x$, we might specify a functional relationship $y = f(s, x)$ as follows:

$$\ln y = \ln y_0 + \rho s + \beta_1 x + \beta_2 x^2 + \varepsilon$$

where $\varepsilon$ is the unexplained error term that is supposed to comprise independent and identically distributed Gaussian variables. This particular example is known as the Mincer earnings function.

The statistician Sir David Cox has said, "How translation from subject-matter problem to statistical model is done is often the most critical part of an analysis".

Specification error and bias

Specification error occurs when the functional form or the choice of independent variables poorly represents relevant aspects of the true data-generating process. In particular, bias (the expected value of the difference between an estimated parameter and the true underlying value) occurs if an independent variable is correlated with the errors inherent in the underlying process. There are several different possible causes of specification error; some are listed below.

- An inappropriate functional form could be employed.
- A variable omitted from the model may have a relationship with both the dependent variable and one or more of the independent variables (causing omitted-variable bias).
- An irrelevant variable may be included in the model (although this does not create bias, it involves overfitting and so can lead to poor predictive performance).
- The dependent variable may be part of a system of simultaneous equations (giving simultaneity bias).

Additionally, measurement errors may affect the independent variables: while this is not a specification error, it can create statistical bias.

Note that all models will have some specification error. Indeed, in statistics there is a common aphorism that "all models are wrong". In the words of Burnham & Anderson, "Modeling is an art as well as a science and is directed toward finding a good approximating model ... as the basis for statistical inference".

Detection of misspecification

The Ramsey RESET test can help test for specification error in regression analysis.

In the example given above relating personal income to schooling and job experience, if the assumptions of the model are correct, then the least squares estimates of the parameters $\rho$, $\beta_1$ and $\beta_2$ will be efficient and unbiased. Hence specification diagnostics usually involve testing the first to fourth moments of the residuals.

Model building

Building a model involves finding a set of relationships to represent the process that is generating the data. This requires avoiding all the sources of misspecification mentioned above.

One approach is to start with a model in general form that relies on a theoretical understanding of the data-generating process. Then the model can be fit to the data and checked for the various sources of misspecification, in a task called statistical model validation. Theoretical understanding can then guide the modification of the model in such a way as to retain theoretical validity while removing the sources of misspecification. But if it proves impossible to find a theoretically acceptable specification that fits the data, the theoretical model may have to be rejected and replaced with another one.

A quotation from Karl Popper is apposite here: "Whenever a theory appears to you as the only possible one, take this as a sign that you have neither understood the theory nor the problem which it was intended to solve".

Another approach to model building is to specify several different models as candidates, and then compare those candidate models to each other. The purpose of the comparison is to determine which candidate model is most appropriate for statistical inference. Common criteria for comparing models include the following: $R^2$, Bayes factor, and the likelihood-ratio test together with its generalization, relative likelihood. For more on this topic, see statistical model selection.

See also

- Abductive reasoning
- Conceptual model
- Data analysis
- Data transformation (statistics)
- Design of experiments
- Durbin–Wu–Hausman test
- Exploratory data analysis
- Feature selection
- Heteroscedasticity, second-order statistical misspecification
- Information matrix test
- Model identification
- Principle of Parsimony
- Spurious relationship
- Statistical conclusion validity
- Statistical inference
- Statistical learning theory

Notes

- Cox, D. R. (2006), Principles of Statistical Inference, Cambridge University Press, p. 197.
- "Quantitative Methods II: Econometrics", College of William & Mary.
- Burnham, K. P.; Anderson, D. R. (2002), Model Selection and Multimodel Inference: A practical information-theoretic approach (2nd ed.), Springer-Verlag, §1.1.
- Long, J. Scott; Trivedi, Pravin K. (1993). "Some specification tests for the linear regression model". In Bollen, Kenneth A.; Long, J. Scott (eds.). Testing Structural Equation Models. SAGE Publishing. pp. 66–110.
- Popper, Karl (1972), Objective Knowledge: An evolutionary approach, Oxford University Press.

Further reading

- Akaike, Hirotugu (1994), "Implications of informational point of view on the development of statistical science", in Bozdogan, H. (ed.), Proceedings of the First US/JAPAN Conference on The Frontiers of Statistical Modeling: An Informational Approach, Volume 3, Kluwer Academic Publishers, pp. 27–38.
- Asteriou, Dimitrios; Hall, Stephen G. (2011). "Misspecification: Wrong regressors, measurement errors and wrong functional forms". Applied Econometrics (Second ed.). Palgrave Macmillan. pp. 172–197.
- Colegrave, N.; Ruxton, G. D. (2017). "Statistical model specification and power: recommendations on the use of test-qualified pooling in analysis of experimental data". Proceedings of the Royal Society B. 284 (1851): 20161850. doi:10.1098/rspb.2016.1850. PMC 5378071. PMID 28330912.
- Gujarati, Damodar N.; Porter, Dawn C. (2009). "Econometric modeling: Model specification and diagnostic testing". Basic Econometrics (Fifth ed.). McGraw-Hill/Irwin. pp. 467–522. ISBN 978-0-07-337577-9.
- Harrell, Frank (2001), Regression Modeling Strategies, Springer.
- Kmenta, Jan (1986). Elements of Econometrics (Second ed.). New York: Macmillan Publishers. pp. 442–455. ISBN 0-02-365070-2.
- Lehmann, E. L. (1990). "Model specification: The views of Fisher and Neyman, and later developments". Statistical Science. 5 (2): 160–168. doi:10.1214/ss/1177012164.
- MacKinnon, James G. (1992). "Model specification tests and artificial regressions". Journal of Economic Literature. 30 (1): 102–146. JSTOR 2727880.
- Maddala, G. S.; Lahiri, Kajal (2009). "Diagnostic checking, model selection, and specification testing". Introduction to Econometrics (Fourth ed.). Wiley. pp. 401–449. ISBN 978-0-470-01512-4.
- Sapra, Sunil (2005). "A regression error specification test (RESET) for generalized linear models". Economics Bulletin. 3 (1): 1–6.
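As an illustration of the omitted-variable bias discussed in this article, the following is a minimal sketch that simulates data from the earnings specification and fits it by least squares twice: once with the experience terms and once without them. Everything numeric here is an assumption made for the simulation only: the "true" parameter values, the sample size, the ranges for schooling and age, and the rule that experience equals age minus schooling minus 6 (used only to make experience correlated with schooling). The helper names `ols` and `r_squared` are likewise hypothetical.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def r_squared(X, y, b):
    """Share of the variance of y explained by the fitted values X b."""
    mean_y = sum(y) / len(y)
    rss = sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
              for row, yi in zip(X, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - rss / tss

# Assumed "true" parameters, chosen only for this simulation.
LN_Y0, RHO, BETA1, BETA2, SIGMA = 1.0, 0.10, 0.08, -0.001, 0.3

rng = random.Random(0)
n = 2000
s = [rng.uniform(8, 18) for _ in range(n)]        # years of schooling
# Assumed rule: experience = age - schooling - 6, so x is correlated with s.
x = [rng.uniform(30, 50) - si - 6 for si in s]
ln_y = [LN_Y0 + RHO * si + BETA1 * xi + BETA2 * xi ** 2 + rng.gauss(0, SIGMA)
        for si, xi in zip(s, x)]

X_full = [[1.0, si, xi, xi ** 2] for si, xi in zip(s, x)]  # specification in the text
X_short = [[1.0, si] for si in s]                          # experience terms omitted

b_full = ols(X_full, ln_y)
b_short = ols(X_short, ln_y)

print("true rho:                 %.3f" % RHO)
print("rho, full specification:  %.3f" % b_full[1])
print("rho, experience omitted:  %.3f" % b_short[1])
print("R^2 full / short:         %.3f / %.3f"
      % (r_squared(X_full, ln_y, b_full), r_squared(X_short, ln_y, b_short)))
```

Under these assumptions, the schooling coefficient from the full specification should land close to the simulated value, while the short regression absorbs part of the experience effect into the schooling coefficient. The normal equations are solved directly only to keep the sketch dependency-free; a numerical library's least-squares routine would normally be preferable.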


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
