
Empirical statistical laws


Statistical behavior found in a wide variety of datasets

An empirical statistical law or (in popular terminology) a law of statistics represents a type of behaviour that has been found across a number of datasets and, indeed, across a range of types of data sets. Many of these observations have been formulated and proved as statistical or probabilistic theorems, and the term "law" has been carried over to these theorems. There are other statistical and probabilistic theorems that also have "law" as part of their names but are not obviously derived from empirical observations. However, both types of "law" may be considered instances of a scientific law in the field of statistics. What distinguishes an empirical statistical law from a formal statistical theorem is the way these patterns simply appear in natural distributions, without prior theoretical reasoning about the data.

Examples

There are several such popular "laws of statistics".

The Pareto principle is a popular example of such a "law". It states that roughly 80% of the effects come from 20% of the causes, and is thus also known as the 80/20 rule. In business, the 80/20 rule says that 80% of your business comes from just 20% of your customers. In software engineering, it is often said that 80% of the errors are caused by just 20% of the bugs. 20% of the world creates roughly 80% of worldwide GDP. 80% of healthcare expenses in the US are caused by 20% of the population.

Zipf's law, described as an "empirical statistical law" of linguistics, is another example. According to the "law", given some dataset of text, the frequency of a word is inversely proportional to its frequency rank. In other words, the second most common word should appear about half as often as the most common word, and the fifth most common word should appear about one-fifth as often as the most common word. However, what sets Zipf's law apart as an "empirical statistical law" rather than just a theorem of linguistics is that it applies to phenomena outside its field, too. For example, a ranked list of US metropolitan populations also follows Zipf's law, and even forgetting follows Zipf's law. This act of summarizing several natural data patterns with simple rules is a defining characteristic of these "empirical statistical laws".

Examples of empirically inspired statistical laws that have a firm theoretical basis include:
Statistical regularity
Law of large numbers
Law of truly large numbers
Central limit theorem
Regression toward the mean

Examples of "laws" with a weaker foundation include:
Safety in numbers
Benford's law

Examples of "laws" which are more general observations than having a theoretical background:
Rank–size distribution

Examples of supposed "laws" which are incorrect include:
Law of averages

See also

Laws of chance
Category: Statistical laws
Law (mathematics)

Notes

Kitcher & Salmon (2009) p.51
Bunkley, Nick (2008-03-03). "Joseph Juran, 103, Pioneer in Quality Control, Dies". The New York Times. ISSN 0362-4331. Retrieved 2017-05-05.
Staff, Investopedia (2010-11-04). "80-20 Rule". Investopedia. Retrieved 2017-05-05.
Rooney, Paula (2002-10-03). "Microsoft's CEO: 80-20 Rule Applies To Bugs, Not Just Features". CRN. Retrieved 2017-05-05.
1992 Human Development Report. United Nations Development Program. New York: Oxford University Press. 1992.
"Chart 1: Percent of Total Health Care Expenses Incurred by Different Percentiles of U.S. Population: 2002". Research in Action, Issue 19. Rockville, MD: Agency for Healthcare Research and Quality. June 2006.
Gelbukh & Sidorov (2008)
Gabaix, Xavier (2011). "The Area and Population of Cities: New Insights from a Different Perspective on Cities". American Economic Review. 101 (5): 2205–2225. arXiv:1001.5289. doi:10.1257/aer.101.5.2205. S2CID 4998367.
Anderson, John R.; Schooler, Lael J. (November 1991). "Reflections of the Environment in Memory". Psychological Science. 2 (6): 396–408. doi:10.1111/j.1467-9280.1991.tb00174.x. S2CID 8511110.

References

Kitcher, P., Salmon, W.C. (Editors) (2009) Scientific Explanation. University of Minnesota Press. ISBN 978-0-8166-5765-0
Gelbukh, A., Sidorov, G. (2008). Zipf and Heaps Laws' Coefficients Depend on Language. In: Computational Linguistics and Intelligent Text Processing (pp. 332–335), Springer. ISBN 978-3-540-41687-6
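
As a rough illustration of the Zipf's law example given in the Examples section, the inverse relationship between a word's frequency and its rank can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names zipf_expected_counts and rank_frequency are hypothetical, the exponent is taken to be exactly 1 (the idealized form described in the article), and words are split on whitespace with no further normalization beyond lowercasing.

```python
from collections import Counter


def zipf_expected_counts(top_count: float, n_ranks: int) -> list[float]:
    """Idealized Zipf counts: the word at rank r appears top_count / r times.

    Assumes an exponent of exactly 1, which real text only approximates.
    """
    return [top_count / rank for rank in range(1, n_ranks + 1)]


def rank_frequency(text: str) -> list[tuple[str, int]]:
    """Return (word, count) pairs ordered from most to least frequent.

    Hypothetical helper: lowercases the text and splits on whitespace only.
    """
    words = text.lower().split()
    return Counter(words).most_common()


if __name__ == "__main__":
    # If the most common word appears 1000 times, print the counts the
    # idealized law predicts for the five most common words.
    for rank, count in enumerate(zipf_expected_counts(1000, 5), start=1):
        print(f"rank {rank}: about {count:.0f} occurrences")
```

To check a real corpus against the idealized curve, one could compare the counts returned by rank_frequency with zipf_expected_counts seeded with the count of the most common word. Real data typically follows the pattern only approximately.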


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.