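Statistical behavior found in a wide variety of datasets

An empirical statistical law or (in popular terminology) a law of statistics represents a type of behaviour that has been found across a number of datasets and, indeed, across a range of types of datasets. Many of these observations have been formulated and proved as statistical or probabilistic theorems, and the term "law" has been carried over to these theorems. There are other statistical and probabilistic theorems that also have "law" as a part of their names but are not obviously derived from empirical observations. However, both types of "law" may be considered instances of a scientific law in the field of statistics. What distinguishes an empirical statistical law from a formal statistical theorem is the way these patterns simply appear in natural distributions, without prior theoretical reasoning about the data.

Examples

There are several such popular "laws of statistics".

The Pareto principle is a popular example of such a "law". It states that roughly 80% of the effects come from 20% of the causes, and is thus also known as the 80/20 rule. In business, the 80/20 rule says that 80% of your business comes from just 20% of your customers. In software engineering, it is often said that 80% of the errors are caused by just 20% of the bugs. 20% of the world creates roughly 80% of worldwide GDP. 80% of healthcare expenses in the US are caused by 20% of the population.

Zipf's law, described as an "empirical statistical law" of linguistics, is another example. According to the "law", given some dataset of text, the frequency of a word is inversely proportional to its frequency rank. In other words, the second most common word should appear about half as often as the most common word, and the fifth most common word should appear about one-fifth as often as the most common word. However, what sets Zipf's law apart as an "empirical statistical law" rather than just a theorem of linguistics is that it applies to phenomena outside of its field, too. For example, a ranked list of US metropolitan populations also follows Zipf's law, and even forgetting follows Zipf's law. This act of summarizing several natural data patterns with simple rules is a defining characteristic of these "empirical statistical laws".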
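The rank-frequency relationship that Zipf's law describes can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the helper names `zipf_expected_counts` and `rank_frequency` are hypothetical, the exponent is fixed at exactly 1 (the idealized form of the law), and the toy word list is constructed by hand so that it fits the law perfectly. It is not real corpus data, and real text would only follow the pattern approximately.

```python
from collections import Counter


def zipf_expected_counts(top_count, num_ranks):
    """Counts predicted by an idealized Zipf's law with exponent 1.

    The word at rank r is predicted to occur top_count / r times.
    """
    return [top_count / rank for rank in range(1, num_ranks + 1)]


def rank_frequency(words):
    """Return (rank, word, count) tuples, most common word first."""
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Hand-built toy data (an assumption for illustration, not a real corpus):
# the counts are chosen to be 60, 30, 20, 15 and 12.
toy_words = (
    ["the"] * 60 + ["of"] * 30 + ["and"] * 20 + ["to"] * 15 + ["in"] * 12
)

observed = rank_frequency(toy_words)
expected = zipf_expected_counts(observed[0][2], len(observed))

# Compare each observed count with the count the idealized law predicts.
for (rank, word, count), predicted in zip(observed, expected):
    print(f"rank {rank}: {word!r} observed={count} predicted={predicted:.1f}")
```

In this constructed case the second-ranked word occurs half as often as the first and the fifth-ranked word one-fifth as often, matching the description in the text. Applying the same comparison to a real dataset would show how closely, or how loosely, that dataset follows the law.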
Examples of empirically inspired statistical laws that have a firm theoretical basis include:
Statistical regularity
Law of large numbers
Law of truly large numbers
Central limit theorem
Regression toward the mean

Examples of "laws" with a weaker foundation include:
Safety in numbers
Benford's law

Examples of "laws" that are general observations rather than results with a theoretical background include:
Rank–size distribution

Examples of supposed "laws" which are incorrect include:
Law of averages

See also
Laws of chance
Category: Statistical laws
Law (mathematics)

Notes
Kitcher & Salmon (2009) p.51
Bunkley, Nick (2008-03-03). "Joseph Juran, 103, Pioneer in Quality Control, Dies". The New York Times. ISSN 0362-4331. Retrieved 2017-05-05.
Staff, Investopedia (2010-11-04). "80-20 Rule". Investopedia. Retrieved 2017-05-05.
Rooney, Paula (2002-10-03). "Microsoft's CEO: 80-20 Rule Applies To Bugs, Not Just Features". CRN. Retrieved 2017-05-05.
1992 Human Development Report. United Nations Development Program. New York: Oxford University Press. 1992.
"Chart 1: Percent of Total Health Care Expenses Incurred by Different Percentiles of U.S. Population: 2002". Research in Action, Issue 19. Rockville, MD: Agency for Healthcare Research and Quality. June 2006.
Gelbukh & Sidorov (2008)
Gabaix, Xavier (2011). "The Area and Population of Cities: New Insights from a Different Perspective on Cities" (PDF). American Economic Review. 101 (5): 2205–2225. arXiv:1001.5289. doi:10.1257/aer.101.5.2205. S2CID 4998367.
Anderson, John R.; Schooler, Lael J. (November 1991). "Reflections of the Environment in Memory" (PDF). Psychological Science. 2 (6): 396–408. doi:10.1111/j.1467-9280.1991.tb00174.x. S2CID 8511110.

References
Kitcher, P., Salmon, W.C. (Editors) (2009). Scientific Explanation. University of Minnesota Press. ISBN 978-0-8166-5765-0.
Gelbukh, A., Sidorov, G. (2008). Zipf and Heaps Laws’ Coefficients Depend on Language. In: Computational Linguistics and Intelligent Text Processing (pp. 332–335), Springer. ISBN 978-3-540-41687-6.