Knowledge (XXG)

Diehard tests

Source 📝

387:
counts the number of missing words – that is 2-letter words which do not appear in the entire sequence. That count should be very close to normally distributed with mean 141909, sigma 290. Thus (missingwrds-141909) / 290 should be a standard normal variable. The OPSO test takes 32 bits at a time from the test file and uses a designated set of ten consecutive bits. It then restarts the file for the next designated 10 bits, and so on. OQSO means overlapping-quadruples-sparse-occupancy. The test OQSO is similar, except that it considers 4-letter words from an alphabet of 32 letters, each letter determined by a designated string of 5 consecutive bits from the test file, elements of which are assumed 32-bit random integers. The mean number of missing words in a sequence of 2 four-letter words, (2 + 3 "keystrokes"), is again 141909, with sigma = 295. The mean is based on theory; sigma comes from extensive simulation. The DNA test considers an alphabet of 4 letters C, G, A, T, determined by two designated bits in the sequence of random integers being tested. It considers 10-letter words, so that as in OPSO and OQSO, there are 2 possible words, and the mean number of missing words from a string of 2 (overlapping) 10-letter words (2 + 9 "keystrokes") is 141909. The standard deviation sigma = 339 was determined as for OQSO by simulation. (Sigma for OPSO, 290, is the true value (to three places), not determined by simulation.
405:
string of (overlapping) 5-letter words, each "letter" taking values A, B, C, D, E. The letters are determined by the number of 1s, in that byte 0, 1, or 2 → A, 3 → B, 4 → C, 5 → D, and 6, 7 or 8 → E. Thus we have a monkey at a typewriter hitting five keys with various probabilities 37, 56, 70, 56, 37 over 256. There are 5 possible 5-letter words, and from a string of 256000 (overlapping) 5-letter words, counts are made on the frequencies for each word. The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a chisquare test Q5 – Q4, the difference of the naive Pearson sums of
394:
The letters are determined by the number of 1s in a byte 0, 1, or 2 yield A, 3 yields B, 4 yields C, 5 yields D and 6, 7 or 8 yield E. Thus we have a monkey at a typewriter hitting five keys with various probabilities (37, 56, 70, 56, 37 over 256). There are 5 possible 5-letter words, and from a string of 256000 (overlapping) 5-letter words, counts are made on the frequencies for each word. The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a chisquare test Q5–Q4, the difference of the naive Pearson sums of
294:
observed, cumulative counts are made of the number of occurrences of each state. Then the quadratic form in the weak inverse of the 120×120 covariance matrix yields a test equivalent to the likelihood ratio test that the 120 cell counts came from the specified (asymptotically) normal distribution with the specified 120×120 covariance matrix (with rank 99). This version uses 1000000 integers, twice. This test may have unresolved bugs resulting in consistently poor p-values.
641:. Throws necessary to complete the game can vary from 1 to infinity, but counts for all > 21 are lumped with 21. A chi-square test is made on the no.-of-throws cell counts. Each 32-bit integer from the test file provides the value for the throw of a die, by floating to [0,1), multiplying by 6 and taking 1 plus the integer part of the result. 321:
From each of six random 32-bit integers from the generator under test, a specified byte is chosen, and the resulting six bytes form a 6×8 binary matrix whose rank is determined. That rank can be from 0 to 6, but ranks 0, 1, 2, 3 are rare; their counts are pooled with those for rank 4. Ranks are found
314:
A random 32×32 binary matrix is formed, each row a 32-bit random integer. The rank is determined. That rank can be from 0 to 32, ranks less than 29 are rare, and their counts are pooled with those for rank 29. Ranks are found for 40000 such random matrices and a chi square test is performed on counts
293:
This is the OPERM5 test. It looks at a sequence of one million 32-bit random integers. Each set of five consecutive integers can be in one of 120 states, for the 5! possible orderings of five numbers. Thus the 5th, 6th, 7th, ... numbers each provide a state. As many thousands of state transitions are
393:
Consider the file under test as a stream of bytes (four per 32-bit integer). Each byte can contain from none to eight 1s, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the stream of bytes provide a string of overlapping 5-letter words, each "letter" taking values A, B, C, D, E.
611:
It counts runs up, and runs down, in a sequence of uniform [0,1) variables, obtained by floating the 32-bit integers in the specified file. This example shows how runs are counted: 0.123, 0.357, 0.789, 0.425, 0.224, 0.416, 0.95 contains an up-run of length 3, a down-run of length 2 and an up-run of
404:
Consider the file under test as a stream of 32-bit integers. From each integer, a specific byte is chosen, say the leftmost bits 1 to 8. Each byte can contain from 0 to 8 1s, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the specified bytes from successive integers provide a
386:
OPSO means overlapping-pairs-sparse-occupancy. The OPSO test considers 2-letter words from an alphabet of 1024 letters. Each letter is determined by a specified ten bits from a 32-bit integer in the sequence to be tested. OPSO generates 2 (overlapping) 2-letter words (from 2 + 1 "keystrokes") and
612:(at least) 2, depending on the next values. The covariance matrices for the runs-up and runs-down are well known, leading to chi-square tests for quadratic forms in the weak inverses of the covariance matrices. Runs are counted for sequences of length 10000. This is done ten times. Then repeated. 424:
the number successfully parked, we get a curve that should be similar to those provided by a perfect random number generator. Theory for the behavior of such a random curve seems beyond reach, and as graphics displays are not available for this battery of tests, a simple characterization of the
415:
In a square of side 100, randomly "park" a car – a circle of radius 1. Then try to park a 2nd, a 3rd, and so on, each time parking "by ear". That is, if an attempt to park a car causes a crash with one already parked, try again at a new random location. (To avoid path problems, consider parking
485:
should be uniform on [0,1) and a KSTEST on the resulting 100 values serves as a test of uniformity for random points in the square. Test numbers = 0 mod 5 are printed but the KSTEST is based on the full set of 100 random choices of 8000 points in the 10000×10000
278:
value. The first test uses bits 1–24 (counting from the left) from integers in the specified file. Then the file is closed and reopened. Next, bits 2–25 are used to provide birthdays, then 3–26 and so on to bits 9–32. Each set of bits provides a
360:, and so on. The bitstream test counts the number of missing 20-letter (20-bit) words in a string of 2 overlapping 20-letter words. There are 2 possible 20-letter words. For a truly random string of 2 + 19 bits, the number of missing words 135:
Randomly place unit circles in a 100×100 square. A circle is successfully parked if it does not overlap an existing successfully parked one. After 12,000 tries, the number of successfully parked circles should follow a certain
492:
Choose 4000 random points in a cube of edge 1000. At each point, center a sphere large enough to reach the next closest point. Then the volume of the smallest such sphere is (very close to) exponentially distributed with mean
497:/ 3. Thus the radius cubed is exponential with mean 30. (The mean is obtained by extensive simulation). The 3D spheres test generates 4000 such spheres 20 times. Each min radius cubed leads to a uniform variable by means of 156:
Randomly choose 4000 points in a cube of edge 1000. Center a sphere on each point, whose radius is the minimum distance to another point. The smallest sphere's volume should be exponentially distributed with a certain
304:{0,1}. The rank is determined. That rank can be from 0 to 31, but ranks < 28 are rare, and their counts are pooled with those for rank 28. Ranks are found for 40000 such random matrices and a 119:
Treat sequences of some number of bits as "words". Count the overlapping words in a stream. The number of "words" that do not appear should follow a known distribution. The name is based on the
618:
It plays 200000 games of craps, finds the number of wins and the number of throws necessary to end each game. The number of wins should be (very close to) a normal with mean 200000
810: 416:
helicopters rather than cars.) Each attempt leads to either a crash or a success, the latter followed by an increment to the list of cars already parked. If we plot
336:, ... . Consider an alphabet with two "letters", 0 and 1, and think of the stream of bits as a succession of 20-letter "words", overlapping. Thus the first word is b 65:'s Diehard tests (1995) consisting of fifteen different tests. The inability to modify the test parameters or add new tests led to the development of the 129:
Count the 1 bits in each of either successive or chosen bytes. Convert the counts to "letters", and count the occurrences of five-letter "words".
146:
Randomly place 8000 points in a 10000×10000 square, then find the minimum distance between the pairs. The square of this distance should be
942: 783: 677:
is just an asymptotic approximation, for which the fit will be worst in the tails. Thus you should not be surprised with occasional
99:
Analyze sequences of five consecutive random numbers. The 120 possible orderings should occur with statistically equal probability.
445:
should be a standard normal variable, which, converted to a uniform variable, provides input to a KSTEST based on a sample of 10.
271: 53: 167:) until you reach 1. Repeat this 100000 times. The number of floats needed to reach 1 should follow a certain distribution. 601:
s converts them to a sequence of independent standard normals, which are converted to uniform variables for a KSTEST. The
177:). Add sequences of 100 consecutive floats. The sums should be normally distributed with characteristic mean and variance. 180: 697:
s happen among the hundreds of events DIEHARD produces, even conditioned on the random number generator being perfect.
70: 300:
The leftmost 31 bits of 31 random integers from the test sequence are used to form a 31×31 binary matrix over the
910: 735: 477:, the square of the minimum distance should be (very close to) exponentially distributed with mean 0.995. Thus 147: 86: 906: 731: 120: 28: 693:> 0.975 means that the RNG has "failed the test at the 0.05 level". We expect a number of such events 649:-value, which should be uniform on [0,1) if the input file contains truly independent random bits. Those 110: 884:
PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation
199:, counting the wins and the number of throws per game. Each count should follow a certain distribution. 106: 891: 301: 229: 137: 85:
Choose random points on a large interval. The spacings between the points should be asymptotically
867: 849: 805: 681:-values near 0 or 1, such as 0.0012 or 0.9983. When a bit stream really FAILS BIG, you will get 322:
for 100000 random matrices, and a chi square test is performed on counts for ranks 6, 5 and ≤ 4.
882: 859: 819: 90: 62: 32: 24: 51:
An initial battery of randomness tests for RNGs was suggested in the 1969 first edition of
787: 706: 305: 247:≥ 2, for comparing the results to the Poisson distribution with that mean. This test uses 685:
s of 0 or 1 to six or more places. Since there are many tests, it is not unlikely that a
907:"The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness" 732:"The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness" 597:
s are virtually normal with a certain covariance matrix. A linear transformation of the
189:). Count ascending and descending runs. The counts should follow a certain distribution. 936: 925: 754: 364:
should be (very close to) normally distributed with mean 141,909 and sigma 428. Thus
871: 437:
should average 3523 with sigma 21.9 and is very close to normally distributed. Thus
834: 58: 61:(Volume 2, Chapter 3.3: Statistical tests). Knuth's tests were then supplanted by 39:
of random numbers. In 2006, the original diehard tests were extended into the
835:"An experimental exploration of Marsaglia's xorshift generators, scrambled" 69:
library, introduced in 2007 by Pierre L’Ecuyer and Richard Simard of the
105:
Select some number of bits from some number of random numbers to form a
823: 711: 66: 224:
is the number of values that occur more than once in that list, then
36: 863: 515:
Random integers are floated to get uniforms on [0,1). Starting with
914: 739: 471:
pairs of points. If the points are truly independent uniform, then
854: 196: 543:
provided by floating integers from the file being tested. Such
547:
s are found 100000 times, then counts for the number of times
328:
The file under test is viewed as a stream of bits. Call them b
569:(2), ... of uniform [0,1) variables. Then overlapping sums, 555:
are used to provide a chi-square test for cell frequencies.
669:
is the assumed distribution of the sample random variable
605:-values from ten KSTESTs are given still another KSTEST. 455:= 8000 random points in a square of side 10000. Find 308:
is performed on counts for ranks 31, 30, 29 and ≤ 28.
35:
over several years and first published in 1995 on a
220:days. List the spacings between the birthdays. If 811:Acta Mathematica Academiae Scientiarum Hungaricae 523:, the number of iterations necessary to reduce 429:, the number of cars successfully parked after 243:. Experience shows n must be quite large, say 185:Generate a long sequence of random floats on ( 173:Generate a long sequence of random floats on ( 808:(1953). "On the theory of order statistics". 255:= 2, so that the underlying distribution for 8: 390:The count-the-1's test on a stream of bytes 409:on counts for 5- and 4-letter cell counts. 398:on counts for 5- and 4-letter cell counts. 853: 842:ACM Transactions on Mathematical Software 401:The count-the-1's test for specific bytes 380:value. The test is repeated twenty times. 881:O'Neill, Melissa E. (5 September 2014). 433:= 12000 attempts. Simulation shows that 926:"Dieharder: A Random Number Test Suite" 723: 561:Integers are floated to get a sequence 311:The binary rank test for 32×32 matrices 297:The binary rank test for 31×31 matrices 784:"Robert G. Brown's General Tools Page" 645:Most of the tests in DIEHARD return a 287:-values provide a sample for a KSTEST. 376:score) that leads to a uniform [0,1) 372:should be a standard normal variate ( 318:The binary rank test for 6×8 matrices 7: 459:, the minimum distance between the 505:, then a KSTEST is done on the 20 290:The overlapping 5-permutation test 14: 673:– often normal. But that assumed 519:= 2 = 2147483648, the test finds 420:: the number of attempts, versus 163:Multiply 2 by random floats on ( 833:Vigna, Sebastiano (July 2016). 272:chi-square goodness of fit test 113:of the matrix. Count the ranks. 109:over {0,1}, then determine the 54:The Art of Computer Programming 27:for measuring the quality of a 451:It does this 100 times choose 315:for ranks 32, 31, 30 and ≤ 29. 1: 383:The tests OPSO, OQSO and DNA 259:is taken to be Poisson with 16:Battery of statistical tests 593:(101), ... are formed. The 425:random experiment is used: 89:. The name is based on the 959: 527:to 1, using the reduction 209:The birthday spacings test 558:The overlapping sums test 448:The minimum distance test 148:exponentially distributed 87:exponentially distributed 31:. They were developed by 943:Random number generation 911:Florida State University 736:Florida State University 653:-values are obtained by 96:Overlapping permutations 216:birthdays in a year of 121:infinite monkey theorem 29:random number generator 913:. 1995. Archived from 738:. 1995. Archived from 71:Université de Montréal 553:≤ 6, 7, ..., 47, ≥ 48 283:-value, and the nine 195:Play 200000 games of 170:Overlapping sums test 143:Minimum distance test 890:(Technical report). 412:The parking lot test 370:−141909) / 428 150:with a certain mean. 894:. HMC-CS-2014-0905. 892:Harvey Mudd College 489:The 3D spheres test 230:Poisson-distributed 153:Random spheres test 138:normal distribution 824:10.1007/BF02127580 325:The bitstream test 270:s is taken, and a 266:. A sample of 500 228:is asymptotically 924:Robert G. Brown. 773:Renyi, 1953, p194 753:Brown, Robert G. 407:(OBS − EXP) / EXP 348:, the second is b 204:Test descriptions 102:Ranks of matrices 82:Birthday spacings 25:statistical tests 23:are a battery of 950: 929: 918: 895: 889: 875: 857: 839: 827: 818:(3–4): 191–231. 792: 791: 786:. Archived from 780: 774: 771: 765: 764: 762: 761: 750: 744: 743: 728: 640: 633: 554: 550: 546: 522: 512:The squeeze test 504: 496: 484: 476: 470: 444: 408: 397: 371: 369: 363: 269: 265: 258: 242: 227: 223: 188: 176: 166: 160:The squeeze test 132:Parking lot test 91:birthday paradox 63:George Marsaglia 42: 33:George Marsaglia 958: 957: 953: 952: 951: 949: 948: 947: 933: 932: 923: 905: 902: 887: 880: 864:10.1145/2845077 837: 832: 804: 801: 799:Further reading 796: 795: 782: 781: 777: 772: 768: 759: 757: 752: 751: 747: 730: 729: 725: 720: 707:Randomness test 703: 635: 623: 552: 548: 544: 520: 498: 494: 478: 472: 460: 438: 406: 396:(OBS-EXP) / EXP 395: 367: 365: 361: 359: 355: 351: 347: 343: 339: 335: 331: 306:chi-square test 267: 260: 256: 233: 225: 221: 206: 186: 174: 164: 79: 49: 40: 17: 12: 11: 5: 956: 954: 946: 945: 935: 934: 931: 930: 920: 919: 917:on 2016-01-25. 901: 900:External links 898: 897: 896: 877: 876: 829: 828: 800: 797: 794: 793: 790:on 2017-07-03. 775: 766: 745: 742:on 2016-01-25. 722: 721: 719: 716: 715: 714: 709: 702: 699: 689:< 0.025 or 643: 642: 616: 615:The craps test 613: 609: 606: 559: 556: 513: 510: 490: 487: 449: 446: 443:− 3523) / 21.9 413: 410: 402: 399: 391: 388: 384: 381: 357: 353: 349: 345: 341: 337: 333: 329: 326: 323: 319: 316: 312: 309: 298: 295: 291: 288: 210: 205: 202: 201: 200: 193: 192:The craps test 190: 183: 178: 171: 168: 161: 158: 154: 151: 144: 141: 133: 130: 127: 124: 117: 114: 103: 100: 97: 94: 83: 78: 75: 48: 45: 15: 13: 10: 9: 6: 4: 3: 2: 955: 944: 941: 940: 938: 927: 922: 921: 916: 912: 908: 904: 903: 899: 893: 886: 885: 879: 878: 873: 869: 865: 861: 856: 851: 847: 843: 836: 831: 830: 825: 821: 817: 813: 812: 807: 806:Rényi, Alfréd 803: 802: 798: 789: 785: 779: 776: 770: 767: 756: 749: 746: 741: 737: 733: 727: 724: 717: 713: 710: 708: 705: 704: 700: 698: 696: 692: 688: 684: 680: 676: 672: 668: 664: 660: 656: 652: 648: 638: 631: 627: 622:and variance 621: 617: 614: 610: 608:The runs test 607: 604: 600: 596: 592: 588: 584: 580: 576: 572: 568: 564: 560: 557: 542: 538: 534: 530: 526: 518: 514: 511: 508: 502: 491: 488: 482: 475: 468: 464: 458: 454: 450: 447: 442: 436: 432: 428: 423: 419: 414: 411: 403: 400: 392: 389: 385: 382: 379: 375: 327: 324: 320: 317: 313: 310: 307: 303: 299: 296: 292: 289: 286: 282: 277: 273: 263: 254: 250: 246: 240: 236: 231: 219: 215: 211: 208: 207: 203: 198: 194: 191: 184: 182: 179: 172: 169: 162: 159: 155: 152: 149: 145: 142: 139: 134: 131: 128: 125: 122: 118: 115: 112: 108: 104: 101: 98: 95: 92: 88: 84: 81: 80: 77:Test overview 76: 74: 72: 68: 64: 60: 56: 55: 46: 44: 38: 34: 30: 26: 22: 21:diehard tests 915:the original 883: 845: 841: 815: 809: 788:the original 778: 769: 758:. Retrieved 748: 740:the original 726: 694: 690: 686: 682: 678: 674: 670: 666: 662: 658: 654: 650: 646: 644: 636: 629: 625: 619: 602: 598: 594: 590: 589:(2) + ... + 586: 582: 578: 577:(1) + ... + 574: 570: 566: 562: 540: 536: 532: 528: 524: 516: 506: 500: 480: 473: 466: 462: 456: 452: 440: 434: 430: 426: 421: 417: 377: 373: 284: 280: 275: 261: 252: 248: 244: 238: 234: 217: 213: 126:Count the 1s 116:Monkey tests 59:Donald Knuth 52: 50: 20: 18: 755:"dieharder" 639:= 244 / 495 274:provides a 264:= 2 / 2 = 2 760:2023-09-25 718:References 531:= ceiling( 232:with mean 855:1402.6246 848:(4): 30. 665:), where 499:1 − exp(− 479:1 − exp(− 181:Runs test 41:dieharder 937:Category 872:13936073 701:See also 539:), with 509:-values. 483:/ 0.995) 251:= 2 and 712:TestU01 634:, with 581:(100), 486:square. 212:Choose 67:TestU01 47:History 43:tests. 870:  624:200000 585:(2) = 573:(1) = 107:matrix 37:CD-ROM 888:(PDF) 868:S2CID 850:arXiv 838:(PDF) 628:(1 − 565:(1), 503:/ 30) 469:) / 2 302:field 197:craps 157:mean. 551:was 356:...b 344:...b 237:/ (4 111:rank 19:The 860:doi 820:doi 493:120 332:, b 187:0,1 175:0,1 165:0,1 57:by 939:: 909:. 866:. 858:. 846:42 844:. 840:. 814:. 734:. 657:= 465:− 358:21 346:20 73:. 928:. 874:. 862:: 852:: 826:. 822:: 816:4 763:. 695:p 691:p 687:p 683:p 679:p 675:F 671:X 667:F 663:X 661:( 659:F 655:p 651:p 647:p 637:p 632:) 630:p 626:p 620:p 603:p 599:S 595:S 591:U 587:U 583:S 579:U 575:U 571:S 567:U 563:U 549:j 545:j 541:U 537:U 535:× 533:k 529:k 525:k 521:j 517:k 507:p 501:r 495:π 481:d 474:d 467:n 463:n 461:( 457:d 453:n 441:k 439:( 435:k 431:n 427:k 422:k 418:n 378:p 374:z 368:j 366:( 362:j 354:3 352:b 350:2 342:2 340:b 338:1 334:2 330:1 285:p 281:p 276:p 268:j 262:λ 257:j 253:m 249:n 245:n 241:) 239:n 235:m 226:j 222:j 218:n 214:m 140:. 123:. 93:.

Index

statistical tests
random number generator
George Marsaglia
CD-ROM
The Art of Computer Programming
Donald Knuth
George Marsaglia
TestU01
Université de Montréal
exponentially distributed
birthday paradox
matrix
rank
infinite monkey theorem
normal distribution
exponentially distributed
Runs test
craps
Poisson-distributed
chi-square goodness of fit test
field
chi-square test
Randomness test
TestU01
"The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness"
Florida State University
the original
"dieharder"
"Robert G. Brown's General Tools Page"
the original

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.