Knowledge

Counting Bloom filter

Source 📝

377: 531: 770: 699: 195:
counters for a given input. This will introduce the probability of false negatives during a query if the deleted input has not previously been inserted into the filter. Guo
725: 240: 667: 628: 595: 407: 540:
in counting Bloom filter. However, following the assumption in Bloom filter, above probability is defined as false positive of counting Bloom filter. If
43:
are not – in other words, a query returns either "possibly bigger or equal than the threshold" or "definitely smaller than the threshold".
920: 828:
Tarkoma, Sasu; Rothenberg, Christian Esteve; Lagerspetz, Eemil (2012). "Theory and Practice of Bloom Filters for Distributed Systems".
730: 40: 36: 837: 672: 28: 180: 145:– if it were more and equal, then all the corresponding counters would have been greater or equal to 842: 704: 398:
Therefore, the probability that counting Bloom filter determines an element is greater or equal to
372:{\displaystyle b(l,kn,{\frac {1}{m}})={kn \choose l}({\frac {1}{m}})^{l}(1-{\frac {1}{m}})^{kn-l}} 231:
balls are inserted randomly. So the probability of one of counter in counting Bloom filter counts
191:
Several implementations of counting bloom filters allow for deletion, by decrementing each of the
863: 810: 526:{\displaystyle p_{fp}(\theta ,k,n,m)=(1-\sum \limits _{l<\theta }b(l,kn,{\frac {1}{m}}))^{k}} 386:
is binomial distribution. A counting Bloom filter determines an element is greater or equal to
855: 80: 639: 600: 567: 894: 847: 802: 87:
counter array positions, generating a uniform random distribution. It is also similar that
537: 166: 24: 914: 76: 21: 867: 223:, which hash functions make insertions uniform random, is also assumed here. In the 851: 814: 220: 173: 52: 32: 899: 882: 199:
present the problem in great detail, and provide heuristics for the parameters
859: 806: 137:
counter positions. If any of the counters at these positions is less than
59:
is the number of counters in counting Bloom filter, which is expansion of
633: 161:. If all are greater or equal to θ even though the count is smaller than 790: 31:
in a sequence exceeds a given threshold. As a generalized form of the
881:
Lee, Sunggu; Lee, Youngjoo; Jeong, Yongjo; Kim, Kibeom (July 2019).
71:
counters, all set to 0. Similar to Bloom filter, there must also be
179:
A counting Bloom filter is essentially the same data structure as
27:
that is used to test whether the number of occurrences of a given
98:
The main generalization of Bloom filter is adding an element. To
883:"Analysis of Counting Bloom Filters Used for Count Thresholding" 95:, which is proportional to the number of elements to be added. 125:(test whether the count number of an element is smaller than 157:, or the counters have by chance been greater or equal to 789:
Deke Guo; Yunhao Liu; Xiangyang Li; Panlong Yang (2010).
544:=1, the equation becomes false positive of Bloom filter. 153:, then either the count is really greater or equal to 141:, the count number of element is definitely less than 733: 707: 675: 642: 603: 570: 410: 243: 765:{\displaystyle {\frac {m}{n}}(0.2037\theta +0.9176)} 795:
IEEE Transactions on Knowledge and Data Engineering
211:which minimize the probability of false negatives. 169:. This also should be minimized like Bloom filter. 764: 719: 693: 661: 622: 589: 525: 371: 791:"False Negative Problem of Counting Bloom Filter" 302: 284: 727:they suggested using the floor or ceiling of 51:Most of the parameters are defined same with 8: 536:This is different from formal definition of 830:IEEE Communications Surveys & Tutorials 172:About hashing problem and advantages, see 898: 841: 734: 732: 706: 674: 647: 641: 608: 602: 575: 569: 517: 500: 467: 415: 409: 354: 340: 325: 311: 301: 283: 281: 265: 242: 83:or hashes some set element to one of the 781: 114:the counters 1 at all these positions. 7: 694:{\displaystyle 1\leq \theta \leq 30} 560:, the false positive decreases from 464: 102:an element, feed it to each of the 288: 165:, this circumstance is defined as 14: 394:counters are greater or equal to 149:. If all are greater or equal to 91:is a constant, much smaller than 548:Optimal number of hash functions 121:for an element with a threshold 852:10.1109/surv.2011.031611.00024 759: 744: 514: 510: 482: 454: 448: 424: 351: 331: 322: 308: 275: 247: 215:Probability of false positives 1: 720:{\displaystyle \theta >30} 187:Potential for false negatives 183:, but are used differently. 65:empty counting Bloom filter 937: 921:Hash-based data structures 900:10.3390/electronics8070779 636:shows numerical values of 129:), feed it to each of the 39:matches are possible, but 63:bits in Bloom filter. An 219:The same assumptions in 662:{\displaystyle k_{opt}} 623:{\displaystyle k_{opt}} 590:{\displaystyle k_{opt}} 390:when the corresponding 79:defined, each of which 766: 721: 695: 663: 630:to positive infinity. 624: 591: 564:=1 to a point defined 527: 373: 133:hash functions to get 106:hash functions to get 807:10.1109/TKDE.2009.209 767: 722: 696: 664: 625: 597:, and increases from 592: 528: 374: 47:Algorithm description 18:counting Bloom filter 731: 705: 673: 640: 601: 568: 552:For large but fixed 408: 241: 110:array positions and 762: 717: 691: 659: 620: 587: 523: 478: 369: 181:count–min sketches 742: 634:Kim et al. (2019) 508: 463: 348: 319: 300: 273: 928: 905: 904: 902: 878: 872: 871: 845: 825: 819: 818: 786: 771: 769: 768: 763: 743: 735: 726: 724: 723: 718: 700: 698: 697: 692: 668: 666: 665: 660: 658: 657: 629: 627: 626: 621: 619: 618: 596: 594: 593: 588: 586: 585: 532: 530: 529: 524: 522: 521: 509: 501: 477: 423: 422: 378: 376: 375: 370: 368: 367: 349: 341: 330: 329: 320: 312: 307: 306: 305: 296: 287: 274: 266: 936: 935: 931: 930: 929: 927: 926: 925: 911: 910: 909: 908: 880: 879: 875: 843:10.1.1.457.4228 827: 826: 822: 788: 787: 783: 778: 729: 728: 703: 702: 671: 670: 643: 638: 637: 604: 599: 598: 571: 566: 565: 550: 513: 411: 406: 405: 350: 321: 289: 282: 239: 238: 217: 189: 49: 41:false negatives 12: 11: 5: 934: 932: 924: 923: 913: 912: 907: 906: 873: 836:(1): 131–155. 820: 801:(5): 651–664. 780: 779: 777: 774: 761: 758: 755: 752: 749: 746: 741: 738: 716: 713: 710: 690: 687: 684: 681: 678: 656: 653: 650: 646: 617: 614: 611: 607: 584: 581: 578: 574: 549: 546: 538:false positive 520: 516: 512: 507: 504: 499: 496: 493: 490: 487: 484: 481: 476: 473: 470: 466: 462: 459: 456: 453: 450: 447: 444: 441: 438: 435: 432: 429: 426: 421: 418: 414: 366: 363: 360: 357: 353: 347: 344: 339: 336: 333: 328: 324: 318: 315: 310: 304: 299: 295: 292: 286: 280: 277: 272: 269: 264: 261: 258: 255: 252: 249: 246: 216: 213: 188: 185: 167:false positive 77:hash functions 48: 45: 37:false positive 25:data structure 13: 10: 9: 6: 4: 3: 2: 933: 922: 919: 918: 916: 901: 896: 892: 888: 884: 877: 874: 869: 865: 861: 857: 853: 849: 844: 839: 835: 831: 824: 821: 816: 812: 808: 804: 800: 796: 792: 785: 782: 775: 773: 756: 753: 750: 747: 739: 736: 714: 711: 708: 688: 685: 682: 679: 676: 654: 651: 648: 644: 635: 631: 615: 612: 609: 605: 582: 579: 576: 572: 563: 559: 555: 547: 545: 543: 539: 534: 518: 505: 502: 497: 494: 491: 488: 485: 479: 474: 471: 468: 460: 457: 451: 445: 442: 439: 436: 433: 430: 427: 419: 416: 412: 403: 401: 397: 393: 389: 385: 380: 364: 361: 358: 355: 345: 342: 337: 334: 326: 316: 313: 297: 293: 290: 278: 270: 267: 262: 259: 256: 253: 250: 244: 236: 234: 230: 226: 222: 214: 212: 210: 206: 202: 198: 194: 186: 184: 182: 177: 175: 170: 168: 164: 160: 156: 152: 148: 144: 140: 136: 132: 128: 124: 120: 115: 113: 109: 105: 101: 96: 94: 90: 86: 82: 78: 74: 70: 66: 62: 58: 54: 46: 44: 42: 38: 34: 30: 26: 23: 22:probabilistic 19: 890: 886: 876: 833: 829: 823: 798: 794: 784: 632: 561: 557: 553: 551: 541: 535: 404: 399: 395: 391: 387: 383: 381: 237: 232: 228: 224: 221:Bloom filter 218: 208: 204: 200: 196: 192: 190: 178: 174:Bloom filter 171: 162: 158: 154: 150: 146: 142: 138: 134: 130: 126: 122: 118: 116: 111: 107: 103: 99: 97: 92: 88: 84: 72: 68: 64: 60: 56: 53:Bloom filter 50: 33:Bloom filter 17: 15: 887:Electronics 893:(7): 779. 776:References 75:different 55:, such as 860:1553-877X 838:CiteSeerX 751:θ 709:θ 686:≤ 683:θ 680:≤ 475:θ 465:∑ 461:− 428:θ 362:− 338:− 112:increment 915:Category 868:17216682 815:3934679 669:within 57:m, k. m 29:element 866:  858:  840:  813:  757:0.9176 748:0.2037 701:. For 382:where 227:pots, 207:, and 197:et al. 864:S2CID 811:S2CID 119:query 67:is a 20:is a 856:ISSN 712:> 556:and 472:< 81:maps 895:doi 848:doi 803:doi 402:is 235:is 117:To 100:add 917:: 889:. 885:. 862:. 854:. 846:. 834:14 832:. 809:. 799:22 797:. 793:. 772:. 715:30 689:30 533:. 396:θ. 379:, 229:kn 203:, 176:. 35:, 16:A 903:. 897:: 891:8 870:. 850:: 817:. 805:: 760:) 754:+ 745:( 740:n 737:m 677:1 655:t 652:p 649:o 645:k 616:t 613:p 610:o 606:k 583:t 580:p 577:o 573:k 562:k 558:m 554:n 542:θ 519:k 515:) 511:) 506:m 503:1 498:, 495:n 492:k 489:, 486:l 483:( 480:b 469:l 458:1 455:( 452:= 449:) 446:m 443:, 440:n 437:, 434:k 431:, 425:( 420:p 417:f 413:p 400:θ 392:k 388:θ 384:b 365:l 359:n 356:k 352:) 346:m 343:1 335:1 332:( 327:l 323:) 317:m 314:1 309:( 303:) 298:l 294:n 291:k 285:( 279:= 276:) 271:m 268:1 263:, 260:n 257:k 254:, 251:l 248:( 245:b 233:l 225:m 209:n 205:k 201:m 193:k 163:θ 159:θ 155:θ 151:θ 147:θ 143:θ 139:θ 135:k 131:k 127:θ 123:θ 108:k 104:k 93:m 89:k 85:m 73:k 69:m 61:m

Index

probabilistic
data structure
element
Bloom filter
false positive
false negatives
Bloom filter
hash functions
maps
false positive
Bloom filter
count–min sketches
Bloom filter
false positive
Kim et al. (2019)
"False Negative Problem of Counting Bloom Filter"
doi
10.1109/TKDE.2009.209
S2CID
3934679
CiteSeerX
10.1.1.457.4228
doi
10.1109/surv.2011.031611.00024
ISSN
1553-877X
S2CID
17216682
"Analysis of Counting Bloom Filters Used for Count Thresholding"
doi

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.