25:
2021:
efficient
Content-Defined Chunking approach. It uses a fast rolling Gear hash algorithm, skipping the minimum length, normalizing the chunk-size distribution, and last but not the least, rolling two bytes each time to speed up the CDC algorithm, which can achieve about 10X higher throughput than Rabin-based CDC approach.
2020:
The
Content-Defined Chunking algorithm needs to compute the hash value of a data stream byte by byte and split the data stream into chunks when the hash value meets a predefined value. However, comparing a string byte-by-byte will introduce the heavy computation overhead. FastCDC proposes a new and
1507:
One of the interesting use cases of the rolling hash function is that it can create dynamic, content-based chunks of a stream or file. This is especially useful when it is required to send only the changed chunks of a large file over a network: a simple byte addition at the front of the file would
2219:
Where Gear array is a pre-calculated hashing array. Here FastCDC uses Gear hashing algorithm which can calculate the rolling hashing results quickly and keep the uniform distribution of the hashing results as Rabin. Compared with the traditional Rabin hashing algorithm, it achieves a much faster
651:
Because it shares the same author as the Rabin–Karp string search algorithm, which is often explained with another, simpler rolling hash, and because this simpler rolling hash is also a polynomial, both rolling hashes are often mistaken for each other. The backup software
2016:
Chunking is a technique to divide a data stream into a set of blocks, also called chunks. Content-defined chunking (CDC) is a chunking technique in which the division of the data stream is not based on fixed chunk size, as in fixed-size chunking, but on its content.
1009:
75:
A few hash functions allow a rolling hash to be computed very quickly—the new hash value is rapidly calculated given only the old hash value, the old value removed from the window, and the new value added to the window—similar to the way a
2228:
All rolling hash functions can be computed in time linear in the number of characters and updated in constant time when characters are shifted by one position. In particular, computing the Rabin-Karp rolling hash of a string of length
648:). The hash is the result of the division of that polynomial by an irreducible polynomial over GF(2). It is possible to update a Rabin fingerprint using only the entering and the leaving byte, making it effectively a rolling hash.
1335:
277:
1675:
2220:
speed. Experiments suggest that it can generate nearly the same chunk size distribution in the much shorter time (about 1/10 of rabin-based chunking ) when segmenting the data stream.
1497:
472:
Removing and adding characters simply involves adding or subtracting the first or last term. Shifting all characters by one position to the left requires multiplying the entire sum
838:
1734:
1129:
1228:
353:
1547:
806:
1051:
743:
1960:
1371:
826:
603:
1818:
1406:
1156:
2615:
2305:
2276:
1898:
1869:
1769:
2006:
1452:
1904:
Shifting the window by one byte simply involves adding the new character to the sum and subtracting the oldest character (no longer in the window) from the sum.
1574:
2247:
1980:
1925:
1838:
1789:
1426:
1176:
1094:
1074:
767:
704:
623:
570:
550:
530:
510:
490:
463:
443:
423:
403:
380:
301:
1511:
A simple approach to making dynamic chunks is to calculate a rolling hash, and if the hash value matches an arbitrary pattern (e.g. all zeroes) in the lower
131:
92:
1549:, given the hash has a uniform probability distribution) then it’s chosen to be a chunk boundary. Each chunk will thus have an average size of
2530:
Xia, Wen; Jiang, Hong; Feng, Dan; Tian, Lei; Fu, Min; Zhou, Yukun (2014). "Ddelta: A deduplication-inspired fast delta compression approach".
1236:
644:. Instead of seeing the input as a polynomial of bytes, it is seen as a polynomial of bits, and all arithmetic is done in GF(2) (similarly to
140:
678:
Hashing by cyclic polynomial—sometimes called
Buzhash—is also simple and it has the benefit of avoiding multiplications, using
2008:. This approach will ensure that any change in the file will only affect its current and possibly the next chunk, but no other chunk.
2488:
1579:
Once the boundaries are known, the chunks need to be compared by cryptographic hash value to detect changes. The backup software
1576:
bytes. This approach ensures that unmodified data (more than a window size away from the changes) will have the same boundaries.
2568:
Xia, Wen; Zou, Xiangyu; Jiang, Hong; Zhou, Yukun; Liu, Chuanyi; Feng, Dan; Hua, Yu; Hu, Yuchong; Zhang, Yucheng (2020-06-16).
2441:
573:
107:
as its rolling hash. FastCDC (Fast
Content-Defined Chunking) uses a compute-efficient Gear fingerprint as its rolling hash.
466:
2371:
1608:
1508:
normally cause all fixed size windows to become updated, while in reality, only the first "chunk" has been modified.
2424:
1457:
40:
2478:
1004:{\displaystyle H=s^{k-1}(h(c_{1}))\oplus s^{k-2}(h(c_{2}))\oplus \ldots \oplus s(h(c_{k-1}))\oplus h(c_{k}),}
2630:
2357:-gram hashing is pairwise independent, at best, Computer Speech & Language 24 (4), pages 698–710, 2010.
1580:
1014:
where the multiplications by powers of two can be implemented by binary shifts. The result is a number in
1683:
34:
119:
2477:
Xia, Wen; Zhou, Yukun; Jiang, Hong; Feng, Dan; Hua, Yu; Hu, Yuchong; Liu, Qing; Zhang, Yucheng (2005).
1099:
1181:
306:
111:
1376:
Hashing by cyclic polynomials is strongly universal or pairwise independent: simply keep the first
2589:
2504:
1602:
option) and rsyncrypto, do content-based slicing based on this specific (unweighted) moving sum:
1587:
1518:
683:
383:
80:
function can be computed much more quickly than other low-pass filters; and similar to the way a
776:
1017:
709:
134:
is often explained using a rolling hash function that only uses multiplications and additions:
2547:
2484:
634:
356:
115:
104:
45:
2410:
1930:
1343:
811:
773:): it rotates the bits by 1 to the left, pushing the latest bit in the first position. E.g.,
2581:
2539:
1583:
uses the
Buzhash algorithm with a customizable chunk size range for splitting file streams.
653:
625:
can be multiplied to get the result of the division without actually performing a division.
578:
1796:
1379:
1134:
2570:"The Design of Fast Content-Defined Chunking for Data Deduplication Based Storage Systems"
2516:
2464:
2281:
2252:
1874:
1845:
1745:
85:
81:
1985:
1431:
512:. Shifting all characters by one position to the right requires dividing the entire sum
2312:
2232:
1965:
1910:
1823:
1774:
1552:
1411:
1161:
1079:
1059:
770:
752:
689:
679:
608:
555:
535:
515:
495:
475:
448:
428:
408:
388:
365:
286:
77:
2480:
FastCDC: A Fast and
Efficient Content-Defined Chunking Approach for Data Deduplication
2442:"Data structures and file formats — Borg - Deduplicating Archiver 1.1.5 documentation"
2624:
2593:
69:
2308:
829:
638:
95:, which uses the rolling hash described below. Another popular application is the
2330:
2569:
2543:
637:
is another hash, which also interprets the input as a polynomial, but over the
2585:
746:
2551:
656:
uses a Rabin fingerprint for splitting files, with blob size varying between
2358:
1454:
consecutive bits. In practice, this can be achieved by an integer division
2278:
modular arithmetic operations, and hashing by cyclic polynomials requires
100:
2425:"Rabin Karp rolling hash - dynamic sized chunks based on hashed content"
1330:{\displaystyle H\leftarrow s(H)\oplus s^{k}(h(c_{1}))\oplus h(c_{k+1}),}
2616:
MIT 6.006: Introduction to
Algorithms 2011- Recitation 9 - Rolling Hash
2324:
1056:
Computing the hash values in a rolling fashion is done as follows. Let
272:{\displaystyle H=c_{1}a^{k-1}+c_{2}a^{k-2}+c_{3}a^{k-3}+...+c_{k}a^{0}}
2563:
2561:
103:
as its rolling hash. Low
Bandwidth Network Filesystem (LBFS) uses a
72:
where the input is hashed in a window that moves through the input.
645:
641:
96:
18:
2012:
Gear fingerprint and content-based chunking algorithm FastCDC
445:
is critical to get good hashing; in particular, the modulus
68:(also known as recursive hashing or rolling checksum) is a
2411:"Foundation - Introducing Content Defined Chunking (CDC)"
2057:← 64KB // split maximum chunk size is 64 KB
2053:← 2KB // split minimum chunk size is 2 KB
749:
mapping characters to random integers. Let the function
1871:
is a "hash value" consisting of the bottom 12 bits of
1771:
is the sum of 8196 consecutive bytes ending with byte
1555:
1521:
2574:
IEEE Transactions on
Parallel and Distributed Systems
2284:
2255:
2235:
2024:
The basic version pseudocode is provided as follows:
1988:
1968:
1933:
1913:
1877:
1848:
1826:
1799:
1777:
1748:
1686:
1611:
1460:
1434:
1414:
1382:
1346:
1239:
1184:
1164:
1137:
1102:
1082:
1062:
1020:
841:
814:
779:
755:
712:
692:
611:
581:
558:
538:
518:
498:
478:
451:
431:
411:
391:
368:
355:
are the input characters (but this function is not a
309:
289:
143:
99:
program, which uses a checksum based on Mark Adler's
2394:Jonathan D. Cohen, Recursive Hashing Functions for
2079:// buffer size is less than minimum chunk size
745:. This hash function might be simply an array or a
32:It has been suggested that this article should be
2299:
2270:
2241:
2000:
1974:
1954:
1919:
1892:
1863:
1832:
1812:
1783:
1763:
1728:
1669:
1568:
1541:
1491:
1446:
1420:
1400:
1365:
1329:
1222:
1170:
1150:
1123:
1088:
1068:
1045:
1003:
820:
800:
761:
737:
698:
617:
597:
564:
544:
524:
504:
484:
457:
437:
417:
397:
374:
347:
295:
271:
1586:Such content-defined chunking is often used for
1670:{\displaystyle S(n)=\sum _{i=n-8195}^{n}c_{i},}
686:: it presumes that there is some hash function
2349:
2347:
2345:
8:
2436:
2434:
2406:
2404:
2123:bytes, and kickstart the hash
706:from characters to integers in the interval
2398:-Grams, ACM Trans. Inf. Syst. 15 (3), 1997.
1598:Several programs, including gzip (with the
1503:Content-based slicing using a rolling hash
1492:{\displaystyle H\rightarrow H\div 2^{k-1}}
1158:is the character to be removed, rotate it
2372:"References — restic 0.9.0 documentation"
2283:
2254:
2234:
1987:
1967:
1932:
1912:
1876:
1847:
1825:
1804:
1798:
1776:
1747:
1719:
1718:
1685:
1658:
1648:
1631:
1610:
1560:
1554:
1531:
1522:
1520:
1477:
1459:
1433:
1413:
1381:
1351:
1345:
1309:
1284:
1265:
1238:
1208:
1189:
1183:
1163:
1142:
1136:
1101:
1081:
1061:
1034:
1019:
989:
958:
921:
896:
877:
852:
840:
813:
778:
754:
726:
711:
691:
610:
586:
580:
557:
537:
517:
497:
477:
450:
430:
410:
390:
367:
339:
314:
308:
288:
263:
253:
222:
212:
193:
183:
164:
154:
142:
2341:
2512:
2502:
1962:, these programs cut the file between
1594:Content-based slicing using moving sum
2353:Daniel Lemire, Owen Kaser: Recursive
7:
362:In order to avoid manipulating huge
91:One of the main applications is the
1729:{\displaystyle H(n)=S(n)\mod 4096,}
1076:be the previous hash value. Rotate
2423:Horvath, Adam (October 24, 2012).
552:. Note that in modulo arithmetic,
465:is typically a prime number. See
132:Rabin–Karp string search algorithm
93:Rabin–Karp string search algorithm
14:
832:. The hash values are defined as
110:At best, rolling hash values are
1124:{\displaystyle H\leftarrow s(H)}
769:be a cyclic binary rotation (or
23:
1714:
1408:bits. That is, take the result
1223:{\displaystyle s^{k}(h(c_{1}))}
348:{\displaystyle c_{1},...,c_{k}}
2294:
2288:
2265:
2259:
1943:
1937:
1887:
1881:
1858:
1852:
1791:(requires 21 bits of storage),
1758:
1752:
1711:
1705:
1696:
1690:
1621:
1615:
1464:
1321:
1302:
1293:
1290:
1277:
1271:
1255:
1249:
1243:
1217:
1214:
1201:
1195:
1118:
1112:
1106:
1040:
1021:
995:
982:
973:
970:
951:
945:
930:
927:
914:
908:
886:
883:
870:
864:
789:
783:
732:
713:
1:
2327: – Data mining technique
467:linear congruential generator
1542:{\textstyle {1 \over 2^{n}}}
1515:bits (with a probability of
2647:
2544:10.1016/j.peva.2014.07.016
2038:, data length
801:{\displaystyle s(101)=011}
38:into a new article titled
2586:10.1109/TPDS.2020.2984632
2446:borgbackup.readthedocs.io
1046:{\displaystyle [0,2^{L})}
738:{\displaystyle [0,2^{L})}
682:instead. It is a form of
382:values, all math is done
88:from the old hash value.
2224:Computational complexity
572:can be chosen to have a
41:Content-Defined Chunking
1955:{\displaystyle H(n)==0}
1366:{\displaystyle c_{k+1}}
821:{\displaystyle \oplus }
126:Polynomial rolling hash
2532:Performance Evaluation
2465:"Rsyncrypto Algorithm"
2301:
2272:
2243:
2210:+ 1
2002:
1976:
1956:
1921:
1894:
1865:
1834:
1814:
1785:
1765:
1730:
1671:
1653:
1570:
1543:
1493:
1448:
1422:
1402:
1373:is the new character.
1367:
1331:
1224:
1172:
1152:
1125:
1090:
1070:
1047:
1005:
822:
802:
763:
739:
700:
619:
599:
598:{\displaystyle a^{-1}}
574:multiplicative inverse
566:
546:
526:
506:
486:
459:
439:
419:
399:
376:
349:
297:
273:
2376:restic.readthedocs.io
2302:
2273:
2244:
2003:
1977:
1957:
1922:
1895:
1866:
1835:
1815:
1813:{\displaystyle c_{i}}
1786:
1766:
1731:
1672:
1627:
1571:
1544:
1494:
1449:
1423:
1403:
1401:{\displaystyle L-k+1}
1368:
1332:
1225:
1173:
1153:
1151:{\displaystyle c_{1}}
1126:
1091:
1071:
1048:
1006:
823:
803:
764:
740:
701:
620:
600:
567:
547:
527:
507:
487:
469:for more discussion.
460:
440:
420:
400:
377:
350:
298:
274:
16:Type of hash function
2300:{\displaystyle O(k)}
2282:
2271:{\displaystyle O(k)}
2253:
2233:
2063:0x0000d93003530000LL
1986:
1966:
1931:
1911:
1893:{\displaystyle S(n)}
1875:
1864:{\displaystyle H(n)}
1846:
1824:
1797:
1775:
1764:{\displaystyle S(n)}
1746:
1684:
1609:
1553:
1519:
1458:
1432:
1412:
1380:
1344:
1237:
1182:
1162:
1135:
1100:
1080:
1060:
1018:
839:
812:
777:
753:
710:
690:
609:
579:
556:
536:
516:
496:
476:
449:
429:
409:
389:
366:
307:
287:
141:
112:pairwise independent
2001:{\displaystyle n+1}
1447:{\displaystyle k-1}
303:is a constant, and
2297:
2268:
2239:
2156:+ 1
2119:// Skip the first
1998:
1972:
1952:
1917:
1890:
1861:
1830:
1810:
1781:
1761:
1726:
1667:
1588:data deduplication
1569:{\textstyle 2^{n}}
1566:
1539:
1489:
1444:
1418:
1398:
1363:
1327:
1230:. Then simply set
1220:
1168:
1148:
1121:
1086:
1066:
1043:
1001:
818:
798:
759:
735:
696:
684:tabulation hashing
615:
595:
562:
542:
522:
502:
482:
455:
435:
415:
395:
372:
345:
293:
269:
120:3-wise independent
2242:{\displaystyle k}
1975:{\displaystyle n}
1920:{\displaystyle n}
1833:{\displaystyle i}
1784:{\displaystyle n}
1537:
1421:{\displaystyle H}
1171:{\displaystyle k}
1089:{\displaystyle H}
1069:{\displaystyle H}
762:{\displaystyle s}
699:{\displaystyle h}
674:Cyclic polynomial
635:Rabin fingerprint
629:Rabin fingerprint
618:{\displaystyle H}
565:{\displaystyle a}
545:{\displaystyle a}
525:{\displaystyle H}
505:{\displaystyle a}
485:{\displaystyle H}
458:{\displaystyle n}
438:{\displaystyle n}
418:{\displaystyle a}
398:{\displaystyle n}
375:{\displaystyle H}
357:Rabin fingerprint
296:{\displaystyle a}
118:. They cannot be
105:Rabin fingerprint
62:
61:
2638:
2604:
2603:
2601:
2600:
2580:(9): 2017–2031.
2565:
2556:
2555:
2527:
2521:
2520:
2514:
2510:
2508:
2500:
2498:
2497:
2474:
2468:
2462:
2456:
2455:
2453:
2452:
2438:
2429:
2428:
2420:
2414:
2408:
2399:
2392:
2386:
2385:
2383:
2382:
2368:
2362:
2351:
2306:
2304:
2303:
2298:
2277:
2275:
2274:
2269:
2248:
2246:
2245:
2240:
2007:
2005:
2004:
1999:
1981:
1979:
1978:
1973:
1961:
1959:
1958:
1953:
1926:
1924:
1923:
1918:
1899:
1897:
1896:
1891:
1870:
1868:
1867:
1862:
1839:
1837:
1836:
1831:
1819:
1817:
1816:
1811:
1809:
1808:
1790:
1788:
1787:
1782:
1770:
1768:
1767:
1762:
1735:
1733:
1732:
1727:
1676:
1674:
1673:
1668:
1663:
1662:
1652:
1647:
1601:
1575:
1573:
1572:
1567:
1565:
1564:
1548:
1546:
1545:
1540:
1538:
1536:
1535:
1523:
1498:
1496:
1495:
1490:
1488:
1487:
1453:
1451:
1450:
1445:
1428:and dismiss any
1427:
1425:
1424:
1419:
1407:
1405:
1404:
1399:
1372:
1370:
1369:
1364:
1362:
1361:
1336:
1334:
1333:
1328:
1320:
1319:
1289:
1288:
1270:
1269:
1229:
1227:
1226:
1221:
1213:
1212:
1194:
1193:
1177:
1175:
1174:
1169:
1157:
1155:
1154:
1149:
1147:
1146:
1130:
1128:
1127:
1122:
1095:
1093:
1092:
1087:
1075:
1073:
1072:
1067:
1052:
1050:
1049:
1044:
1039:
1038:
1010:
1008:
1007:
1002:
994:
993:
969:
968:
926:
925:
907:
906:
882:
881:
863:
862:
827:
825:
824:
819:
807:
805:
804:
799:
768:
766:
765:
760:
744:
742:
741:
736:
731:
730:
705:
703:
702:
697:
669:
668:
662:
661:
624:
622:
621:
616:
604:
602:
601:
596:
594:
593:
571:
569:
568:
563:
551:
549:
548:
543:
531:
529:
528:
523:
511:
509:
508:
503:
491:
489:
488:
483:
464:
462:
461:
456:
444:
442:
441:
436:
424:
422:
421:
416:
405:. The choice of
404:
402:
401:
396:
381:
379:
378:
373:
354:
352:
351:
346:
344:
343:
319:
318:
302:
300:
299:
294:
278:
276:
275:
270:
268:
267:
258:
257:
233:
232:
217:
216:
204:
203:
188:
187:
175:
174:
159:
158:
57:
54:
27:
26:
19:
2646:
2645:
2641:
2640:
2639:
2637:
2636:
2635:
2621:
2620:
2612:
2607:
2598:
2596:
2567:
2566:
2559:
2529:
2528:
2524:
2511:
2501:
2495:
2493:
2491:
2476:
2475:
2471:
2463:
2459:
2450:
2448:
2440:
2439:
2432:
2422:
2421:
2417:
2409:
2402:
2393:
2389:
2380:
2378:
2370:
2369:
2365:
2359:arXiv:0705.4676
2352:
2343:
2339:
2321:
2313:circular shifts
2280:
2279:
2251:
2250:
2231:
2230:
2226:
2217:
2177:<< 1 ) +
2144:<< 1 ) +
2014:
1984:
1983:
1964:
1963:
1929:
1928:
1909:
1908:
1873:
1872:
1844:
1843:
1822:
1821:
1800:
1795:
1794:
1773:
1772:
1744:
1743:
1682:
1681:
1654:
1607:
1606:
1599:
1596:
1556:
1551:
1550:
1527:
1517:
1516:
1505:
1473:
1456:
1455:
1430:
1429:
1410:
1409:
1378:
1377:
1347:
1342:
1341:
1305:
1280:
1261:
1235:
1234:
1204:
1185:
1180:
1179:
1160:
1159:
1138:
1133:
1132:
1098:
1097:
1078:
1077:
1058:
1057:
1030:
1016:
1015:
985:
954:
917:
892:
873:
848:
837:
836:
828:be the bitwise
810:
809:
775:
774:
751:
750:
722:
708:
707:
688:
687:
676:
666:
664:
659:
657:
631:
607:
606:
582:
577:
576:
554:
553:
534:
533:
514:
513:
494:
493:
474:
473:
447:
446:
427:
426:
407:
406:
387:
386:
364:
363:
335:
310:
305:
304:
285:
284:
259:
249:
218:
208:
189:
179:
160:
150:
139:
138:
128:
122:, for example.
86:rapidly updated
58:
52:
49:
28:
24:
17:
12:
11:
5:
2644:
2642:
2634:
2633:
2631:Hash functions
2623:
2622:
2619:
2618:
2611:
2610:External links
2608:
2606:
2605:
2557:
2522:
2513:|website=
2489:
2469:
2457:
2430:
2415:
2400:
2387:
2363:
2340:
2338:
2335:
2334:
2333:
2328:
2320:
2317:
2296:
2293:
2290:
2287:
2267:
2264:
2261:
2258:
2238:
2225:
2222:
2026:
2013:
2010:
1997:
1994:
1991:
1971:
1951:
1948:
1945:
1942:
1939:
1936:
1916:
1902:
1901:
1889:
1886:
1883:
1880:
1860:
1857:
1854:
1851:
1841:
1829:
1807:
1803:
1792:
1780:
1760:
1757:
1754:
1751:
1737:
1736:
1725:
1722:
1717:
1713:
1710:
1707:
1704:
1701:
1698:
1695:
1692:
1689:
1678:
1677:
1666:
1661:
1657:
1651:
1646:
1643:
1640:
1637:
1634:
1630:
1626:
1623:
1620:
1617:
1614:
1595:
1592:
1563:
1559:
1534:
1530:
1526:
1504:
1501:
1486:
1483:
1480:
1476:
1472:
1469:
1466:
1463:
1443:
1440:
1437:
1417:
1397:
1394:
1391:
1388:
1385:
1360:
1357:
1354:
1350:
1338:
1337:
1326:
1323:
1318:
1315:
1312:
1308:
1304:
1301:
1298:
1295:
1292:
1287:
1283:
1279:
1276:
1273:
1268:
1264:
1260:
1257:
1254:
1251:
1248:
1245:
1242:
1219:
1216:
1211:
1207:
1203:
1200:
1197:
1192:
1188:
1167:
1145:
1141:
1120:
1117:
1114:
1111:
1108:
1105:
1085:
1065:
1042:
1037:
1033:
1029:
1026:
1023:
1012:
1011:
1000:
997:
992:
988:
984:
981:
978:
975:
972:
967:
964:
961:
957:
953:
950:
947:
944:
941:
938:
935:
932:
929:
924:
920:
916:
913:
910:
905:
902:
899:
895:
891:
888:
885:
880:
876:
872:
869:
866:
861:
858:
855:
851:
847:
844:
817:
797:
794:
791:
788:
785:
782:
771:circular shift
758:
734:
729:
725:
721:
718:
715:
695:
675:
672:
630:
627:
614:
592:
589:
585:
561:
541:
521:
501:
481:
454:
434:
414:
394:
371:
359:, see below).
342:
338:
334:
331:
328:
325:
322:
317:
313:
292:
281:
280:
266:
262:
256:
252:
248:
245:
242:
239:
236:
231:
228:
225:
221:
215:
211:
207:
202:
199:
196:
192:
186:
182:
178:
173:
170:
167:
163:
157:
153:
149:
146:
127:
124:
78:moving average
60:
59:
31:
29:
22:
15:
13:
10:
9:
6:
4:
3:
2:
2643:
2632:
2629:
2628:
2626:
2617:
2614:
2613:
2609:
2595:
2591:
2587:
2583:
2579:
2575:
2571:
2564:
2562:
2558:
2553:
2549:
2545:
2541:
2537:
2533:
2526:
2523:
2518:
2506:
2492:
2490:9781931971300
2486:
2482:
2481:
2473:
2470:
2466:
2461:
2458:
2447:
2443:
2437:
2435:
2431:
2426:
2419:
2416:
2412:
2407:
2405:
2401:
2397:
2391:
2388:
2377:
2373:
2367:
2364:
2360:
2356:
2350:
2348:
2346:
2342:
2336:
2332:
2329:
2326:
2323:
2322:
2318:
2316:
2314:
2310:
2309:exclusive ors
2291:
2285:
2262:
2256:
2236:
2223:
2221:
2216:
2213:
2209:
2205:
2202:
2199:
2196:
2192:
2188:
2184:
2180:
2176:
2172:
2169:
2166:
2162:
2159:
2155:
2151:
2147:
2143:
2139:
2136:
2133:
2129:
2126:
2122:
2118:
2114:
2111:
2108:
2104:
2101:
2098:
2095:
2092:
2089:
2085:
2082:
2078:
2074:
2071:
2067:
2064:
2060:
2056:
2052:
2049:
2045:
2041:
2037:
2033:
2029:
2025:
2022:
2018:
2011:
2009:
1995:
1992:
1989:
1969:
1949:
1946:
1940:
1934:
1914:
1905:
1884:
1878:
1855:
1849:
1842:
1827:
1805:
1801:
1793:
1778:
1755:
1749:
1742:
1741:
1740:
1723:
1720:
1715:
1708:
1702:
1699:
1693:
1687:
1680:
1679:
1664:
1659:
1655:
1649:
1644:
1641:
1638:
1635:
1632:
1628:
1624:
1618:
1612:
1605:
1604:
1603:
1593:
1591:
1589:
1584:
1582:
1577:
1561:
1557:
1532:
1528:
1524:
1514:
1509:
1502:
1500:
1484:
1481:
1478:
1474:
1470:
1467:
1461:
1441:
1438:
1435:
1415:
1395:
1392:
1389:
1386:
1383:
1374:
1358:
1355:
1352:
1348:
1324:
1316:
1313:
1310:
1306:
1299:
1296:
1285:
1281:
1274:
1266:
1262:
1258:
1252:
1246:
1240:
1233:
1232:
1231:
1209:
1205:
1198:
1190:
1186:
1165:
1143:
1139:
1115:
1109:
1103:
1083:
1063:
1054:
1035:
1031:
1027:
1024:
998:
990:
986:
979:
976:
965:
962:
959:
955:
948:
942:
939:
936:
933:
922:
918:
911:
903:
900:
897:
893:
889:
878:
874:
867:
859:
856:
853:
849:
845:
842:
835:
834:
833:
831:
815:
795:
792:
786:
780:
772:
756:
748:
727:
723:
719:
716:
693:
685:
681:
680:barrel shifts
673:
671:
655:
649:
647:
643:
640:
636:
628:
626:
612:
590:
587:
583:
575:
559:
539:
519:
499:
479:
470:
468:
452:
432:
412:
392:
385:
369:
360:
358:
340:
336:
332:
329:
326:
323:
320:
315:
311:
290:
264:
260:
254:
250:
246:
243:
240:
237:
234:
229:
226:
223:
219:
213:
209:
205:
200:
197:
194:
190:
184:
180:
176:
171:
168:
165:
161:
155:
151:
147:
144:
137:
136:
135:
133:
125:
123:
121:
117:
113:
108:
106:
102:
98:
94:
89:
87:
83:
79:
73:
71:
70:hash function
67:
56:
47:
43:
42:
37:
36:
30:
21:
20:
2597:. Retrieved
2577:
2573:
2535:
2531:
2525:
2494:. Retrieved
2479:
2472:
2460:
2449:. Retrieved
2445:
2418:
2395:
2390:
2379:. Retrieved
2375:
2366:
2354:
2227:
2218:
2214:
2211:
2207:
2203:
2200:
2197:
2194:
2190:
2186:
2182:
2178:
2174:
2170:
2167:
2164:
2160:
2157:
2153:
2149:
2145:
2141:
2137:
2134:
2131:
2127:
2124:
2120:
2116:
2112:
2109:
2106:
2102:
2099:
2096:
2093:
2090:
2087:
2083:
2080:
2076:
2072:
2069:
2065:
2062:
2058:
2054:
2050:
2047:
2043:
2039:
2035:
2034:data buffer
2031:
2030:FastCDC
2027:
2023:
2019:
2015:
1906:
1903:
1840:of the file,
1738:
1597:
1585:
1578:
1512:
1510:
1506:
1375:
1339:
1055:
1013:
830:exclusive or
677:
650:
639:Galois field
632:
471:
361:
282:
129:
114:or strongly
109:
90:
82:Zobrist hash
74:
66:rolling hash
65:
63:
50:
39:
33:
2538:: 258–272.
2331:w-shingling
1600:--rsyncable
53:August 2020
2599:2020-07-22
2496:2020-07-24
2451:2018-05-24
2381:2018-05-24
2181:]
2148:]
2046:cut point
1907:For every
747:hash table
2594:215817722
2552:0166-5316
2515:ignored (
2505:cite book
2337:Footnotes
2249:requires
2173:← (
2140:← (
2028:algorithm
1642:−
1629:∑
1482:−
1471:÷
1465:→
1439:−
1387:−
1297:⊕
1259:⊕
1244:←
1107:←
977:⊕
963:−
940:⊕
937:…
934:⊕
901:−
890:⊕
857:−
816:⊕
605:by which
588:−
227:−
198:−
169:−
116:universal
2625:Category
2319:See also
2307:bitwise
2206:←
2185: !(
2152:←
2115:←
2075:←
2068:←
2061:←
1820:is byte
1178:times:
101:adler-32
2413:. 2015.
2325:MinHash
2132:MinSize
2121:MinSize
2117:MaxSize
2107:MaxSize
2088:MinSize
2055:MaxSize
2051:MinSize
2044:output:
2042:,
84:can be
46:discuss
2592:
2550:
2487:
2212:return
2198:return
2189:&
2094:return
2032:input:
1927:where
1739:where
1340:where
1131:. If
1096:once:
808:. Let
654:restic
384:modulo
283:where
2590:S2CID
2163:<
2158:while
2130:<
2125:while
646:CRC32
642:GF(2)
97:rsync
35:split
2548:ISSN
2517:help
2485:ISBN
2311:and
2195:then
2191:Mask
2179:Gear
2146:Gear
2110:then
2091:then
2059:Mask
1982:and
1721:4096
1645:8195
1581:Borg
663:and
633:The
425:and
130:The
2582:doi
2540:doi
2036:src
1716:mod
796:011
787:101
667:MiB
660:KiB
658:512
532:by
492:by
48:)
44:. (
2627::
2588:.
2578:31
2576:.
2572:.
2560:^
2546:.
2536:79
2534:.
2509::
2507:}}
2503:{{
2483:.
2444:.
2433:^
2403:^
2374:.
2344:^
2315:.
2193:)
2187:fp
2183:if
2175:fp
2171:fp
2168:do
2142:fp
2138:fp
2135:do
2105:≥
2100:if
2086:≤
2081:if
2066:fp
1947:==
1590:.
1499:.
1053:.
670:.
64:A
2602:.
2584::
2554:.
2542::
2519:)
2499:.
2467:.
2454:.
2427:.
2396:n
2384:.
2361:.
2355:n
2295:)
2292:k
2289:(
2286:O
2266:)
2263:k
2260:(
2257:O
2237:k
2215:i
2208:i
2204:i
2201:i
2165:n
2161:i
2154:i
2150:i
2128:i
2113:n
2103:n
2097:n
2084:n
2077:0
2073:i
2070:0
2048:i
2040:n
1996:1
1993:+
1990:n
1970:n
1950:0
1944:)
1941:n
1938:(
1935:H
1915:n
1900:.
1888:)
1885:n
1882:(
1879:S
1859:)
1856:n
1853:(
1850:H
1828:i
1806:i
1802:c
1779:n
1759:)
1756:n
1753:(
1750:S
1724:,
1712:)
1709:n
1706:(
1703:S
1700:=
1697:)
1694:n
1691:(
1688:H
1665:,
1660:i
1656:c
1650:n
1639:n
1636:=
1633:i
1625:=
1622:)
1619:n
1616:(
1613:S
1562:n
1558:2
1533:n
1529:2
1525:1
1513:N
1485:1
1479:k
1475:2
1468:H
1462:H
1442:1
1436:k
1416:H
1396:1
1393:+
1390:k
1384:L
1359:1
1356:+
1353:k
1349:c
1325:,
1322:)
1317:1
1314:+
1311:k
1307:c
1303:(
1300:h
1294:)
1291:)
1286:1
1282:c
1278:(
1275:h
1272:(
1267:k
1263:s
1256:)
1253:H
1250:(
1247:s
1241:H
1218:)
1215:)
1210:1
1206:c
1202:(
1199:h
1196:(
1191:k
1187:s
1166:k
1144:1
1140:c
1119:)
1116:H
1113:(
1110:s
1104:H
1084:H
1064:H
1041:)
1036:L
1032:2
1028:,
1025:0
1022:[
999:,
996:)
991:k
987:c
983:(
980:h
974:)
971:)
966:1
960:k
956:c
952:(
949:h
946:(
943:s
931:)
928:)
923:2
919:c
915:(
912:h
909:(
904:2
898:k
894:s
887:)
884:)
879:1
875:c
871:(
868:h
865:(
860:1
854:k
850:s
846:=
843:H
793:=
790:)
784:(
781:s
757:s
733:)
728:L
724:2
720:,
717:0
714:[
694:h
665:8
613:H
591:1
584:a
560:a
540:a
520:H
500:a
480:H
453:n
433:n
413:a
393:n
370:H
341:k
337:c
333:,
330:.
327:.
324:.
321:,
316:1
312:c
291:a
279:,
265:0
261:a
255:k
251:c
247:+
244:.
241:.
238:.
235:+
230:3
224:k
220:a
214:3
210:c
206:+
201:2
195:k
191:a
185:2
181:c
177:+
172:1
166:k
162:a
156:1
152:c
148:=
145:H
55:)
51:(
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.