Knowledge (XXG)

Approximate string matching

Source đź“ť

31: 868:
and off-line. With on-line algorithms the pattern can be processed before searching but the text cannot. In other words, on-line techniques do searching without an index. Early algorithms for on-line approximate matching were suggested by Wagner and Fischer and by Sellers. Both algorithms are based
249:
Other matchers specify the number of operations of each type separately, while still others set a total cost but allow different weights to be assigned to different operations. Some matchers permit separate assignments of limits and weights to individual groups in the pattern.
209:
Different approximate matchers impose different constraints. Some matchers use a single global unweighted cost, that is, the total number of primitive operations necessary to convert the match to the pattern. For example, if the pattern is
460:
A brute-force approach would be to compute the edit distance to P for all substrings of T, and then choose the substring with the minimum distance. However, this algorithm would have the running time
443: 374: 318: 844:) time with the dynamic programming algorithm cannot work within a limited time. So, the idea is to reduce the number of candidate pairs, instead of computing the similarity of 564: 125:
These three operations may be generalized as forms of substitution by adding a NULL character (here symbolized by *) wherever a character has been deleted or inserted:
918:
A computational survey of dictionary methods (i.e., methods that permit finding all dictionary words that approximately match a search pattern) is given by Boytsov.
526: 70:
The closeness of a match is measured in terms of the number of primitive operations necessary to convert the string into an exact match. This number is called the
1874: 884:(also known as the shift-or and shift-and algorithm), which is very efficient for relatively short pattern strings. The Bitap algorithm is the heart of the 1769: 873:
but solve different problems. Sellers' algorithm searches approximately for a substring in a text while the algorithm of Wagner and Fischer calculates
1808: 1734: 1739: 58:
approximately (rather than exactly). The problem of approximate string matching is typically divided into two sub-problems: finding approximate
1744: 1503: 1382: 1279: 1260: 856:
and other greedy and approximation algorithms. Most of them are designed to fit some framework (such as Map-Reduce) to compute concurrently.
692:), the only difference being that we must initialize the first row with zeros, and save the path of computation, that is, whether we used 1986: 1729: 1955: 1475:
Baeza-Yates, R.; Navarro, G. (June 1996). "A faster algorithm for approximate string matching". In Dan Hirchsberg; Gene Myers (eds.).
1823: 1764: 1636: 1671: 1588: 914:
methods. A detailed survey of indexing techniques that allows one to find an arbitrary substring in a text is given by Navarro
1851: 991: 945:
String matching cannot be used for most binary data, such as images and music. They require different algorithms, such as
1856: 1711: 1007: 1996: 1935: 1785: 898:
Although very fast on-line techniques exist, their performance on large data is unacceptable. Text preprocessing or
1940: 1818: 1721: 1598: 981: 849: 1691: 971: 1991: 1950: 1846: 1790: 1645: 902:
makes searching dramatically faster. Today, a variety of indexing algorithms have been presented. Among them are
1749: 379: 1945: 1681: 323: 1894: 258:
One possible definition of the approximate string matching problem is the following: Given a pattern string
1480: 1436: 1298: 769:), and follow the path of computation backwards, back to the row number 0. If the field we arrived at was 261: 834:
Another recent idea is the similarity join. When matching database relates to a large scale of data, the
1899: 1841: 1701: 1629: 899: 880:
On-line searching techniques have been repeatedly improved. Perhaps the most famous improvement is the
1348:
Sellers, Peter H. (1980). "The Theory and Computation of Evolutionary Distances: Pattern Recognition".
1219:
Boytsov, Leonid (2011). "Indexing methods for approximate dictionary searching: Comparative analysis".
1706: 1173: 996: 976: 946: 874: 677: 230:
by two substitutions. If all operations count as a single unit of cost and the limit is set to one,
30: 1904: 1485: 1441: 1303: 1001: 870: 478: 62:
matches inside a given string and finding dictionary strings that match the pattern approximately.
1833: 1800: 1537: 1514: 1454: 1415: 1316: 1236: 51: 956:
is often used to integrate approximate string searching into various command-line applications.
188:, in which the positions of two letters in the string are swapped, to be a primitive operation. 1965: 1499: 1378: 1275: 1256: 676:) is very similar to computing the edit distance between two strings. In fact, we can use the 531: 1960: 1930: 1884: 1686: 1622: 1611:
natural language processing library which includes implementations of popular string metrics
1562: 1529: 1446: 1405: 1357: 1308: 1228: 889: 865: 39: 504: 1813: 1754: 1666: 1515:"A fast bit-vector algorithm for approximate string matching based on dynamic programming" 1022: 881: 864:
Traditionally, approximate string matching algorithms are classified into two categories:
1591:
with recent advances in approximate string matching based on an edit distance threshold.
34:
A fuzzy Mediawiki search for "angry emoticon" has as a suggested result "andré emotions"
1866: 1759: 1272:
Algorithms on strings, trees, and sequences: computer science and computational biology
966: 939: 935: 927: 835: 818: 817:) time with the dynamic programming algorithm, while the backwards-working phase takes 808: 641:, we can easily find a solution to the original problem: it is the substring for which 461: 1567: 1550: 1204: 17: 1980: 1676: 1653: 1427:
Zobel, Justin; Dart, Philip (1995). "Finding approximate matches in large lexicons".
1361: 1320: 1248: 1017: 71: 1419: 1240: 934:
sequences has become an important application. Approximate matching is also used in
848:
pairs of strings. Widely used algorithms are based on filter-verification, hashing,
1879: 1696: 1541: 1458: 1329: 1889: 942:
is a common application where records from two disparate databases are matched.
907: 903: 1608: 1328:
Navarro, Gonzalo; Baeza-Yates, Ricardo; Sutinen, Erkki; Tarhio, Jorma (2001).
931: 1132: 1130: 1232: 986: 59: 1450: 1533: 1410: 1393: 1312: 1604: 1289:
Navarro, Gonzalo (2001). "A guided tour to approximate string matching".
74:
between the string and the pattern. The usual primitive operations are:
1925: 1594: 1012: 601:, and determine which one of them has the minimal edit distance to the 481:. It uses an alternative formulation of the problem: for each position 55: 751:) values, we then choose the minimal value in the last row, let it be 911: 895:. A review of on-line searching algorithms was done by G. Navarro. 930:. With the availability of large amounts of DNA data, matching of 892: 788:
is a substring of T with the minimal edit distance to the pattern
1909: 885: 853: 1618: 1136: 1583: 1049: 1047: 1045: 477:
A better solution, which was proposed by Sellers, relies on
1614: 1105: 1103: 1101: 877:, being appropriate for dictionary fuzzy search only. 1174:"Fzf - A Quick Fuzzy File Search from Linux Terminal" 1064: 1062: 534: 507: 382: 326: 264: 926:
Common applications of approximate matching include
1918: 1865: 1832: 1799: 1778: 1720: 1652: 1477:Combinatorial Pattern Matching (CPM'96), LNCS 1075 1330:"Indexing Methods for Approximate String Matching" 1205:"Fast Approximate String Matching in a Dictionary" 1121: 558: 520: 437: 368: 312: 27:Finding strings that approximately match a pattern 1601:library of string metrics and phonetic algorithms 497:, compute the minimum edit distance between the 453:, has the smallest edit distance to the pattern 1053: 1630: 1589:Efficient Similarity Query Processing Project 1274:. Cambridge, UK: Cambridge University Press. 1080: 8: 1551:"Algorithms for approximate string matching" 1255:(2nd ed.). MIT Press. pp. 364–7. 1637: 1623: 1615: 438:{\displaystyle T_{j',j}=t_{j'}\dots t_{j}} 1566: 1484: 1440: 1409: 1394:"The string-to-string correction problem" 1302: 1148: 539: 533: 512: 506: 429: 411: 387: 381: 360: 347: 337: 325: 304: 285: 275: 263: 1809:Comparison of regular-expression engines 1494:Galil, Zvi; Apostolico, Alberto (1997). 1109: 678:Levenshtein distance computing algorithm 29: 1160: 1092: 1068: 1041: 369:{\displaystyle T=t_{1}t_{2}\dots t_{n}} 1770:Zhu–Takaoka string matching algorithm 1203:Baeza-Yates, R.; Navarro, G. (1998). 184:Some approximate matchers also treat 7: 1498:. Oxford : Oxford University Press. 1221:Journal of Experimental Algorithmics 313:{\displaystyle P=p_{1}p_{2}...p_{m}} 1735:Boyer–Moore string-search algorithm 724: − 1) in computing 46:(often colloquially referred to as 254:Problem formulation and algorithms 25: 1824:Nondeterministic finite automaton 1765:Two-way string-matching algorithm 1429:Software: Practice and Experience 609:. Write this minimal distance as 501:first characters of the pattern, 1392:Wagner, R.; Fischer, M. (1974). 1214:. IEEE CS Press. pp. 14–22. 1004:for fuzzy and non-fuzzy matching 657:being the length of the pattern 605:first characters of the pattern 593:, go through all substrings of 50:) is the technique of finding 1740:Boyer–Moore–Horspool algorithm 1730:Apostolico–Giancarlo algorithm 1337:IEEE Data Engineering Bulletin 1122:Baeza-Yates & Navarro 1998 1025:for Semantic Similarity Search 449:, which, of all substrings of 1: 1568:10.1016/S0019-9958(85)80046-2 1479:. Irvine, CA. pp. 1–23. 218:differs by one substitution, 1745:Knuth–Morris–Pratt algorithm 1672:Damerau–Levenshtein distance 1362:10.1016/0196-6774(80)90016-4 1251:; Leiserson, Rivest (2001). 739:In the array containing the 242:will count as matches while 1936:Compressed pattern matching 1662:Approximate string matching 1496:Pattern matching algorithms 1054:Cormen & Leiserson 2001 952:A common command-line tool 44:approximate string matching 2013: 1987:String matching algorithms 1941:Longest common subsequence 1852:Needleman–Wunsch algorithm 1722:String-searching algorithm 1377:(1st ed.). Springer. 1253:Introduction to Algorithms 992:Needleman–Wunsch algorithm 982:Locality-sensitive hashing 850:Locality-sensitive hashing 1951:Sequential pattern mining 1791:Commentz-Walter algorithm 1779:Multiple string searching 1712:Wagner–Fischer algorithm 1081:Wagner & Fischer 1974 712: − 1) or 1961:String rewriting systems 1946:Longest common substring 1857:Smith–Waterman algorithm 1682:Gestalt pattern matching 1008:Smith–Waterman algorithm 559:{\displaystyle T_{j',j}} 1895:Generalized suffix tree 1819:Thompson's construction 1555:Information and Control 1375:Algorithm Design Manual 1233:10.1145/1963190.1963191 947:acoustic fingerprinting 860:On-line versus off-line 1847:Hirschberg's algorithm 1513:Myers, G. (May 1999). 1451:10.1002/spe.4380250307 1373:Skiena, Steve (1998). 1270:Gusfield, Dan (1997). 570:that ends at position 560: 522: 439: 370: 314: 48:fuzzy string searching 35: 18:Fuzzy string searching 1702:Levenshtein automaton 1692:Jaro–Winkler distance 1534:10.1145/316542.316550 1411:10.1145/321796.321811 1350:Journal of Algorithms 1313:10.1145/375360.375365 1291:ACM Computing Surveys 1149:Zobel & Dart 1995 972:Jaro–Winkler distance 720: − 1, 700: − 1, 561: 523: 521:{\displaystyle P_{i}} 440: 371: 315: 226:by one deletion, and 33: 1750:Rabin–Karp algorithm 1707:Levenshtein distance 1595:StringMetric project 1549:Ukkonen, E. (1985). 997:Plagiarism detection 977:Levenshtein distance 875:Levenshtein distance 585:, and each position 532: 528:, and any substring 505: 380: 324: 262: 1997:Dynamic programming 1905:Ternary search tree 1137:Navarro et al. 2001 1002:Regular expressions 871:dynamic programming 621:). After computing 597:ending at position 479:dynamic programming 376:, find a substring 1834:Sequence alignment 1801:Regular expression 1522:Journal of the ACM 1398:Journal of the ACM 577:For each position 556: 518: 489:and each position 435: 366: 320:and a text string 310: 222:by one insertion, 36: 1974: 1973: 1966:String operations 1505:978-0-19-511367-9 1384:978-0-387-94860-7 1281:978-0-521-58519-4 1262:978-0-262-03293-3 16:(Redirected from 2004: 1992:Pattern matching 1931:Pattern matching 1885:Suffix automaton 1687:Hamming distance 1639: 1632: 1625: 1616: 1584:Flamingo Project 1572: 1570: 1545: 1519: 1509: 1490: 1488: 1462: 1444: 1423: 1413: 1388: 1371: 1365: 1344: 1334: 1324: 1306: 1285: 1266: 1244: 1215: 1209: 1189: 1188: 1186: 1185: 1170: 1164: 1158: 1152: 1146: 1140: 1134: 1125: 1119: 1113: 1107: 1096: 1090: 1084: 1078: 1072: 1066: 1057: 1051: 565: 563: 562: 557: 555: 554: 547: 527: 525: 524: 519: 517: 516: 444: 442: 441: 436: 434: 433: 421: 420: 419: 403: 402: 395: 375: 373: 372: 367: 365: 364: 352: 351: 342: 341: 319: 317: 316: 311: 309: 308: 290: 289: 280: 279: 192:transposition: 40:computer science 21: 2012: 2011: 2007: 2006: 2005: 2003: 2002: 2001: 1977: 1976: 1975: 1970: 1914: 1861: 1828: 1814:Regular grammar 1795: 1774: 1755:Raita algorithm 1716: 1667:Bitap algorithm 1648: 1643: 1605:Natural project 1580: 1575: 1561:(1–3): 100–18. 1548: 1517: 1512: 1506: 1493: 1474: 1470: 1468:Further reading 1465: 1426: 1391: 1385: 1372: 1368: 1347: 1332: 1327: 1288: 1282: 1269: 1263: 1247: 1218: 1207: 1202: 1198: 1193: 1192: 1183: 1181: 1178:www.tecmint.com 1172: 1171: 1167: 1159: 1155: 1147: 1143: 1135: 1128: 1120: 1116: 1108: 1099: 1091: 1087: 1079: 1075: 1067: 1060: 1052: 1043: 1038: 1033: 1028: 1023:Vector database 962: 924: 882:bitap algorithm 862: 784: ...  779: 768: 761: 589:in the pattern 540: 535: 530: 529: 508: 503: 502: 493:in the pattern 425: 412: 407: 388: 383: 378: 377: 356: 343: 333: 322: 321: 300: 281: 271: 260: 259: 256: 165:substitution: 68: 28: 23: 22: 15: 12: 11: 5: 2010: 2008: 2000: 1999: 1994: 1989: 1979: 1978: 1972: 1971: 1969: 1968: 1963: 1958: 1953: 1948: 1943: 1938: 1933: 1928: 1922: 1920: 1916: 1915: 1913: 1912: 1907: 1902: 1897: 1892: 1887: 1882: 1877: 1871: 1869: 1867:Data structure 1863: 1862: 1860: 1859: 1854: 1849: 1844: 1838: 1836: 1830: 1829: 1827: 1826: 1821: 1816: 1811: 1805: 1803: 1797: 1796: 1794: 1793: 1788: 1782: 1780: 1776: 1775: 1773: 1772: 1767: 1762: 1760:Trigram search 1757: 1752: 1747: 1742: 1737: 1732: 1726: 1724: 1718: 1717: 1715: 1714: 1709: 1704: 1699: 1694: 1689: 1684: 1679: 1674: 1669: 1664: 1658: 1656: 1650: 1649: 1644: 1642: 1641: 1634: 1627: 1619: 1613: 1612: 1602: 1592: 1586: 1579: 1578:External links 1576: 1574: 1573: 1546: 1528:(3): 395–415. 1510: 1504: 1491: 1486:10.1.1.42.1593 1471: 1469: 1466: 1464: 1463: 1442:10.1.1.14.3856 1435:(3): 331–345. 1424: 1389: 1383: 1366: 1345: 1325: 1304:10.1.1.96.7225 1286: 1280: 1267: 1261: 1249:Cormen, Thomas 1245: 1216: 1212:Proc. SPIRE'98 1199: 1197: 1194: 1191: 1190: 1165: 1153: 1141: 1126: 1114: 1097: 1085: 1073: 1058: 1040: 1039: 1037: 1034: 1032: 1029: 1027: 1026: 1020: 1015: 1010: 1005: 999: 994: 989: 984: 979: 974: 969: 967:Concept search 963: 961: 958: 940:Record linkage 936:spam filtering 928:spell checking 923: 920: 861: 858: 807:) array takes 795:Computing the 777: 766: 759: 653:) is minimal ( 553: 550: 546: 543: 538: 515: 511: 432: 428: 424: 418: 415: 410: 406: 401: 398: 394: 391: 386: 363: 359: 355: 350: 346: 340: 336: 332: 329: 307: 303: 299: 296: 293: 288: 284: 278: 274: 270: 267: 255: 252: 207: 206: 182: 181: 163: 145: 123: 122: 106:substitution: 104: 90: 67: 64: 26: 24: 14: 13: 10: 9: 6: 4: 3: 2: 2009: 1998: 1995: 1993: 1990: 1988: 1985: 1984: 1982: 1967: 1964: 1962: 1959: 1957: 1954: 1952: 1949: 1947: 1944: 1942: 1939: 1937: 1934: 1932: 1929: 1927: 1924: 1923: 1921: 1917: 1911: 1908: 1906: 1903: 1901: 1898: 1896: 1893: 1891: 1888: 1886: 1883: 1881: 1878: 1876: 1873: 1872: 1870: 1868: 1864: 1858: 1855: 1853: 1850: 1848: 1845: 1843: 1840: 1839: 1837: 1835: 1831: 1825: 1822: 1820: 1817: 1815: 1812: 1810: 1807: 1806: 1804: 1802: 1798: 1792: 1789: 1787: 1784: 1783: 1781: 1777: 1771: 1768: 1766: 1763: 1761: 1758: 1756: 1753: 1751: 1748: 1746: 1743: 1741: 1738: 1736: 1733: 1731: 1728: 1727: 1725: 1723: 1719: 1713: 1710: 1708: 1705: 1703: 1700: 1698: 1695: 1693: 1690: 1688: 1685: 1683: 1680: 1678: 1677:Edit distance 1675: 1673: 1670: 1668: 1665: 1663: 1660: 1659: 1657: 1655: 1654:String metric 1651: 1647: 1640: 1635: 1633: 1628: 1626: 1621: 1620: 1617: 1610: 1606: 1603: 1600: 1596: 1593: 1590: 1587: 1585: 1582: 1581: 1577: 1569: 1564: 1560: 1556: 1552: 1547: 1543: 1539: 1535: 1531: 1527: 1523: 1516: 1511: 1507: 1501: 1497: 1492: 1487: 1482: 1478: 1473: 1472: 1467: 1460: 1456: 1452: 1448: 1443: 1438: 1434: 1430: 1425: 1421: 1417: 1412: 1407: 1403: 1399: 1395: 1390: 1386: 1380: 1376: 1370: 1367: 1363: 1359: 1356:(4): 359–73. 1355: 1351: 1346: 1342: 1338: 1331: 1326: 1322: 1318: 1314: 1310: 1305: 1300: 1296: 1292: 1287: 1283: 1277: 1273: 1268: 1264: 1258: 1254: 1250: 1246: 1242: 1238: 1234: 1230: 1226: 1222: 1217: 1213: 1206: 1201: 1200: 1195: 1179: 1175: 1169: 1166: 1162: 1157: 1154: 1150: 1145: 1142: 1138: 1133: 1131: 1127: 1123: 1118: 1115: 1111: 1110:Gusfield 1997 1106: 1104: 1102: 1098: 1094: 1089: 1086: 1082: 1077: 1074: 1070: 1065: 1063: 1059: 1055: 1050: 1048: 1046: 1042: 1035: 1030: 1024: 1021: 1019: 1018:String metric 1016: 1014: 1011: 1009: 1006: 1003: 1000: 998: 995: 993: 990: 988: 985: 983: 980: 978: 975: 973: 970: 968: 965: 964: 959: 957: 955: 950: 948: 943: 941: 937: 933: 929: 921: 919: 917: 913: 909: 905: 901: 896: 894: 891: 887: 883: 878: 876: 872: 867: 859: 857: 855: 851: 847: 843: 839: 838: 832: 830: 827: +  826: 822: 821: 816: 812: 811: 806: 802: 798: 793: 791: 787: 783: 776: 772: 765: 758: 754: 750: 746: 742: 737: 735: 731: 727: 723: 719: 715: 711: 707: 703: 699: 695: 691: 687: 683: 679: 675: 671: 667: 662: 660: 656: 652: 648: 644: 640: 636: 632: 628: 624: 620: 616: 612: 608: 604: 600: 596: 592: 588: 584: 580: 575: 573: 569: 551: 548: 544: 541: 536: 513: 509: 500: 496: 492: 488: 484: 480: 475: 473: 469: 465: 464: 458: 456: 452: 448: 430: 426: 422: 416: 413: 408: 404: 399: 396: 392: 389: 384: 361: 357: 353: 348: 344: 338: 334: 330: 327: 305: 301: 297: 294: 291: 286: 282: 276: 272: 268: 265: 253: 251: 247: 245: 241: 237: 233: 229: 225: 221: 217: 213: 205: 204: 198: 197: 191: 190: 189: 187: 186:transposition 180: 178: 172: 170: 164: 162: 160: 154: 152: 146: 144: 142: 136: 134: 128: 127: 126: 121: 119: 113: 111: 105: 103: 99: 97: 91: 89: 87: 81: 77: 76: 75: 73: 72:edit distance 65: 63: 61: 57: 54:that match a 53: 49: 45: 41: 32: 19: 1880:Suffix array 1786:Aho–Corasick 1697:Lee distance 1661: 1558: 1554: 1525: 1521: 1495: 1476: 1432: 1428: 1401: 1397: 1374: 1369: 1353: 1349: 1340: 1336: 1297:(1): 31–88. 1294: 1290: 1271: 1252: 1224: 1220: 1211: 1182:. Retrieved 1180:. 2018-11-08 1177: 1168: 1161:Boytsov 2011 1156: 1144: 1117: 1093:Navarro 2001 1088: 1076: 1069:Sellers 1980 953: 951: 944: 925: 922:Applications 915: 908:metric trees 904:suffix trees 897: 879: 863: 845: 841: 836: 833: 828: 824: 819: 814: 809: 804: 800: 796: 794: 789: 785: 781: 774: 770: 763: 756: 752: 748: 744: 740: 738: 733: 729: 725: 721: 717: 713: 709: 705: 701: 697: 693: 689: 685: 681: 673: 669: 665: 663: 658: 654: 650: 646: 642: 638: 634: 630: 626: 622: 618: 614: 610: 606: 602: 598: 594: 590: 586: 582: 581:in the text 578: 576: 571: 567: 498: 494: 490: 486: 485:in the text 482: 476: 471: 467: 462: 459: 454: 450: 446: 257: 248: 243: 239: 235: 231: 227: 223: 219: 215: 211: 208: 202: 200: 195: 193: 185: 183: 176: 174: 168: 166: 158: 156: 150: 148: 140: 138: 132: 130: 124: 117: 115: 109: 107: 101: 95: 93: 85: 83: 79: 69: 47: 43: 37: 1890:Suffix tree 1343:(4): 19–27. 1227:(1): 1–91. 1196:Works cited 147:deletion: 129:insertion: 92:deletion: 78:insertion: 1981:Categories 1609:JavaScript 1404:: 168–73. 1184:2022-09-08 1031:References 932:nucleotide 888:searching 664:Computing 633:) for all 246:will not. 1481:CiteSeerX 1437:CiteSeerX 1321:207551224 1299:CiteSeerX 1036:Citations 987:Metaphone 773:(0,  423:… 354:… 60:substring 1420:13381535 1241:15635688 960:See also 900:indexing 831:) time. 780:), then 545:′ 417:′ 393:′ 66:Overview 1956:Sorting 1926:Parsing 1646:Strings 1542:1158099 1459:6776819 1013:Soundex 890:utility 866:on-line 852:(LSH), 803:,  762:,  747:,  732:,  688:,  672:,  649:,  629:,  617:,  56:pattern 52:strings 1540:  1502:  1483:  1457:  1439:  1418:  1381:  1319:  1301:  1278:  1259:  1239:  916:et al. 912:n-gram 470:  238:, and 1919:Other 1875:DAFSA 1842:BLAST 1599:Scala 1538:S2CID 1518:(PDF) 1455:S2CID 1416:S2CID 1333:(PDF) 1317:S2CID 1237:S2CID 1208:(PDF) 893:agrep 854:Tries 704:), E( 236:coils 220:coils 1910:Trie 1900:Rope 1500:ISBN 1379:ISBN 1276:ISBN 1257:ISBN 910:and 886:Unix 680:for 637:and 244:foal 232:foil 228:foal 216:foil 212:coil 1563:doi 1530:doi 1447:doi 1406:doi 1358:doi 1309:doi 1229:doi 954:fzf 869:on 846:all 736:). 661:.) 566:of 474:). 445:in 240:oil 224:oil 102:cot 80:cot 38:In 1983:: 1607:a 1597:a 1559:64 1557:. 1553:. 1536:. 1526:46 1524:. 1520:. 1453:. 1445:. 1433:25 1431:. 1414:. 1402:21 1400:. 1396:. 1352:. 1341:24 1339:. 1335:. 1315:. 1307:. 1295:33 1293:. 1235:. 1225:16 1223:. 1210:. 1176:. 1129:^ 1100:^ 1061:^ 1044:^ 949:. 938:. 906:, 842:mn 815:mn 792:. 574:. 457:. 234:, 214:, 203:ts 201:co 199:→ 196:st 194:co 175:co 173:→ 167:co 157:co 155:→ 149:co 139:co 137:→ 131:co 116:co 114:→ 108:co 100:→ 94:co 84:co 82:→ 42:, 1638:e 1631:t 1624:v 1571:. 1565:: 1544:. 1532:: 1508:. 1489:. 1461:. 1449:: 1422:. 1408:: 1387:. 1364:. 1360:: 1354:1 1323:. 1311:: 1284:. 1265:. 1243:. 1231:: 1187:. 1163:. 1151:. 1139:. 1124:. 1112:. 1095:. 1083:. 1071:. 1056:. 840:( 837:O 829:m 825:n 823:( 820:O 813:( 810:O 805:y 801:x 799:( 797:E 790:P 786:T 782:T 778:1 775:y 771:E 767:2 764:y 760:2 757:x 755:( 753:E 749:y 745:x 743:( 741:E 734:j 730:i 728:( 726:E 722:j 718:i 716:( 714:E 710:j 708:, 706:i 702:j 698:i 696:( 694:E 690:j 686:m 684:( 682:E 674:j 670:m 668:( 666:E 659:P 655:m 651:j 647:m 645:( 643:E 639:j 635:i 631:j 627:i 625:( 623:E 619:j 615:i 613:( 611:E 607:P 603:i 599:j 595:T 591:P 587:i 583:T 579:j 572:j 568:T 552:j 549:, 542:j 537:T 514:i 510:P 499:i 495:P 491:i 487:T 483:j 472:m 468:n 466:( 463:O 455:P 451:T 447:T 431:j 427:t 414:j 409:t 405:= 400:j 397:, 390:j 385:T 362:n 358:t 349:2 345:t 339:1 335:t 331:= 328:T 306:m 302:p 298:. 295:. 292:. 287:2 283:p 277:1 273:p 269:= 266:P 179:t 177:s 171:t 169:a 161:t 159:* 153:t 151:a 143:t 141:a 135:t 133:* 120:t 118:s 112:t 110:a 98:t 96:a 88:t 86:a 20:)

Index

Fuzzy string searching

computer science
strings
pattern
substring
edit distance
O
dynamic programming
Levenshtein distance computing algorithm
O
O
O
Locality-sensitive hashing
Tries
on-line
dynamic programming
Levenshtein distance
bitap algorithm
Unix
utility
agrep
indexing
suffix trees
metric trees
n-gram
spell checking
nucleotide
spam filtering
Record linkage

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.

↑