Knowledge

Caverphone

Source 📝

60:
in 2002, revised in 2004. It was created to assist in data matching between late 19th century and early 20th century electoral rolls, where the name only needed to be in a "commonly recognisable form". The algorithm was intended to apply to those names that could not easily be matched between
1252:
Thompson -> thompson thompson -> th3mps3n th3mps3n -> th3mpS3n th3mpS3n -> Th3mpS3n Th3mpS3n -> Th3mPS3n Th3mPS3n -> Th3MPS3n Th3MPS3n -> Th3MPS3N Th3MPS3N -> T23MPS3N T23MPS3N -> TMPSN TMPSN1111111111 -> TMPSN11111
1240:
Thompson -> thompson thompson -> th3mps3n th3mps3n -> th3mpS3n th3mpS3n -> Th3mpS3n Th3mpS3n -> Th3mPS3n Th3mPS3n -> Th3MPS3n Th3MPS3n -> Th3MPS3N Th3MPS3N -> T23MPS3N T23MPS3N -> TMPSN TMPSN111111 -> TMPSN1
61:
electoral rolls, after the exact matches were removed from the pool of potential matches. The algorithm is optimised for accents present in the study area (southern part of the city of
1267: 1368: 32:
invented to identify English names with their sounds, originally built to process a custom dataset compound between 1893 and 1938 in southern
1313: 1450: 29: 1336:
Phua, Clifton; Lee, Vincent; Smith, Kate (2006). "The Personal Name Problem And a Recommended Data Mining Solution".
1432: 1249:
Lee -> lee lee -> le le -> l3 l3 -> L3 L3 -> LA LA -> LA1111111111 LA1111111111 -> LA11111111
1341: 78:
The rules of the algorithm are applied consecutively to any particular name, as a series of replacements.
1272: 1220:
Vowels are normally a, e, i, o, u but depending on the data might include characters such as æ, ā, or ø
1346: 53: 1309: 1303: 1282: 49: 1406: 1401: 1237:
Lee -> lee lee -> l33 l33 -> L33 L33 -> L L -> L111111 L111111 -> L11111
1444: 1364: 1420: 57: 21: 40:, it has been developed to accommodate and process general English since then. 1277: 37: 25: 1426: 1395: 1262: 62: 33: 1414: 1394:- Caversham data set of names and accents in the southern part of 1391: 1200:
This may vary if the set of letters includes characters such as
624:
Remove anything not in the standard alphabet (typically
1268:
New York State Identification and Intelligence System
36:, New Zealand. Started from a similar concept as 48:The Caverphone was created by David Hood in the 1369:National Institute of Standards and Technology 8: 1338:Encyclopedia of Data Warehousing and Mining 1302:Milette, Greg; Stroud, Adam (2012-05-18). 1359: 1357: 1345: 1308:. John Wiley & Sons. pp. 421–. 1305:Professional Android Sensor Programming 1294: 1193: 7: 1402:Original (2002) Caverphone algorithm 1407:Revised (2004) Caverphone algorithm 1433:caverphone algorithm (version 2.0) 14: 1435:- AdvaS Advanced Search project 1: 81:The algorithm is as follows: 1419:Java implementation in the 1398:, New Zealand in 1893-1938. 97:If the name starts with... 30:phonetic matching algorithm 1467: 1415:C# Revised Implementation 1072:all other occurrences of 489:all other occurrences of 637:If the name starts with 1431:Python Implementation 696:If the name ends with 147:If the name ends with 1273:Match rating approach 1011:groups of the letter 1001:groups of the letter 991:groups of the letter 981:groups of the letter 971:groups of the letter 961:groups of the letter 951:groups of the letter 424:groups of the letter 414:groups of the letter 404:groups of the letter 394:groups of the letter 384:groups of the letter 374:groups of the letter 364:groups of the letter 1421:Apache Commons Codec 1185:first ten characters 1159:, replace the final 1123:if the name ends in 1091:if the name ends in 1039:if the name ends in 621:Convert to lowercase 606:first six characters 91:Remove anything not 1451:Phonetic algorithms 1155:if the name end in 54:University of Otago 1427:PHP implementation 1127:replace the final 1095:replace the final 1043:replace the final 1411:Implementations: 1392:Caversham Project 1283:Cologne phonetics 618:Start with a word 50:Caversham Project 1458: 1379: 1378: 1376: 1375: 1361: 1352: 1351: 1349: 1333: 1327: 1326: 1324: 1322: 1299: 1221: 1218: 1212: 1198: 153:, replace it by 139:, replace it by 130:, replace it by 121:, replace it by 112:, replace it by 103:, replace it by 65:, New Zealand). 1466: 1465: 1461: 1460: 1459: 1457: 1456: 1455: 1441: 1440: 1388: 1383: 1382: 1373: 1371: 1363: 1362: 1355: 1347:10.1.1.127.5111 1335: 1334: 1330: 1320: 1318: 1316: 1301: 1300: 1296: 1291: 1259: 1254: 1250: 1247: 1242: 1238: 1235: 1230: 1225: 1224: 1219: 1215: 1199: 1195: 1190: 615: 76: 71: 46: 12: 11: 5: 1464: 1462: 1454: 1453: 1443: 1442: 1439: 1438: 1437: 1436: 1429: 1424: 1417: 1409: 1404: 1399: 1387: 1386:External links 1384: 1381: 1380: 1353: 1328: 1314: 1293: 1292: 1290: 1287: 1286: 1285: 1280: 1275: 1270: 1265: 1258: 1255: 1251: 1248: 1246: 1245:Caverphone 2.0 1243: 1239: 1236: 1234: 1233:Caverphone 1.0 1231: 1229: 1226: 1223: 1222: 1213: 1192: 1191: 1189: 1188: 1181: 1174: 1167: 1153: 1146: 1145: 1144: 1135: 1121: 1112: 1103: 1089: 1080: 1070: 1060: 1051: 1037: 1028: 1019: 1009: 999: 989: 979: 969: 959: 949: 940: 931: 922: 913: 903: 893: 884: 874: 864: 855: 846: 837: 828: 819: 810: 801: 792: 783: 774: 765: 756: 747: 738: 729: 720: 708: 707: 706: 694: 693: 692: 683: 674: 665: 656: 647: 635: 629: 622: 619: 614: 613:Caverphone 2.0 611: 610: 609: 602: 595: 594: 593: 588: 580: 579: 578: 569: 560: 551: 542: 533: 524: 515: 506: 497: 487: 477: 468: 459: 450: 441: 432: 422: 412: 402: 392: 382: 372: 362: 353: 344: 335: 325: 315: 306: 297: 288: 279: 270: 261: 252: 243: 234: 225: 216: 207: 198: 189: 180: 171: 159: 158: 157: 145: 144: 143: 134: 125: 116: 107: 95: 89: 75: 74:Caverphone 1.0 72: 70: 67: 45: 42: 13: 10: 9: 6: 4: 3: 2: 1463: 1452: 1449: 1448: 1446: 1434: 1430: 1428: 1425: 1422: 1418: 1416: 1413: 1412: 1410: 1408: 1405: 1403: 1400: 1397: 1393: 1390: 1389: 1385: 1370: 1366: 1360: 1358: 1354: 1348: 1343: 1339: 1332: 1329: 1317: 1315:9781118240458 1311: 1307: 1306: 1298: 1295: 1288: 1284: 1281: 1279: 1276: 1274: 1271: 1269: 1266: 1264: 1261: 1260: 1256: 1244: 1232: 1227: 1217: 1214: 1211: 1207: 1203: 1197: 1194: 1186: 1182: 1179: 1175: 1172: 1168: 1166: 1162: 1158: 1154: 1151: 1147: 1143: 1139: 1136: 1134: 1130: 1126: 1122: 1120: 1116: 1113: 1111: 1107: 1104: 1102: 1098: 1094: 1090: 1088: 1084: 1081: 1079: 1075: 1071: 1069: 1065: 1061: 1059: 1055: 1052: 1050: 1046: 1042: 1038: 1036: 1032: 1029: 1027: 1023: 1020: 1018: 1014: 1010: 1008: 1004: 1000: 998: 994: 990: 988: 984: 980: 978: 974: 970: 968: 964: 960: 958: 954: 950: 948: 944: 941: 939: 935: 932: 930: 926: 923: 921: 917: 914: 912: 908: 904: 902: 898: 894: 892: 888: 885: 883: 879: 875: 873: 869: 865: 863: 859: 856: 854: 850: 847: 845: 841: 838: 836: 832: 829: 827: 823: 820: 818: 814: 811: 809: 805: 802: 800: 796: 793: 791: 787: 784: 782: 778: 775: 773: 769: 766: 764: 760: 757: 755: 751: 748: 746: 742: 739: 737: 733: 730: 728: 724: 721: 719: 715: 712: 711: 709: 705: 701: 698: 697: 695: 691: 687: 684: 682: 678: 675: 673: 669: 666: 664: 660: 657: 655: 651: 648: 646: 642: 639: 638: 636: 634: 631:Remove final 630: 627: 623: 620: 617: 616: 612: 607: 603: 600: 596: 592: 589: 587: 584: 583: 581: 577: 573: 570: 568: 564: 561: 559: 555: 552: 550: 546: 543: 541: 537: 534: 532: 528: 525: 523: 519: 516: 514: 510: 507: 505: 501: 498: 496: 492: 488: 486: 482: 478: 476: 472: 469: 467: 463: 460: 458: 454: 451: 449: 445: 442: 440: 436: 433: 431: 427: 423: 421: 417: 413: 411: 407: 403: 401: 397: 393: 391: 387: 383: 381: 377: 373: 371: 367: 363: 361: 357: 354: 352: 348: 345: 343: 339: 336: 334: 330: 326: 324: 320: 316: 314: 310: 307: 305: 301: 298: 296: 292: 289: 287: 283: 280: 278: 274: 271: 269: 265: 262: 260: 256: 253: 251: 247: 244: 242: 238: 235: 233: 229: 226: 224: 220: 217: 215: 211: 208: 206: 202: 199: 197: 193: 190: 188: 184: 181: 179: 175: 172: 170: 166: 163: 162: 160: 156: 152: 149: 148: 146: 142: 138: 135: 133: 129: 126: 124: 120: 117: 115: 111: 108: 106: 102: 99: 98: 96: 94: 90: 88: 84: 83: 82: 79: 73: 68: 66: 64: 59: 55: 51: 43: 41: 39: 35: 31: 27: 23: 19: 1372:. Retrieved 1365:"Caverphone" 1337: 1331: 1319:. Retrieved 1304: 1297: 1216: 1209: 1205: 1201: 1196: 1184: 1180:s on the end 1177: 1170: 1164: 1160: 1156: 1149: 1141: 1137: 1132: 1128: 1124: 1118: 1114: 1109: 1105: 1100: 1096: 1092: 1086: 1082: 1077: 1073: 1067: 1063: 1057: 1053: 1048: 1044: 1040: 1034: 1030: 1025: 1021: 1016: 1012: 1006: 1002: 996: 992: 986: 982: 976: 972: 966: 962: 956: 952: 946: 942: 937: 933: 928: 924: 919: 915: 910: 906: 900: 896: 890: 886: 881: 877: 871: 867: 861: 857: 852: 848: 843: 839: 834: 830: 825: 821: 816: 812: 807: 803: 798: 794: 789: 785: 780: 776: 771: 767: 762: 758: 753: 749: 744: 740: 735: 731: 726: 722: 717: 713: 703: 699: 689: 685: 680: 676: 671: 667: 662: 658: 653: 649: 644: 640: 632: 625: 605: 598: 590: 585: 575: 571: 566: 562: 557: 553: 548: 544: 539: 535: 530: 526: 521: 517: 512: 508: 503: 499: 494: 490: 484: 480: 479:any initial 474: 470: 465: 461: 456: 452: 447: 443: 438: 434: 429: 425: 419: 415: 409: 405: 399: 395: 389: 385: 379: 375: 369: 365: 359: 355: 350: 346: 341: 337: 332: 328: 322: 318: 317:any initial 312: 308: 303: 299: 294: 290: 285: 281: 276: 272: 267: 263: 258: 254: 249: 245: 240: 236: 231: 227: 222: 218: 213: 209: 204: 200: 195: 191: 186: 182: 177: 173: 168: 164: 154: 150: 140: 136: 131: 127: 122: 118: 113: 109: 104: 100: 92: 86: 80: 77: 47: 17: 15: 1321:19 February 1187:as the code 1169:remove all 1148:remove all 1062:an initial 905:an initial 895:an initial 866:an initial 608:as the code 582:remove all 85:Convert to 58:New Zealand 22:linguistics 1374:2018-08-20 1289:References 876:all other 601:on the end 327:all other 18:Caverphone 1342:CiteSeerX 1278:Metaphone 1183:take the 604:take the 87:lowercase 69:Procedure 44:Etymology 38:metaphone 26:computing 1445:Category 1257:See also 1228:Examples 1176:put ten 1066:with an 870:with an 710:Replace 702:make it 688:make it 679:make it 670:make it 661:make it 652:make it 643:make it 597:put six 483:with an 321:with an 161:Replace 1423:project 1396:Dunedin 1263:Soundex 1076:with a 1015:with a 1005:with a 995:with a 985:with a 975:with a 965:with a 955:with a 880:with a 493:with a 428:with a 418:with a 408:with a 398:with a 388:with a 378:with a 368:with a 331:with a 63:Dunedin 52:at the 34:Dunedin 28:, is a 20:within 1344:  1312:  878:vowels 681:trou2f 677:trough 672:enou2f 668:enough 329:vowels 132:enou2f 128:enough 1208:, or 1163:with 1140:with 1131:with 1117:with 1108:with 1099:with 1085:with 1056:with 1047:with 1033:with 1024:with 945:with 936:with 927:with 918:with 909:with 899:with 889:with 868:vowel 860:with 851:with 842:with 833:with 824:with 815:with 806:with 797:with 788:with 779:with 770:with 761:with 752:with 743:with 734:with 725:with 716:with 663:tou2f 659:tough 654:rou2f 650:rough 645:cou2f 641:cough 574:with 565:with 556:with 547:with 538:with 529:with 520:with 511:with 502:with 473:with 464:with 455:with 446:with 437:with 358:with 349:with 340:with 319:vowel 311:with 302:with 293:with 284:with 275:with 266:with 257:with 248:with 239:with 230:with 221:with 212:with 203:with 194:with 185:with 176:with 167:with 123:tou2f 119:tough 114:rou2f 110:rough 105:cou2f 101:cough 1323:2013 1310:ISBN 929:3kh3 925:3gh3 342:3kh3 338:3gh3 24:and 16:The 1035:Wh3 1031:wh3 817:sia 813:tia 808:sio 804:tio 754:2ch 750:tch 626:a-z 466:Why 462:why 457:Wh3 453:wh3 268:sia 264:tia 259:sio 255:tio 205:2ch 201:tch 93:A-Z 56:in 1447:: 1367:. 1356:^ 1340:. 1204:, 1119:L3 1115:l3 1087:R3 1083:r3 1026:W3 1022:w3 938:22 934:gh 901:Y3 897:y3 853:s2 849:sh 835:fh 831:ph 799:2g 795:dg 745:sy 741:cy 736:se 732:ce 727:si 723:ci 718:2q 714:cq 704:m2 700:mb 690:2n 686:gn 567:Y3 563:y3 540:Ly 536:ly 531:L3 527:l3 513:Ry 509:ry 504:R3 500:r3 448:Wy 444:wy 439:W3 435:w3 351:22 347:gh 304:s2 300:sh 286:fh 282:ph 250:2g 246:dg 196:sy 192:cy 187:se 183:ce 178:si 174:ci 169:2q 165:cq 155:m2 151:mb 141:2n 137:gn 1377:. 1350:. 1325:. 1210:ø 1206:ā 1202:æ 1178:1 1173:s 1171:3 1165:A 1161:3 1157:3 1152:s 1150:2 1142:2 1138:l 1133:3 1129:l 1125:l 1110:2 1106:r 1101:3 1097:r 1093:r 1078:2 1074:h 1068:A 1064:h 1058:2 1054:w 1049:3 1045:w 1041:w 1017:N 1013:n 1007:M 1003:m 997:F 993:f 987:K 983:k 977:P 973:p 967:T 963:t 957:S 953:s 947:k 943:g 920:3 916:y 911:A 907:y 891:y 887:j 882:3 872:A 862:s 858:z 844:p 840:b 826:t 822:d 790:f 786:v 781:k 777:x 772:k 768:q 763:k 759:c 633:e 628:) 599:1 591:3 586:2 576:2 572:y 558:y 554:j 549:2 545:l 522:2 518:r 495:2 491:h 485:A 481:h 475:2 471:w 430:N 426:n 420:M 416:m 410:F 406:f 400:K 396:k 390:P 386:p 380:T 376:t 370:S 366:s 360:k 356:g 333:3 323:A 313:s 309:z 295:p 291:b 277:t 273:d 241:f 237:v 232:k 228:x 223:k 219:q 214:k 210:c

Index

linguistics
computing
phonetic matching algorithm
Dunedin
metaphone
Caversham Project
University of Otago
New Zealand
Dunedin
Soundex
New York State Identification and Intelligence System
Match rating approach
Metaphone
Cologne phonetics
Professional Android Sensor Programming
ISBN
9781118240458
CiteSeerX
10.1.1.127.5111


"Caverphone"
National Institute of Standards and Technology
Caversham Project
Dunedin
Original (2002) Caverphone algorithm
Revised (2004) Caverphone algorithm
C# Revised Implementation
Apache Commons Codec
PHP implementation

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.