Cologne phonetics

Cologne phonetics (also Kölner Phonetik, Cologne process) is a phonetic algorithm that assigns to each word a sequence of digits, the phonetic code. The aim is for identical-sounding words to receive the same code, so the algorithm can be used to perform a similarity search between words. For example, in a list of names it can find entries like "Meier" under different spellings such as "Maier", "Mayer", or "Mayr". Cologne phonetics is related to the well-known Soundex phonetic algorithm but is optimized for the German language. The algorithm was published in 1969 by Hans Joachim Postel.

Class: Phonetic algorithm
Worst-case performance: O(N)
Best-case performance: O(N)
Average performance: O(N)
Worst-case space complexity: O(N)

Method

Cologne phonetics maps each letter of a word to a digit between "0" and "8". To select the appropriate digit, at most one adjacent letter is used as context. Some rules apply specifically to the initial letter of a word. In this way, similar sounds are supposed to be assigned the same code. The letters "W" and "V", for example, are both encoded with the digit "3". The phonetic code for "Wikipedia" is "3412" (W=3, K=4, P=1, and D=2). Unlike Soundex codes, the codes produced by Cologne phonetics are not limited in length.

Procedure

| Letter | Context | Code |
|---|---|---|
| A, E, I, J, O, U, Y | | 0 |
| H | | - (no digit) |
| B | | 1 |
| P | not before H | 1 |
| D, T | not before C, S, Z | 2 |
| F, V, W | | 3 |
| P | before H | 3 |
| G, K, Q | | 4 |
| C | in initial position before A, H, K, L, O, Q, R, U, X | 4 |
| C | before A, H, K, O, Q, U, X except after S, Z | 4 |
| X | not after C, K, Q | 48 |
| L | | 5 |
| M, N | | 6 |
| R | | 7 |
| S, Z | | 8 |
| C | after S, Z | 8 |
| C | in initial position except before A, H, K, L, O, Q, R, U, X | 8 |
| C | not before A, H, K, O, Q, U, X | 8 |
| D, T | before C, S, Z | 8 |
| X | after C, K, Q | 8 |

For the letter "C", the rule "SC" takes priority over the rule "CH". This is reflected by the addition of "except after S, Z" in row 10 of the table. It is not explicitly mentioned in the original publication but can be inferred from the examples listed there, e.g. for "Breschnew" the code "17863" is specified.

Lowercase letters are encoded accordingly; all other characters (such as hyphens) are ignored. The umlauts Ä, Ö, Ü and the letter ß are not covered by the conversion table; it is natural to map the umlauts to the vowels (code "0") and ß to the group S, Z (code "8").

Processing of a word is done in three steps:

1. Encode letter by letter from left to right according to the conversion table.
2. Collapse every run of identical adjacent digits into a single digit.
3. Remove all "0" digits except at the beginning.

Example

The name Müller-Lüdenscheidt is coded as follows:

1. Encode each letter: 60550750206880022
2. Collapse all runs of identical consecutive digits: 6050750206802
3. Remove all "0" digits (none is at the beginning here): 65752682

Literature

Hans Joachim Postel: Die Kölner Phonetik. Ein Verfahren zur Identifizierung von Personennamen auf der Grundlage der Gestaltanalyse. In: IBM-Nachrichten, 19. Jahrgang (volume 19), 1969, pp. 925-931.

See also

Metaphone

External links

Martin Wilz: Aspekte der Kodierung phonetischer Ähnlichkeiten in deutschen Eigennamen (PDF file; 502 kB). Master's thesis (Magisterarbeit) at the Philosophische Fakultät of the University of Cologne, 2005; contains an implementation in the Perl programming language.
Maroš Kollár: Perl implementation of Cologne phonetics and similar methods, available as free software on CPAN (Comprehensive Perl Archive Network).
Andy Theiler: PHP and Oracle PL/SQL implementation of Cologne phonetics.
Nicolas Zimmer: PHP implementation of Cologne phonetics, in a comment on the soundex entry in the PHP manual, 2008.
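
The conversion table and the three processing steps described under Procedure can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the names `cologne_steps`, `cologne_phonetics`, and `_digit` are hypothetical, the whole input is treated as a single word, ignored characters are stripped before neighbouring letters are looked at, and the umlaut and ß handling follows the suggestion in the text rather than the original table. Accented letters other than Ä, Ö, Ü are simply dropped, which is also an assumption.

```python
# Sketch of Cologne phonetics; helper names are illustrative assumptions.
VOWELS = frozenset("AEIJOUYÄÖÜ")  # umlauts get code "0", as the text suggests
LETTERS = VOWELS | frozenset("BCDFGHKLMNPQRSTVWXZ")

# Letters whose code does not depend on context ("H" yields no digit).
SIMPLE = {"H": "", "B": "1", "F": "3", "V": "3", "W": "3",
          "G": "4", "K": "4", "Q": "4", "L": "5",
          "M": "6", "N": "6", "R": "7", "S": "8", "Z": "8"}

CSZ = frozenset("CSZ")
CKQ = frozenset("CKQ")
SZ = frozenset("SZ")
C_HARD_INITIAL = frozenset("AHKLOQRUX")  # initial C before these -> "4"
C_HARD = frozenset("AHKOQUX")            # other C before these -> "4"


def _digit(prev: str, ch: str, nxt: str) -> str:
    """Code for one letter; prev/nxt are "" at the word boundaries."""
    if ch in VOWELS:
        return "0"
    if ch in SIMPLE:
        return SIMPLE[ch]
    if ch == "P":
        return "3" if nxt == "H" else "1"
    if ch in ("D", "T"):
        return "8" if nxt in CSZ else "2"
    if ch == "X":
        return "8" if prev in CKQ else "48"
    # Remaining letter is "C".
    if prev == "":                      # initial position
        return "4" if nxt in C_HARD_INITIAL else "8"
    if prev in SZ:                      # "SC" takes priority over "CH"
        return "8"
    return "4" if nxt in C_HARD else "8"


def cologne_steps(word: str) -> tuple[str, str, str]:
    """Return the result of each of the three processing steps."""
    # Map ß to the S/Z group, uppercase, and drop everything else.
    letters = [c for c in word.replace("ß", "S").upper() if c in LETTERS]
    # Step 1: encode letter by letter, left to right.
    raw = "".join(
        _digit(letters[i - 1] if i > 0 else "",
               ch,
               letters[i + 1] if i + 1 < len(letters) else "")
        for i, ch in enumerate(letters)
    )
    # Step 2: collapse runs of identical adjacent digits.
    collapsed = "".join(d for i, d in enumerate(raw)
                        if i == 0 or d != raw[i - 1])
    # Step 3: remove every "0" except one at the very beginning.
    code = collapsed[:1] + collapsed[1:].replace("0", "")
    return raw, collapsed, code


def cologne_phonetics(word: str) -> str:
    """Final phonetic code of a word."""
    return cologne_steps(word)[2]


print(cologne_steps("Müller-Lüdenscheidt"))
# ('60550750206880022', '6050750206802', '65752682')
print(cologne_phonetics("Wikipedia"))   # 3412
print(cologne_phonetics("Breschnew"))   # 17863
print({cologne_phonetics(n) for n in ("Meier", "Maier", "Mayer", "Mayr")})
# {'67'}
```

The context sets are stored as `frozenset` objects rather than strings so that the empty neighbour `""` at a word boundary never matches a rule. The last line shows the similarity search mentioned in the introduction: all four spellings of "Meier" receive the same code.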

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.