Knowledge (XXG)

Open addressing

Source 📝

109:; that is, the proportion of the slots in the array that are used. As the load factor increases towards 100%, the number of probes that may be required to find or insert a given key rises dramatically. Once the table becomes full, probing algorithms may even fail to terminate. Even with good hash functions, load factors are normally limited to 80%. A poor hash function can exhibit poor performance even at very low load factors by generating significant clustering, especially with the simplest linear addressing method. Generally typical load factors with most open addressing methods are 50%, whilst 20: 346:(1) remove method above is only possible in linearly probed hash tables with single-slot stepping. In the case where many records are to be deleted in one operation, marking the slots for deletion and later rebuilding may be more efficient. 83:, while double hashing has poor cache performance but exhibits virtually no clustering; quadratic probing falls in-between in both areas. Double hashing can also require more computation than other forms of probing. 306:
For all records in a cluster, there must be no vacant slots between their natural hash position and their current position (else lookups will terminate before finding the record). At this point in the pseudocode,
335:
Another technique for removal is simply to mark the slot as deleted. However this eventually requires rebuilding the table simply to remove deleted records. The methods above provide
125:
is an implementation of an open addressing hash table with linear probing and single-slot stepping, a common approach that is effective if the hash function is good. Each of the
46:) until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table. Well-known probe sequences include: 299:
mark slot as occupied slot.key := slot.key slot.value := slot.value mark slot as unoccupied i := j
102:
move existing keys around in the array to make room for the new key. This gives better maximum search times than the methods based on probing.
384: 398:
Poblete; Viola; Munro. "The Analysis of a Hashing Scheme by the Diagonal Poisson Transform". p. 95 of Jan van Leeuwen (Ed.)
339:(1) updating and removal of existing records, with occasional rebuilding if the high-water mark of the table size grows. 231:
operation to insert all the elements of the old array into the new larger array. It is common to increase the array size
63:
in which the interval between probes increases quadratically (hence, the indices are described by a quadratic function).
95: 454:, in Dictionary of Algorithms and Data Structures , Vreda Pieterse and Paul E. Black, eds. 17 September 2015. 441:, in Dictionary of Algorithms and Data Structures , Vreda Pieterse and Paul E. Black, eds. 17 September 2015. 323:
would naturally land in the hash table if there were no collisions. This test is asking if the record at
220:
i := find_slot(key) mark slot as occupied slot.key := key slot.value := value
76: 71:
in which the interval between probes is fixed for each record but is computed by another hash function.
438: 311:
is a vacant slot that might be invalidating this property for subsequent records in the cluster.
232: 80: 380: 87: 58: 469: 426:"Robin Hood Hashing really has constant average search cost and variance in full tables" 451: 164:(slot is occupied) and (slot.key ≠ key) i := (i + 1) modulo num_slots 99: 66: 50: 327:
is invalidly positioned with respect to the required properties of a cluster now that
463: 355: 227:
Rebuilding the table requires allocating a larger array and recursively using the
412: 399: 75:
The main trade offs between these methods are that linear probing has the best
122: 110: 91: 35: 105:
A critical influence on performance of an open addressing hash table is the
141:
to locate the array slot that either does or should contain a given key.
375:
Tenenbaum, Aaron M.; Langsam, Yedidyah; Augenstein, Moshe J. (1990),
19: 425: 18: 358:– a method of deleting from a hash table using open addressing. 55:
in which the interval between probes is fixed — often set to 1.
42:, or searching through alternative locations in the array (the 159:// search until we either find the key, or find an empty slot. 157:
find_slot(key) i := hash(key) modulo num_slots
216:
the table is almost full rebuild the table larger
23:
Hash collision resolved by linear probing (interval=1).
413:"Efficient C/C++ Programming: Smaller, Faster, Better" 147:
pair { key, value, occupied flag (initially unset) }
38:. With this method a hash collision is resolved by 276:// i > j: |.k..j i....| or |....j i..k.| 268:k := hash(slot.key) modulo num_slots 201:set(key, value) i := find_slot(key) 242:remove(key) i := find_slot(key) 235:, for example by doubling the old array size. 174:lookup(key) i := find_slot(key) 8: 379:, Prentice Hall, pp. 456–461, pp. 472, 253:mark slot as unoccupied j := i 260:j := (j + 1) modulo num_slots 270:// determine if k lies cyclically in (i,j] 137:functions use a common internal function 367: 285:(i < k) and (k ≤ j) 295:(k ≤ j) or (i < k) 86:Some open addressing methods, such as 7: 424:Patricio V. Poblete, Alfredo Viola. 319:is the raw hash where the record at 36:collision resolution in hash tables 16:Hash collision resolution technique 14: 209:slot.value := value 439:"Last-Come First-Served Hashing" 264:slot is unoccupied 113:typically can use up to 100%. 96:last-come-first-served hashing 1: 315:is such a subsequent record. 273:// i ≤ j: | i..k..j | 246:slot is unoccupied 151:pair slot, slot, ..., slot 486: 251:// key is not in the table 79:but is most sensitive to 377:Data Structures Using C 400:"Algorithms - ESA '94" 190:// key is not in table 24: 22: 452:"Robin Hood hashing" 207:// we found our key 205:slot is occupied 178:slot is occupied 281:i ≤ j 180:// key is in table 117:Example pseudocode 92:Robin Hood hashing 25: 111:separate chaining 88:Hopscotch hashing 77:cache performance 59:Quadratic probing 34:, is a method of 477: 455: 448: 442: 435: 429: 422: 416: 409: 403: 396: 390: 389: 372: 485: 484: 480: 479: 478: 476: 475: 474: 460: 459: 458: 450:Paul E. Black, 449: 445: 437:Paul E. Black, 436: 432: 423: 419: 410: 406: 397: 393: 387: 374: 373: 369: 365: 352: 330: 326: 322: 318: 314: 310: 300: 221: 196: 185:slot.value 169: 152: 119: 28:Open addressing 17: 12: 11: 5: 483: 481: 473: 472: 462: 461: 457: 456: 443: 430: 417: 411:Steve Heller. 404: 391: 385: 366: 364: 361: 360: 359: 351: 348: 333: 332: 328: 324: 320: 316: 312: 308: 304: 238: 237: 236: 225: 197: 170: 153: 143: 121:The following 118: 115: 100:cuckoo hashing 73: 72: 69: 67:Double hashing 64: 61: 56: 53: 51:Linear probing 44:probe sequence 32:closed hashing 15: 13: 10: 9: 6: 4: 3: 2: 482: 471: 468: 467: 465: 453: 447: 444: 440: 434: 431: 427: 421: 418: 414: 408: 405: 401: 395: 392: 388: 386:0-13-199746-7 382: 378: 371: 368: 362: 357: 356:Lazy deletion 354: 353: 349: 347: 345: 340: 338: 305: 302: 301: 298: 297:continue loop 294: 291: 288: 287:continue loop 284: 280: 277: 274: 271: 267: 263: 259: 256: 252: 249: 245: 241: 234: 233:exponentially 230: 226: 223: 222: 219: 215: 212: 208: 204: 200: 194: 191: 188: 184: 181: 177: 173: 167: 163: 160: 156: 150: 146: 142: 140: 136: 132: 128: 124: 116: 114: 112: 108: 103: 101: 97: 93: 89: 84: 82: 78: 70: 68: 65: 62: 60: 57: 54: 52: 49: 48: 47: 45: 41: 37: 33: 29: 21: 446: 433: 420: 415:2014. p. 33. 407: 394: 376: 370: 343: 341: 336: 334: 296: 292: 289: 286: 282: 278: 275: 272: 269: 265: 261: 257: 254: 250: 247: 243: 239: 228: 217: 213: 210: 206: 202: 198: 192: 189: 186: 182: 179: 175: 171: 165: 161: 158: 154: 148: 144: 138: 134: 130: 126: 120: 106: 104: 85: 74: 43: 39: 31: 27: 26: 107:load factor 363:References 331:is vacant. 195:not found 123:pseudocode 81:clustering 266:exit loop 139:find_slot 464:Category 350:See also 258:(note 2) 240:function 218:(note 1) 199:function 172:function 155:function 470:Hashing 428:. 2016. 402:. 1994. 40:probing 383:  303:note 2 248:return 224:note 1 211:return 193:return 183:return 166:return 145:record 135:remove 127:lookup 162:while 30:, or 381:ISBN 342:The 290:else 255:loop 187:else 133:and 98:and 229:set 149:var 131:set 466:: 293:if 283:if 279:if 262:if 244:if 214:if 203:if 176:if 168:i 129:, 94:, 90:, 344:O 337:O 329:i 325:j 321:j 317:k 313:j 309:i

Index


collision resolution in hash tables
Linear probing
Quadratic probing
Double hashing
cache performance
clustering
Hopscotch hashing
Robin Hood hashing
last-come-first-served hashing
cuckoo hashing
separate chaining
pseudocode
exponentially
Lazy deletion
ISBN
0-13-199746-7
"Algorithms - ESA '94"
"Efficient C/C++ Programming: Smaller, Faster, Better"
"Robin Hood Hashing really has constant average search cost and variance in full tables"
"Last-Come First-Served Hashing"
"Robin Hood hashing"
Category
Hashing

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.