This is a simple hash table implementation written in plain old C. The goal
is for this code to be reusable in many of the projects I've worked on that
could use something better than a poorly-tuned chaining implementation.
The intention is that users of this code copy it directly into their
repositories, as it's quite small and should see very little development.
In summary, this implementation uses open addressing with rehashing. For
each entry, the table stores:
* uint32_t hash of the key.
* pointer to the key.
* pointer to the data.
Inserts occur at key->hash % hash->size. When an insert collides, the insert
reattempts at (key->hash % hash->size + hash->reprobe) % hash->size, and
onwards at increments of reprobe until a free or dead entry is found.
When searching, the search starts at key->hash % hash->size and continues at
increments of reprobe as with inserts, until the matching entry or an
unallocated entry is found.
When deleting an entry, the entry is marked deleted rather than freed, so
that searches probing past it still reach entries stored further along.
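
The probing scheme described above can be sketched roughly as follows. This
is an illustration only, not the code shipped here: every name in it
(struct slot, struct table, the sketch_* functions, the explicit state tag)
is hypothetical. It also assumes that reprobe and size share no common
factor, so that the probe sequence visits every slot, and it chooses to
overwrite the data pointer when the key is already present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names here are hypothetical; they are not this library's API. */
enum slot_state { SLOT_FREE = 0, SLOT_USED, SLOT_DEAD };

struct slot {
    uint32_t hash;          /* cached hash of the key */
    const char *key;        /* pointer to the key (not copied) */
    void *data;             /* pointer to the caller's data */
    enum slot_state state;  /* explicit tag; an assumption of this sketch */
};

struct table {
    struct slot *slots;
    uint32_t size;     /* number of slots */
    uint32_t reprobe;  /* probe step, assumed coprime with size */
};

/* Return the slot holding key, or NULL. Deleted slots are skipped;
 * the walk stops at the first never-used slot. */
static struct slot *sketch_search(struct table *t, uint32_t hash,
                                  const char *key)
{
    assert(t->size > 0);
    uint32_t start = hash % t->size;
    uint32_t i = start;

    do {
        struct slot *s = &t->slots[i];
        if (s->state == SLOT_FREE)
            return NULL;
        if (s->state == SLOT_USED && s->hash == hash &&
            strcmp(s->key, key) == 0)
            return s;
        i = (i + t->reprobe) % t->size;
    } while (i != start);
    return NULL;
}

/* Store key at the first free or dead slot on its probe sequence.
 * Returns 0 on success, -1 if every slot is in use. */
static int sketch_insert(struct table *t, uint32_t hash, const char *key,
                         void *data)
{
    struct slot *s = sketch_search(t, hash, key);
    if (s != NULL) {       /* key already present: replace its data */
        s->data = data;
        return 0;
    }
    uint32_t start = hash % t->size;
    uint32_t i = start;

    do {
        s = &t->slots[i];
        if (s->state != SLOT_USED) {
            s->hash = hash;
            s->key = key;
            s->data = data;
            s->state = SLOT_USED;
            return 0;
        }
        i = (i + t->reprobe) % t->size;
    } while (i != start);
    return -1;
}

/* Mark the entry deleted instead of freeing it, so later probes
 * still walk past it. Returns 0 on success, -1 if key is absent. */
static int sketch_remove(struct table *t, uint32_t hash, const char *key)
{
    struct slot *s = sketch_search(t, hash, key);
    if (s == NULL)
        return -1;
    s->state = SLOT_DEAD;
    return 0;
}
```

With size 7 and reprobe 3, two keys whose hashes are 1 and 8 both start at
slot 1; the second lands at slot 4. Removing the first leaves a dead marker
at slot 1, and a search for the second still finds it by probing past that
marker.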
Performance considerations:
* Only an extra 10% of free entries is provided at any table size.
  This means that as entries are added, the performance of insertion and
  lookups will degrade as the table approaches its maximum entry count,
  until the table gets resized. This shouldn't be a concern unless an
  outside entry manager caps the number of entries close to the hash
  table's current size limit.
* Repeated deletions fill the table with deleted entry markers.
  This means that a table that was filled, then emptied, will have
  performance for unsuccessful searches in O(hash->size).
  This is worked around in practice: a later insert into a hash table
  with many deleted entries triggers a rehash at the current size.
* The data pointer increases space consumption for the hash table by around
  50%.
  For some applications, such as tracking a set, the data pointer can
  be removed from the interface and code relatively easily.
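
The "around 50%" figure for the data pointer can be checked with a small
sketch. The two struct layouts below are assumptions made for illustration
(the real entry struct may differ), and the result depends on the target's
pointer size and padding rules. On a typical LP64 target the two structs
are 24 and 16 bytes, which gives 50.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layouts, not taken from the actual source. */
struct entry_with_data {
    uint32_t hash;
    const void *key;
    void *data;
};

struct entry_set_only {
    uint32_t hash;
    const void *key;
};

/* Extra space per entry, in percent, caused by the data pointer. */
static unsigned data_pointer_overhead_percent(void)
{
    size_t with = sizeof(struct entry_with_data);
    size_t without = sizeof(struct entry_set_only);

    assert(with > without);
    return (unsigned)((with - without) * 100 / without);
}
```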
In addition to the core hash_table implementation, a sample implementation
of the 32-bit FNV-1a hash function is included for convenience, for those
who don't wish to analyze hash functions on their own.
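
For reference, 32-bit FNV-1a is short enough to sketch here. The function
name and signature below are hypothetical and need not match the included
sample; the offset basis and prime are the standard published 32-bit FNV
constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a sketch of 32-bit FNV-1a over a byte buffer. */
static uint32_t fnv1a_32_sketch(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t hash = 2166136261u;   /* FNV 32-bit offset basis */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        hash ^= p[i];              /* FNV-1a: xor the byte in first... */
        hash *= 16777619u;         /* ...then multiply by the FNV prime */
    }
    return hash;
}
```

The xor-then-multiply order is what distinguishes FNV-1a from FNV-1, which
multiplies first.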