Hash Flooding


Reading time: about 3 minutes

blah blah

In order to judge the distribution, I’ll use the standard deviation.

In [2]:
statistics.stdev([5, 5, 6, 5, 5])
statistics.stdev([4, 4, 10, 4, 4])
Out [2]:
0.4472135954999579
Out [2]:
2.6832815729997477
In [3]:
def evaluate_hash(fn, values):
    counts = [0] * 256
    
    for val in values:
        h = fn(val)
        counts[h] += 1
    
    return statistics.stdev(counts)
In [4]:
unkeyed_hash("Test message 1")
unkeyed_hash("Test message 2")
unkeyed_hash("Test message 3")
Out [4]:
168
Out [4]:
184
Out [4]:
236

Let’s see how good Python’s built-in hash function is on “normal” input.

In [5]:
values = []

for i in range(5000):
    values.append(f"Test {i}")
    
evaluate_hash(crc32, values)
Out [5]:
1.4656190554459485

Now let’s try to skew it a little by choosing the values more carefully.

In [6]:
values = []

i = 0
while len(values) < 5000:
    val = f"Test {i}"
    if crc32(val) == 5:
        values.append(val)
    i += 1

evaluate_hash(crc32, values)
Out [6]:
312.5

Wow, any data structure or partitioning that expects uniform output from a hash function would be devastated.

In [7]:
hash1 = pearson_with_key("Quentin Coldwater")
hash2 = pearson_with_key("Josh Hoberman")

values = []

i = 0
while len(values) < 5000:
    val = f"Test {i}"
    if hash1(val) == 5:
        values.append(val)
    i += 1
        
evaluate_hash(hash1, values)
Out [7]:
312.5

A similar result, but now let’s try a different key with the same values.

In [8]:
evaluate_hash(hash2, values)
Out [8]:
4.4860349757800115

The following pages link here

Citation

If you find this work useful, please cite it as:
@article{yaltirakliwikihashflooding,
  title   = "Hash Flooding",
  author  = "Yaltirakli, Gokberk",
  journal = "gkbrk.com",
  year    = "2024",
  url     = "https://www.gkbrk.com/wiki/hash-flooding/"
}
Not using BibTeX? Click here for more citation styles.
IEEE Citation
Gokberk Yaltirakli, "Hash Flooding", November, 2024. [Online]. Available: https://www.gkbrk.com/wiki/hash-flooding/. [Accessed Nov. 12, 2024].
APA Style
Yaltirakli, G. (2024, November 12). Hash Flooding. https://www.gkbrk.com/wiki/hash-flooding/
Bluebook Style
Gokberk Yaltirakli, Hash Flooding, GKBRK.COM (Nov. 12, 2024), https://www.gkbrk.com/wiki/hash-flooding/

Comments

© 2024 Gokberk Yaltirakli