Hash Table Algorithms: Efficient Key-Value Storage Explained with Examples

Hash Tables are one of the most powerful and widely used data structures in computer science. They provide a highly efficient way to store and retrieve key-value pairs, allowing operations such as search, insert, and delete to be performed in average O(1) time complexity. This article explores Hash Table Algorithms in detail, covering their working principles, collision handling, advantages, limitations, and real-world implementation.

Table of Contents

What is a Hash Table?

A hash table is a data structure used to map keys to values. Internally, it uses a hash function that converts the key into an index in an array, where the value is stored. Think of it as a super-fast dictionary where you can look up data by key instantly.

How Hashing Works

Hashing is the process of converting a large key into a smaller value called a hash code. The hash code is then mapped to an index in an array. For example, if the array size is 10 and a key generates hash code 12345, the index will be 12345 % 10 = 5.

Steps in Hash Table Storage

Apply the hash function to the key.
Reduce the hash code to a valid array index using modulus operation.
Store the value at that index in the array.

Collision Handling Techniques

When two keys map to the same index, it is called a collision. Hash tables must handle collisions efficiently to maintain performance. Let’s look at popular methods:

1. Chaining (Separate Chaining)

In chaining, each array index stores a linked list (or another dynamic structure) of values that hash to the same index.


# Python Example: Chaining in Hash Table
class HashTableChaining:
    def __init__(self, size):
        self.size = size
        self.table = [[] for _ in range(size)]
    
    def _hash(self, key):
        return hash(key) % self.size
    
    def insert(self, key, value):
        index = self._hash(key)
        for item in self.table[index]:
            if item[0] == key:
                item[1] = value
                return
        self.table[index].append([key, value])
    
    def search(self, key):
        index = self._hash(key)
        for item in self.table[index]:
            if item[0] == key:
                return item[1]
        return None

ht = HashTableChaining(5)
ht.insert("Alice", 30)
ht.insert("Bob", 25)
print(ht.search("Alice"))  # Output: 30
print(ht.search("Charlie"))  # Output: None

2. Open Addressing

Instead of linked lists, open addressing places the collided item in another empty slot in the array. Common strategies include:

Linear Probing: Search sequentially for the next empty slot.
Quadratic Probing: Search slots based on a quadratic function.
Double Hashing: Use a second hash function to resolve collisions.


# Python Example: Open Addressing with Linear Probing
class HashTableLinear:
    def __init__(self, size):
        self.size = size
        self.table = [None] * size
    
    def _hash(self, key):
        return hash(key) % self.size
    
    def insert(self, key, value):
        index = self._hash(key)
        start_index = index
        while self.table[index] is not None:
            if self.table[index][0] == key:
                self.table[index] = (key, value)
                return
            index = (index + 1) % self.size
            if index == start_index:
                raise Exception("Hash Table is full")
        self.table[index] = (key, value)
    
    def search(self, key):
        index = self._hash(key)
        start_index = index
        while self.table[index] is not None:
            if self.table[index][0] == key:
                return self.table[index][1]
            index = (index + 1) % self.size
            if index == start_index:
                break
        return None

ht = HashTableLinear(5)
ht.insert("Alice", 30)
ht.insert("Bob", 25)
print(ht.search("Bob"))   # Output: 25

Time and Space Complexity of Hash Tables

Operation	Average Case	Worst Case
Insertion	O(1)	O(n)
Search	O(1)	O(n)
Deletion	O(1)	O(n)

Real-World Applications of Hash Tables

Dictionaries/Maps: Used in programming languages for quick data lookup.
Databases: Indexing for fast retrieval.
Caches: Store recent computations for fast reuse.
Compilers: Used in symbol tables for variable and function storage.

Interactive Example of Hashing in Python

You can interactively test a simple hash table with chaining:


# Run interactively in Python
ht = HashTableChaining(10)
ht.insert("name", "Alice")
ht.insert("age", 25)
ht.insert("country", "India")

for key in ["name", "age", "city"]:
    print(f"{key}: {ht.search(key)}")

Expected Output:

name: Alice
age: 25
city: None

Advantages and Limitations

Advantages

Very fast lookups and insertions (average O(1)).
Used extensively in real-world systems (databases, caches, libraries).
Supports dynamic and unstructured data well.

Limitations

Worst-case time complexity can degrade to O(n) due to collisions.
Inefficient if poorly sized or hash function is not well designed.
Not memory efficient when load factor is high or collisions are frequent.

Conclusion

Hash Table algorithms enable extremely fast key-value storage by applying hashing and efficient collision handling. Whether using chaining or open addressing, hash tables form the backbone of programming languages, databases, and caching mechanisms. Mastering their working is essential for any software engineer working on performance-critical applications.