Hash Tables

A Hash Table (also called Hash Map) is a data structure that maps keys to values for highly efficient lookups. By using a hash function to compute array indices, hash tables achieve average O(1) time complexity for search, insertion, and deletion operations.

Interactive Hash Table Visualization

Watch keys move through the hash function, see collisions resolved through separate chaining, and observe how search operations traverse chains to find values.

Hash Table Operations (Separate Chaining)

Insert Key-Value Pair

Add or update a key-value pair

Search Key

Find a key in the hash table

Animation Speed: 1000ms

Hash Table Visualization

Hash Function

Sum of character codes % 7

hash(key) = Σ(char codes) % 7

Bucket 0

Empty

Bucket 1

Empty

Bucket 2

Empty

Bucket 3

Empty

Bucket 4

Empty

Bucket 5

Empty

Bucket 6

Empty

Total Items: 0 | Load Factor: 0.00 | Buckets: 7

How Hash Tables Work

Hash Function:

Converts a key into an array index. Our simple function sums character codes and uses modulo to ensure the result fits in our 7-bucket array.

Separate Chaining: When multiple keys hash to the same index (collision), we store them in a linked list within that bucket.

Performance:

Average: O(1) for insert, search, and delete operations.

Worst case: O(n) when all keys hash to the same bucket.

Load Factor: 0.00 (items/buckets). Keep below 0.75 for good performance.

Core Components of Hash Tables

Array of Buckets

The underlying storage mechanism:

• Fixed-size array of storage locations
• Each bucket can hold one or more key-value pairs
• Index computed by the hash function
• Direct access to any bucket in O(1) time

Hash Function

The key to efficient mapping:

• Converts keys into array indices
• Deterministic: same key → same index
• Uniform distribution minimizes collisions
• Fast computation for performance

Collision Handling

When multiple keys map to same index:

• Inevitable: more keys than buckets
• Separate Chaining: linked lists in buckets
• Open Addressing: find alternative slots
• Strategy choice affects performance

Hash Functions & Their Importance

What Makes a Good Hash Function?

• Uniform Distribution: Spreads keys evenly across buckets
• Deterministic: Always produces the same result for the same input
• Fast Computation: Should be computed quickly (O(1) or O(k))
• Avalanche Effect: Small input changes cause large output changes
• Low Collision Rate: Minimizes keys mapping to same index

Common Hash Functions:

Division Method: h(k) = k mod m

Multiplication Method: h(k) = ⌊m(kA mod 1)⌋

String Hashing: Σ(char_code * base^i) mod m

Load Factor & Performance

The load factor (α) is the ratio of stored elements to total buckets: α = n/m

Low Load Factor (α < 0.5)

Few collisions, excellent performance, but wastes memory

Moderate Load Factor (0.5 ≤ α ≤ 0.75)

Good balance of performance and memory usage (recommended)

High Load Factor (α > 0.75)

Many collisions, degraded performance, time to resize

Separate Chaining Collision Resolution

How Separate Chaining Works

Each bucket contains a linked list (or chain) of elements
Hash function determines which bucket to use
Multiple keys that hash to the same index are stored in the same bucket's chain
Search operation traverses the chain looking for the target key
Insertion adds to the beginning or end of the chain
Deletion removes from the chain and updates links

Advantages:

• Simple to implement and understand
• Handles any number of collisions
• Performance degrades gracefully
• Easy to delete elements

Memory Layout Example

Bucket Array:

0: ["cat":"🐱"] → ["hat":"🎩"] → null

1: null

2: ["dog":"🐶"] → null

3: ["apple":"🍎"] → ["table":"📱"] → null

4: null

5: ["fish":"🐟"] → null

6: ["banana":"🍌"] → null

Considerations:

• Extra memory overhead for pointers
• Cache performance may suffer with long chains
• Worst case: all keys hash to same bucket → O(n)

Complexity Analysis

Operation	Average Case	Worst Case	Space	Explanation
Search	O(1)	O(n)	O(1)	Hash to bucket + traverse chain (avg length = α)
Insertion	O(1)	O(n)	O(1)	Hash + check for duplicates + insert at head/tail
Deletion	O(1)	O(n)	O(1)	Hash + search in chain + remove node
Space Overhead	O(n)	O(n)	-	Bucket array + node pointers in chains

Performance Factors

Average O(1) assumes:

Good hash function with uniform distribution
Reasonable load factor (α ≤ 0.75)
Random key distribution

Worst O(n) occurs when:

All keys hash to the same bucket
Poor hash function or adversarial input
Extremely high load factor

Hash Table Implementation with Separate Chaining

Complete hash table implementations using arrays of linked lists for collision resolution. Study how the hash function maps keys to indices and how chaining handles multiple keys in the same bucket.

 class HashNode:
    """
    Node class for storing key-value pairs in the hash table
    Used in separate chaining collision resolution
    """
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None

class HashTable:
    """
    Hash Table implementation using separate chaining for collision resolution
    Uses an array of linked lists (buckets) to handle collisions
    """
    
    def __init__(self, initial_capacity=7):
        self.capacity = initial_capacity
        self.size = 0
        # Initialize array of empty buckets (linked lists)
        self.buckets = [None] * self.capacity
    
    def _hash(self, key):
        """
        Simple hash function: sum of character codes mod capacity
        Time Complexity: O(k) where k is the length of key
        """
        if isinstance(key, int):
            return key % self.capacity
        
        hash_value = 0
        for char in str(key):
            hash_value += ord(char)
        return hash_value % self.capacity
    
    def insert(self, key, value):
        """
        Insert a key-value pair into the hash table
        Time Complexity: O(1) average, O(n) worst case
        Space Complexity: O(1)
        """
        index = self._hash(key)
        
        # If bucket is empty, create first node
        if self.buckets[index] is None:
            self.buckets[index] = HashNode(key, value)
            self.size += 1
            return
        
        # Traverse the chain to check for existing key or find end
        current = self.buckets[index]
        while current:
            # Update existing key
            if current.key == key:
                current.value = value
                return
            
            # Reached end of chain, add new node
            if current.next is None:
                current.next = HashNode(key, value)
                self.size += 1
                return
            
            current = current.next
    
    def search(self, key):
        """
        Search for a key in the hash table
        Time Complexity: O(1) average, O(n) worst case
        Space Complexity: O(1)
        """
        index = self._hash(key)
        current = self.buckets[index]
        
        # Traverse the chain looking for the key
        while current:
            if current.key == key:
                return current.value
            current = current.next
        
        # Key not found
        raise KeyError(f"Key '{key}' not found in hash table")
    
    def delete(self, key):
        """
        Delete a key-value pair from the hash table
        Time Complexity: O(1) average, O(n) worst case
        Space Complexity: O(1)
        """
        index = self._hash(key)
        current = self.buckets[index]
        
        # Empty bucket
        if current is None:
            raise KeyError(f"Key '{key}' not found in hash table")
        
        # Key is in the first node
        if current.key == key:
            self.buckets[index] = current.next
            self.size -= 1
            return current.value
        
        # Search in the chain
        while current.next:
            if current.next.key == key:
                deleted_value = current.next.value
                current.next = current.next.next
                self.size -= 1
                return deleted_value
            current = current.next
        
        # Key not found
        raise KeyError(f"Key '{key}' not found in hash table")
    
    def get_load_factor(self):
        """Calculate the load factor (size / capacity)"""
        return self.size / self.capacity
    
    def display(self):
        """Display the hash table structure for debugging"""
        print(f"Hash Table (Size: {self.size}, Capacity: {self.capacity})")
        print(f"Load Factor: {self.get_load_factor():.2f}")
        
        for i, head in enumerate(self.buckets):
            print(f"Bucket {i}: ", end="")
            current = head
            while current:
                print(f"({current.key}: {current.value})", end="")
                if current.next:
                    print(" -> ", end="")
                current = current.next
            print(" -> None" if head else "Empty")
    
    def __len__(self):
        return self.size
    
    def __contains__(self, key):
        try:
            self.search(key)
            return True
        except KeyError:
            return False

# Example usage and testing
if __name__ == "__main__":
    # Create hash table
    ht = HashTable()
    
    # Test insertions
    ht.insert("apple", "A red fruit")
    ht.insert("banana", "A yellow fruit")
    ht.insert("cat", "A feline pet")
    ht.insert("dog", "A canine pet")
    ht.insert("elephant", "A large mammal")
    
    print("After insertions:")
    ht.display()
    
    # Test searches
    try:
        print(f"\nSearching for 'apple': {ht.search('apple')}")
        print(f"Searching for 'cat': {ht.search('cat')}")
        print(f"'dog' in hash table: {'dog' in ht}")
    except KeyError as e:
        print(f"Search error: {e}")
    
    # Test update
    ht.insert("apple", "A green fruit")  # Update existing key
    print(f"\nAfter updating 'apple': {ht.search('apple')}")
    
    # Test deletion
    try:
        deleted = ht.delete("banana")
        print(f"\nDeleted 'banana': {deleted}")
        ht.display()
    except KeyError as e:
        print(f"Delete error: {e}") 
 import java.util.*;

/**
 * Node class for storing key-value pairs in the hash table
 * Used in separate chaining collision resolution
 */
class HashNode<K, V> {
    K key;
    V value;
    HashNode<K, V> next;
    
    public HashNode(K key, V value) {
        this.key = key;
        this.value = value;
        this.next = null;
    }
}

/**
 * Hash Table implementation using separate chaining for collision resolution
 * Uses an array of linked lists (buckets) to handle collisions
 */
public class HashTable<K, V> {
    private HashNode<K, V>[] buckets;
    private int capacity;
    private int size;
    
    /**
     * Constructor with default capacity
     */
    @SuppressWarnings("unchecked")
    public HashTable(int initialCapacity) {
        this.capacity = initialCapacity;
        this.size = 0;
        // Initialize array of empty buckets
        this.buckets = new HashNode[capacity];
    }
    
    public HashTable() {
        this(7); // Default capacity
    }
    
    /**
     * Simple hash function: hashCode mod capacity
     * Time Complexity: O(1) for most objects
     */
    private int hash(K key) {
        if (key == null) {
            return 0;
        }
        return Math.abs(key.hashCode()) % capacity;
    }
    
    /**
     * Insert a key-value pair into the hash table
     * Time Complexity: O(1) average, O(n) worst case
     * Space Complexity: O(1)
     */
    public void insert(K key, V value) {
        int index = hash(key);
        
        // If bucket is empty, create first node
        if (buckets[index] == null) {
            buckets[index] = new HashNode<>(key, value);
            size++;
            return;
        }
        
        // Traverse the chain to check for existing key or find end
        HashNode<K, V> current = buckets[index];
        while (current != null) {
            // Update existing key
            if (current.key.equals(key)) {
                current.value = value;
                return;
            }
            
            // Reached end of chain, add new node
            if (current.next == null) {
                current.next = new HashNode<>(key, value);
                size++;
                return;
            }
            
            current = current.next;
        }
    }
    
    /**
     * Search for a key in the hash table
     * Time Complexity: O(1) average, O(n) worst case
     * Space Complexity: O(1)
     */
    public V search(K key) throws NoSuchElementException {
        int index = hash(key);
        HashNode<K, V> current = buckets[index];
        
        // Traverse the chain looking for the key
        while (current != null) {
            if (current.key.equals(key)) {
                return current.value;
            }
            current = current.next;
        }
        
        // Key not found
        throw new NoSuchElementException("Key '" + key + "' not found in hash table");
    }
    
    /**
     * Delete a key-value pair from the hash table
     * Time Complexity: O(1) average, O(n) worst case
     * Space Complexity: O(1)
     */
    public V delete(K key) throws NoSuchElementException {
        int index = hash(key);
        HashNode<K, V> current = buckets[index];
        
        // Empty bucket
        if (current == null) {
            throw new NoSuchElementException("Key '" + key + "' not found in hash table");
        }
        
        // Key is in the first node
        if (current.key.equals(key)) {
            buckets[index] = current.next;
            size--;
            return current.value;
        }
        
        // Search in the chain
        while (current.next != null) {
            if (current.next.key.equals(key)) {
                V deletedValue = current.next.value;
                current.next = current.next.next;
                size--;
                return deletedValue;
            }
            current = current.next;
        }
        
        // Key not found
        throw new NoSuchElementException("Key '" + key + "' not found in hash table");
    }
    
    /**
     * Check if the hash table contains a key
     */
    public boolean containsKey(K key) {
        try {
            search(key);
            return true;
        } catch (NoSuchElementException e) {
            return false;
        }
    }
    
    /**
     * Get the current size of the hash table
     */
    public int size() {
        return size;
    }
    
    /**
     * Check if the hash table is empty
     */
    public boolean isEmpty() {
        return size == 0;
    }
    
    /**
     * Calculate the load factor (size / capacity)
     */
    public double getLoadFactor() {
        return (double) size / capacity;
    }
    
    /**
     * Display the hash table structure for debugging
     */
    public void display() {
        System.out.println("Hash Table (Size: " + size + ", Capacity: " + capacity + ")");
        System.out.printf("Load Factor: %.2f%n", getLoadFactor());
        
        for (int i = 0; i < capacity; i++) {
            System.out.print("Bucket " + i + ": ");
            HashNode<K, V> current = buckets[i];
            
            if (current == null) {
                System.out.println("Empty");
            } else {
                while (current != null) {
                    System.out.print("(" + current.key + ": " + current.value + ")");
                    if (current.next != null) {
                        System.out.print(" -> ");
                    }
                    current = current.next;
                }
                System.out.println(" -> null");
            }
        }
    }
    
    // Example usage
    public static void main(String[] args) {
        HashTable<String, String> ht = new HashTable<>();
        
        // Test insertions
        ht.insert("apple", "A red fruit");
        ht.insert("banana", "A yellow fruit");
        ht.insert("cat", "A feline pet");
        ht.insert("dog", "A canine pet");
        ht.insert("elephant", "A large mammal");
        
        System.out.println("After insertions:");
        ht.display();
        
        // Test searches
        try {
            System.out.println("\nSearching for 'apple': " + ht.search("apple"));
            System.out.println("Searching for 'cat': " + ht.search("cat"));
            System.out.println("'dog' in hash table: " + ht.containsKey("dog"));
        } catch (NoSuchElementException e) {
            System.out.println("Search error: " + e.getMessage());
        }
        
        // Test update
        ht.insert("apple", "A green fruit"); // Update existing key
        System.out.println("\nAfter updating 'apple': " + ht.search("apple"));
        
        // Test deletion
        try {
            String deleted = ht.delete("banana");
            System.out.println("\nDeleted 'banana': " + deleted);
            ht.display();
        } catch (NoSuchElementException e) {
            System.out.println("Delete error: " + e.getMessage());
        }
    }
} 
 #include <iostream>
#include <iomanip>
#include <string>
#include <vector>
#include <functional>
using namespace std;

/**
 * Node class for storing key-value pairs in the hash table
 * Used in separate chaining collision resolution
 */
template <typename K, typename V>
struct HashNode {
    K key;
    V value;
    HashNode* next;
    
    HashNode(const K& k, const V& v) : key(k), value(v), next(nullptr) {}
};

/**
 * Hash Table implementation using separate chaining for collision resolution
 * Uses an array of linked lists (buckets) to handle collisions
 */
template <typename K, typename V>
class HashTable {
private:
    vector<HashNode<K, V>*> buckets;
    int capacity;
    int size;
    
    /**
     * Simple hash function using std::hash
     * Time Complexity: O(1) for most types
     */
    int hash(const K& key) {
        return std::hash<K>{}(key) % capacity;
    }
    
public:
    /**
     * Constructor with specified capacity
     */
    HashTable(int initialCapacity = 7) : capacity(initialCapacity), size(0) {
        buckets.resize(capacity, nullptr);
    }
    
    /**
     * Destructor to free allocated memory
     */
    ~HashTable() {
        clear();
    }
    
    /**
     * Copy constructor
     */
    HashTable(const HashTable& other) : capacity(other.capacity), size(0) {
        buckets.resize(capacity, nullptr);
        
        for (int i = 0; i < capacity; i++) {
            HashNode<K, V>* current = other.buckets[i];
            while (current != nullptr) {
                insert(current->key, current->value);
                current = current->next;
            }
        }
    }
    
    /**
     * Assignment operator
     */
    HashTable& operator=(const HashTable& other) {
        if (this != &other) {
            clear();
            capacity = other.capacity;
            size = 0;
            buckets.resize(capacity, nullptr);
            
            for (int i = 0; i < capacity; i++) {
                HashNode<K, V>* current = other.buckets[i];
                while (current != nullptr) {
                    insert(current->key, current->value);
                    current = current->next;
                }
            }
        }
        return *this;
    }
    
    /**
     * Insert a key-value pair into the hash table
     * Time Complexity: O(1) average, O(n) worst case
     * Space Complexity: O(1)
     */
    void insert(const K& key, const V& value) {
        int index = hash(key);
        
        // If bucket is empty, create first node
        if (buckets[index] == nullptr) {
            buckets[index] = new HashNode<K, V>(key, value);
            size++;
            return;
        }
        
        // Traverse the chain to check for existing key or find end
        HashNode<K, V>* current = buckets[index];
        while (current != nullptr) {
            // Update existing key
            if (current->key == key) {
                current->value = value;
                return;
            }
            
            // Reached end of chain, add new node
            if (current->next == nullptr) {
                current->next = new HashNode<K, V>(key, value);
                size++;
                return;
            }
            
            current = current->next;
        }
    }
    
    /**
     * Search for a key in the hash table
     * Time Complexity: O(1) average, O(n) worst case
     * Space Complexity: O(1)
     */
    V search(const K& key) {
        int index = hash(key);
        HashNode<K, V>* current = buckets[index];
        
        // Traverse the chain looking for the key
        while (current != nullptr) {
            if (current->key == key) {
                return current->value;
            }
            current = current->next;
        }
        
        // Key not found - throw exception
        throw runtime_error("Key not found in hash table");
    }
    
    /**
     * Delete a key-value pair from the hash table
     * Time Complexity: O(1) average, O(n) worst case
     * Space Complexity: O(1)
     */
    V deleteKey(const K& key) {
        int index = hash(key);
        HashNode<K, V>* current = buckets[index];
        
        // Empty bucket
        if (current == nullptr) {
            throw runtime_error("Key not found in hash table");
        }
        
        // Key is in the first node
        if (current->key == key) {
            V deletedValue = current->value;
            buckets[index] = current->next;
            delete current;
            size--;
            return deletedValue;
        }
        
        // Search in the chain
        while (current->next != nullptr) {
            if (current->next->key == key) {
                V deletedValue = current->next->value;
                HashNode<K, V>* nodeToDelete = current->next;
                current->next = current->next->next;
                delete nodeToDelete;
                size--;
                return deletedValue;
            }
            current = current->next;
        }
        
        // Key not found
        throw runtime_error("Key not found in hash table");
    }
    
    /**
     * Check if the hash table contains a key
     */
    bool containsKey(const K& key) {
        try {
            search(key);
            return true;
        } catch (const runtime_error&) {
            return false;
        }
    }
    
    /**
     * Get the current size of the hash table
     */
    int getSize() const {
        return size;
    }
    
    /**
     * Check if the hash table is empty
     */
    bool isEmpty() const {
        return size == 0;
    }
    
    /**
     * Calculate the load factor (size / capacity)
     */
    double getLoadFactor() const {
        return static_cast<double>(size) / capacity;
    }
    
    /**
     * Clear all elements from the hash table
     */
    void clear() {
        for (int i = 0; i < capacity; i++) {
            HashNode<K, V>* current = buckets[i];
            while (current != nullptr) {
                HashNode<K, V>* next = current->next;
                delete current;
                current = next;
            }
            buckets[i] = nullptr;
        }
        size = 0;
    }
    
    /**
     * Display the hash table structure for debugging
     */
    void display() const {
        cout << "Hash Table (Size: " << size << ", Capacity: " << capacity << ")" << endl;
        cout << "Load Factor: " << fixed << setprecision(2) << getLoadFactor() << endl;
        
        for (int i = 0; i < capacity; i++) {
            cout << "Bucket " << i << ": ";
            HashNode<K, V>* current = buckets[i];
            
            if (current == nullptr) {
                cout << "Empty" << endl;
            } else {
                while (current != nullptr) {
                    cout << "(" << current->key << ": " << current->value << ")";
                    if (current->next != nullptr) {
                        cout << " -> ";
                    }
                    current = current->next;
                }
                cout << " -> null" << endl;
            }
        }
    }
};

// Example usage
int main() {
    HashTable<string, string> ht;
    
    // Test insertions
    ht.insert("apple", "A red fruit");
    ht.insert("banana", "A yellow fruit");
    ht.insert("cat", "A feline pet");
    ht.insert("dog", "A canine pet");
    ht.insert("elephant", "A large mammal");
    
    cout << "After insertions:" << endl;
    ht.display();
    
    // Test searches
    try {
        cout << "\nSearching for 'apple': " << ht.search("apple") << endl;
        cout << "Searching for 'cat': " << ht.search("cat") << endl;
        cout << "'dog' in hash table: " << (ht.containsKey("dog") ? "true" : "false") << endl;
    } catch (const runtime_error& e) {
        cout << "Search error: " << e.what() << endl;
    }
    
    // Test update
    ht.insert("apple", "A green fruit"); // Update existing key
    cout << "\nAfter updating 'apple': " << ht.search("apple") << endl;
    
    // Test deletion
    try {
        string deleted = ht.deleteKey("banana");
        cout << "\nDeleted 'banana': " << deleted << endl;
        ht.display();
    } catch (const runtime_error& e) {
        cout << "Delete error: " << e.what() << endl;
    }
    
    return 0;
} 

Real-World Applications & Advanced Topics

Database Indexing

Hash indexes provide O(1) lookups for exact-match queries in database systems.

Caching Systems

In-memory caches like Redis use hash tables for fast key-value storage and retrieval.

Programming Languages

Built-in dictionaries (Python), objects (JavaScript), and maps (Java) are hash table implementations.

Symbol Tables

Compilers use hash tables to store variable names, function definitions, and scoping information.

Distributed Systems

Consistent hashing distributes data across nodes in distributed databases and load balancers.

Cryptography

Cryptographic hash functions provide data integrity, digital signatures, and blockchain proof-of-work.

Advanced Hash Table Concepts

Dynamic Resizing: Automatically resize the table when load factor exceeds threshold, rehashing all elements.

Robin Hood Hashing: Open addressing variant that minimizes variance in probe distances.

Cuckoo Hashing: Guarantees O(1) worst-case lookup time using two hash functions.

Bloom Filters: Space-efficient probabilistic data structure for set membership testing.

Consistent Hashing: Distributes keys across changing sets of servers with minimal rehashing.

Perfect Hashing: Specialized hash functions that guarantee no collisions for static key sets.