Hash Tables
Hash tables are a data structure that stores key-value pairs. They are used to implement associative arrays, sets, and caches. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
graph LR A("Key: 'apple'") -->|Hash Function| B("Index: 2") B -->|Maps to| I("Value: 'green'") C("Key: 'banana'") -->|Hash Function| D("Index: 4") D -->|Maps to| J("Value: 'yellow'") E("Key: 'grape'") -->|Hash Function| F("Index: 2") F -->|Maps to| K("Value: 'purple', Collision Resolution") G("Key: 'orange'") -->|Hash Function| H("Index: 3") H -->|Maps to| L("Value: 'orange'")
Overview of Hash Tables
-
Key-Value Pairs: Hash tables store data in a key-value pair format. Each value is associated with a unique key, which is used to access the corresponding value.
-
Hash Function: A hash function is used to map keys to positions (indices) in the hash table. An ideal hash function is fast, distributes keys uniformly across the table, and minimizes collisions (different keys hashing to the same index).
-
Handling Collisions: Common methods to handle collisions include:
- Chaining: Each table index points to a list of entries that map to the same index.
- Open Addressing: All elements are stored within the array itself. Popular methods include linear probing, quadratic probing, and double hashing.
- Dynamic Resizing: To maintain efficient operations, hash tables may resize and rehash all entries, especially when the load factor (number of entries/table size) grows too large.
Common Operations and Their Big O Notation
Operation | Average Case | Worst Case | Comments |
---|---|---|---|
Search | O(1) | O(n) | Worst-case occurs with many collisions. |
Insertion | O(1) | O(n) | Same as search; affected by collisions and load factor. |
Deletion | O(1) | O(n) | Again, collisions can lead to a linear time complexity. |
Access | O(1) | O(n) | Direct access by key is typically constant time. |
Advantages of Hash Tables
- Efficiency: Offers constant time complexity for addition, deletion, and search operations on average.
- Direct Access: Values can be accessed directly using keys, making operations fast and efficient.
Disadvantages of Hash Tables
- Collision Handling: Requires extra handling, and performance can degrade with poor collision resolution or a high load factor.
- Order: Hash tables do not maintain the order of elements.
- Memory Usage: Some hash table implementations can be more memory-intensive than other data structures.
Applications
- Implementing associative arrays or maps.
- Database indexing.
- Caching (e.g., in web browsers).
- Tracking unique items.
Code representation
// HashTable represents a basic hash table.
type HashTable struct {
table []KeyValue
size int
}
// KeyValue represents a key-value pair in the hash table.
type KeyValue struct {
key string
value interface{}
}
// NewHashTable creates a new empty hash table with the specified size.
func NewHashTable(size int) *HashTable {
return &HashTable{
table: make([]KeyValue, size),
size: size,
}
}
// HashFunction calculates the hash code for a given key.
func (ht *HashTable) HashFunction(key string) int {
// Simple hash function (modulus)
hash := 0
for _, char := range key {
hash += int(char)
}
return hash % ht.size
}
// Put inserts a key-value pair into the hash table.
func (ht *HashTable) Put(key string, value interface{}) {
index := ht.HashFunction(key)
for ht.table[index].key != "" {
// Linear probing to handle collisions
index = (index + 1) % ht.size
}
ht.table[index] = KeyValue{key, value}
}
// Get retrieves the value associated with a key from the hash table.
func (ht *HashTable) Get(key string) (interface{}, bool) {
index := ht.HashFunction(key)
for ht.table[index].key != "" {
if ht.table[index].key == key {
return ht.table[index].value, true
}
index = (index + 1) % ht.size
}
return nil, false
}
The above code is a simple implementation of a hash table in Go. It uses a basic hash function (modulus) and linear probing to handle collisions.