Insert and Resize Logic


title: "HashMap Insert and Resize Logic in Rust"
meta_description: "Implement HashMap insertion with Robin Hood hashing, load factor management, and automatic resize/rehashing in Rust."
keywords: ["rust hashmap insert", "robin hood insertion", "hashmap resize", "rehashing", "load factor"]

📈 Reinventing from Scratch: Building HashMap

Chapter 3 — “Insert and Resize Logic”

“Insertion is easy. Growing the table without breaking everything? That’s the real challenge.”


🧭 3.1. The Insertion Algorithm

Let’s implement insert() with Robin Hood hashing:

impl<K, V, S> MyHashMap<K, V, S>
where
    K: Hash + Eq,
    S: BuildHasher,
{
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        // Check if we need to grow first
        if self.should_grow() {
            self.grow();
        }
        
        let hash = self.hash_key(&key);
        self.insert_entry(hash, key, value)
    }
}

Three steps:

  1. Check if table is too full
  2. Grow if needed
  3. Actually insert the entry

🎯 3.2. The Core Insert Logic

fn insert_entry(&mut self, hash: u64, mut key: K, mut value: V) -> Option<V> {
    let capacity = self.capacity();
    let mut ideal = self.ideal_index(hash);
    let mut current_distance = 0;
    let mut current_hash = hash;
    
    loop {
        let index = (ideal + current_distance) % capacity;
        
        // Inspect the slot through a shared borrow first, so we can still
        // call `&self` helpers like `probe_distance` without a borrow conflict.
        let (same_key, existing_distance) = match &self.entries[index] {
            // Empty slot — insert here!
            None => {
                self.entries[index] = Some(Entry {
                    key,
                    value,
                    hash: current_hash,
                });
                self.len += 1;
                return None;
            }
            Some(existing) => (
                existing.key == key,
                self.probe_distance(existing.hash, index),
            ),
        };
        
        let existing = self.entries[index].as_mut().unwrap();
        
        // Same key? Update value!
        if same_key {
            return Some(std::mem::replace(&mut existing.value, value));
        }
        
        // Robin Hood: if we're further from home than the resident,
        // we're poorer — steal this spot and carry the resident onward.
        if current_distance > existing_distance {
            std::mem::swap(&mut key, &mut existing.key);
            std::mem::swap(&mut value, &mut existing.value);
            std::mem::swap(&mut current_hash, &mut existing.hash);
            current_distance = existing_distance;
            ideal = self.ideal_index(current_hash);
        }
        
        // Continue probing
        current_distance += 1;
    }
}

🎭 3.3. Robin Hood in Action (Visual Example)

Insert “fox” (hash→2, ideal: 2):

Initial:
[0]: Empty
[1]: Empty  
[2]: ("dog", 20)  dist: 0  ← ideal spot
[3]: ("cat", 10)  dist: 1  ← pushed from [2]
[4]: Empty

Step 1: "fox" tries [2], distance would be 0
        "dog" has distance 0 → same, no swap, continue

Step 2: "fox" tries [3], distance would be 1
        "cat" has distance 1 → same, no swap, continue

Step 3: "fox" tries [4], distance would be 2
        Empty! Insert here.

Result:
[2]: ("dog", 20)  dist: 0
[3]: ("cat", 10)  dist: 1
[4]: ("fox", 50)  dist: 2  ← inserted

Now insert “bat” (hash→2, ideal: 2):

Step 1: "bat" tries [2], distance would be 0
        "dog" has distance 0 → no swap

Step 2: "bat" tries [3], distance would be 1
        "cat" has distance 1 → no swap

Step 3: "bat" tries [4], distance would be 2
        "fox" has distance 2 → no swap

Step 4: "bat" tries [5], distance would be 3
        Empty! Insert here.

Result:
[2]: ("dog", 20)  dist: 0
[3]: ("cat", 10)  dist: 1
[4]: ("fox", 50)  dist: 2
[5]: ("bat", 30)  dist: 3  ← inserted
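
The walkthrough follows directly from insert_entry’s comparison rule. The sketch below is a throwaway simulation, not the chapter’s MyHashMap: each slot stores `(key, ideal_slot)`, and “hashing” is replaced by passing the ideal slot directly, so the example layouts can be checked.

```rust
// A throwaway Robin Hood table: each slot stores (key, ideal_slot).
// This is a sketch, not MyHashMap — "hashing" is just the ideal slot.
fn rh_insert(table: &mut [Option<(&'static str, usize)>], mut key: &'static str, mut ideal: usize) {
    let cap = table.len();
    let mut dist = 0;
    loop {
        let index = (ideal + dist) % cap;
        match table[index] {
            None => {
                table[index] = Some((key, ideal));
                return;
            }
            Some((resident_key, resident_ideal)) => {
                let resident_dist = (index + cap - resident_ideal) % cap;
                if dist > resident_dist {
                    // Incoming key is poorer: it takes the slot,
                    // and the resident continues probing.
                    table[index] = Some((key, ideal));
                    key = resident_key;
                    ideal = resident_ideal;
                    dist = resident_dist;
                }
                dist += 1;
            }
        }
    }
}

fn main() {
    // The walkthrough above: four keys all with ideal slot 2 → all ties,
    // so everyone lines up in arrival order.
    let mut table = [None; 8];
    for key in ["dog", "cat", "fox", "bat"] {
        rh_insert(&mut table, key, 2);
    }
    let keys: Vec<_> = table.iter().map(|s| s.map(|(k, _)| k)).collect();
    assert_eq!(&keys[2..6], &[Some("dog"), Some("cat"), Some("fox"), Some("bat")]);

    // An actual swap: "cat" sits at its ideal slot [3]; "fox" (ideal 2)
    // arrives at [3] poorer (dist 1 vs 0), steals it, and "cat" moves to [4].
    let mut table = [None; 8];
    rh_insert(&mut table, "dog", 2);
    rh_insert(&mut table, "cat", 3);
    rh_insert(&mut table, "fox", 2);
    let keys: Vec<_> = table.iter().map(|s| s.map(|(k, _)| k)).collect();
    assert_eq!(&keys[2..5], &[Some("dog"), Some("fox"), Some("cat")]);
    println!("robin hood layouts verified");
}
```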

But now “elk” (hash→3, ideal: 3):

Step 1: "elk" tries [3], distance would be 0
        "cat" has distance 1
        → "elk" is RICHER (closer to home). Robin Hood never
          steals from the poor — no swap, continue.

Step 2: "elk" tries [4], distance would be 1
        "fox" has distance 2 → still richer, no swap

Step 3: "elk" tries [5], distance would be 2
        "bat" has distance 3 → still richer, no swap

Step 4: "elk" tries [6], distance would be 3
        Empty! Insert here.

Result:
[2]: ("dog", 20)  dist: 0
[3]: ("cat", 10)  dist: 1
[4]: ("fox", 50)  dist: 2
[5]: ("bat", 30)  dist: 3
[6]: ("elk", 40)  dist: 3

So when does a swap happen? When the incoming key is POORER than a resident. Suppose instead that "cat" had hashed straight to [3] (distance 0), and we insert "fox" (hash→2, ideal: 2):

Step 1: "fox" tries [2], distance would be 0
        "dog" has distance 0 → tie, no swap, continue

Step 2: "fox" tries [3], distance would be 1
        "cat" has distance 0
        → "fox" is POORER! Steal this spot!
        "fox" lands in [3]; now we carry "cat" onward.

Step 3: "cat" tries [4], distance would be 1
        Empty! Insert here.

Result:
[2]: ("dog", 20)  dist: 0
[3]: ("fox", 50)  dist: 1
[4]: ("cat", 10)  dist: 1

Plain linear probing would have left "cat" at distance 0 and pushed "fox" out to [4] at distance 2. Robin Hood trades "cat"'s perfect spot for a layout where no key is worse than distance 1 — the probe distances are more evenly spread.


📏 3.4. Load Factor

Load Factor = len / capacity

let load_factor = self.len as f64 / self.capacity() as f64;

Why It Matters

  • Load = 0.5 (50%) → Plenty of empty slots, fast lookups
  • Load = 0.75 (75%) → Getting crowded, some probing
  • Load = 0.9 (90%) → Very crowded, long probe sequences
  • Load = 1.0 (100%) → FULL! Insertion impossible!

The Sweet Spot

Most HashMaps use 0.75 (75%) as the resize threshold:

fn should_grow(&self) -> bool {
    self.len * 4 >= self.capacity() * 3  // len/cap >= 0.75, in integer math
}

Why 0.75?

  • Performance still good
  • Memory not too wasted
  • Industry standard (Java, Python, etc.)
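
The integer comparison avoids floating point entirely. This small check (re-expressing should_grow as a free function for the demo) confirms it agrees with the 0.75 threshold:

```rust
// `len * 4 >= capacity * 3` is exactly `len / capacity >= 0.75`,
// without any floating-point math. A free-function version for testing:
fn should_grow(len: usize, capacity: usize) -> bool {
    len * 4 >= capacity * 3
}

fn main() {
    assert!(!should_grow(11, 16)); // 11/16 = 0.6875 → keep going
    assert!(should_grow(12, 16));  // 12/16 = 0.75   → grow

    // Agrees with the float threshold for every fill level of a 64-slot table.
    for len in 0..=64 {
        let float_says = len as f64 / 64.0 >= 0.75;
        assert_eq!(should_grow(len, 64), float_says);
    }
    println!("integer check matches the 0.75 threshold");
}
```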

🔄 3.5. Resizing and Rehashing

When the table gets too full, we must grow it:

fn grow(&mut self) {
    let new_capacity = (self.capacity() * 2).max(16);
    let mut new_map = Self::with_capacity_and_hasher(
        new_capacity,
        self.hasher_builder.clone()  // requires S: Clone
    );
    
    // Rehash all existing entries
    for entry in self.entries.drain(..) {
        if let Some(Entry { key, value, .. }) = entry {
            new_map.insert(key, value);
        }
    }
    
    // Replace self with new map
    *self = new_map;
}

The Rehashing Problem

Every entry must be reinserted because:

Old capacity: 8
hash("dog") % 8 = 2

New capacity: 16  
hash("dog") % 16 = 10  ← Different index!

The modulo operation changes, so all indices change!
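
The real hash of "dog" depends on the hasher, so the sketch below uses 42 as a stand-in hash value — it happens to reproduce the indices above:

```rust
// Why every entry must move: the index depends on the capacity.
// 42 is a stand-in hash value (real hashes come from the hasher).
fn main() {
    let hash: u64 = 42;
    assert_eq!(hash % 8, 2);   // index 2 in a table of 8...
    assert_eq!(hash % 16, 10); // ...but index 10 in a table of 16
}
```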


⚡ 3.6. Optimizing Resize

Strategy 1: In-Place Resize (Tricky!)

// Allocate new array
// Move entries one by one
// But be careful with ownership!

Problem: Complex, error-prone, not much faster.

Strategy 2: Allocate-and-Swap (Simple!)

fn grow(&mut self) {
    // Create new map — keeping the SAME hasher builder! The stored
    // hashes were computed with it, and lookups after the resize
    // will use it again. A fresh random hasher would break lookups.
    let mut new_map = Self::with_capacity_and_hasher(
        (self.capacity() * 2).max(16),
        self.hasher_builder.clone()
    );
    
    // Move all entries (no recomputing hashes!)
    for entry in self.entries.drain(..) {
        if let Some(entry) = entry {
            // We already have the hash!
            new_map.insert_entry(entry.hash, entry.key, entry.value);
        }
    }
    
    *self = new_map;
}

Why this works:

  • We stored the hash in each entry
  • No need to recompute
  • Just reinsert at new indices

📐 3.7. Growth Strategy

When to grow:

  • Load factor > 0.75

How much to grow:

new_capacity = max(old_capacity * 2, 16)
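
As a one-liner (`next_capacity` is a hypothetical helper name, not part of MyHashMap):

```rust
// Growth schedule: double, but never start smaller than 16.
fn next_capacity(old: usize) -> usize {
    (old * 2).max(16)
}

fn main() {
    assert_eq!(next_capacity(0), 16);  // empty map grows straight to 16
    assert_eq!(next_capacity(16), 32);
    assert_eq!(next_capacity(32), 64);
}
```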

Amortized Analysis

Resizing is O(n), but it happens rarely:

Insertions: 1, 2, 3, ..., 12, 13, ..., 24, 25, ...
Resizes:               ^resize      ^resize

Amortized cost per insert: O(1)

Because we double each time:

  • Copy n items during resize
  • But get n more insertions before next resize
  • Cost “spreads out” over many insertions
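
A quick simulation makes the amortization concrete. Assuming the 0.75 trigger and doubling from capacity 16, it counts every entry moved during resizes across many inserts:

```rust
// Count entry moves over n inserts, with resizes at 0.75 load and doubling.
fn main() {
    let (mut capacity, mut len, mut moves) = (16usize, 0usize, 0usize);
    let n = 10_000;
    for _ in 0..n {
        if len * 4 >= capacity * 3 {
            moves += len;   // every existing entry is reinserted on resize
            capacity *= 2;
        }
        len += 1;
    }
    // Total moves stay within a small constant factor of n: amortized O(1).
    assert!(moves < 3 * n);
    println!("{} inserts, {} moves, {:.2} moves/insert", n, moves, moves as f64 / n as f64);
}
```

Resizes fire at len 12, 24, 48, …, 6144, so the geometric series of moves sums to only a little more than one move per insert.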

🧮 3.8. Handling Updates

What if the key already exists?

map.insert("apple", 10);
map.insert("apple", 20);  // Update!

Our insert logic already handles this:

if existing.key == key {
    // Same key! Just update the value
    return Some(std::mem::replace(&mut existing.value, value));
}

Returns: The old value (if any).
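
This is the same contract as the standard library’s HashMap, which we can use to sanity-check the expected behavior:

```rust
use std::collections::HashMap;

// insert returns None for a fresh key, Some(old_value) on update.
fn main() {
    let mut map = HashMap::new();
    assert_eq!(map.insert("apple", 10), None);     // fresh key
    assert_eq!(map.insert("apple", 20), Some(10)); // update: old value comes back
    assert_eq!(map["apple"], 20);
}
```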


🧩 3.9. Edge Cases

Empty Map

capacity: 0
insert("x", 1) → grow to 16, then insert

Single-Element Map

capacity: 16, len: 1
load: 1/16 = 0.0625 (no resize needed)

Exactly at Threshold

capacity: 16, len: 12
load: 12/16 = 0.75 (triggers resize!)

🧠 3.10. Chapter Takeaways

Concept         | Purpose
----------------|---------------------------------------------------
Robin Hood swap | Steal spots from “rich” entries
Load factor     | Measure of how full the table is
Rehashing       | All entries move when capacity changes
Amortized O(1)  | Average cost per insert despite occasional resize
Store hash      | Avoid recomputing during resize

🚀 3.11. What’s Next

We can insert and grow.

But what about removing entries?

Deletion is trickier than it looks — you can’t just set a bucket to Empty, or you’ll break the probe chain!

Next chapter: Deletion strategies and tombstones.


💡 Exercises

  1. Implement insert without Robin Hood (plain linear probing)
  2. Measure max probe distance as load factor increases
  3. Implement resize and verify all keys are still findable
  4. Test inserting duplicate keys returns old values
  5. Benchmark resize overhead vs per-insert cost

Next: Chapter 4 — Delete and Tombstones 🗑️