Delete and Tombstones


---
title: "HashMap Deletion and Tombstones in Rust"
meta_description: "Learn how to implement deletion in open-addressed HashMaps using tombstones and backward shift deletion in Rust."
keywords: ["hashmap deletion", "tombstones", "backward shift", "open addressing deletion", "rust hashmap remove"]
---

🗑️ Reinventing from Scratch: Building HashMap

Chapter 4 — “Delete and Tombstones”

“Deleting is easy. Deleting correctly without breaking future lookups? That’s an art.”


🧭 4.1. The Deletion Problem

Let’s say we have this table:

[2]: ("dog", 20)  ← hash→2, ideal spot
[3]: ("cat", 10)  ← hash→2, probed from [2]
[4]: ("fox", 50)  ← hash→2, probed from [3]

Now we delete “cat”:

map.remove("cat");

Naive approach: Just set [3] to Empty:

[2]: ("dog", 20)
[3]: Empty        ← ❌ DANGER!
[4]: ("fox", 50)

Problem: Now when we search for “fox”:

// hash("fox") % capacity = 2
// Check [2]: "dog" != "fox", continue
// Check [3]: Empty ← STOP! "fox" not found!

We broke the probe chain! “fox” is now invisible!
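The break is easy to reproduce with a toy table. In this sketch, the hypothetical `toy_hash` forces every key into slot 2 (an illustration-only assumption, so "dog", "cat", and "fox" form one probe chain exactly as in the diagram):

```rust
// A toy open-addressed table, just to reproduce the diagram above.
// `toy_hash` forcing every key into slot 2 is an illustration-only assumption.
fn toy_hash(_key: &str) -> usize {
    2
}

fn toy_insert(table: &mut Vec<Option<(String, i32)>>, key: &str, value: i32) {
    let cap = table.len();
    let mut i = toy_hash(key) % cap;
    // Linear probing: walk forward until a free slot
    while table[i].is_some() {
        i = (i + 1) % cap;
    }
    table[i] = Some((key.to_string(), value));
}

fn toy_get(table: &[Option<(String, i32)>], key: &str) -> Option<i32> {
    let cap = table.len();
    let mut i = toy_hash(key) % cap;
    loop {
        match &table[i] {
            None => return None, // an empty slot ends the probe chain
            Some((k, v)) if k == key => return Some(*v),
            _ => i = (i + 1) % cap, // different key: keep probing
        }
    }
}
```

Insert the three keys, then naively clear slot `[3]`: `toy_get` for "fox" stops at the empty slot and reports the key missing, exactly as traced above.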


💀 4.2. Solution 1: Tombstones

Instead of Empty, mark it as Tombstone (deleted but keep probing):

enum Slot<K, V> {
    Empty,
    Occupied(Entry<K, V>),
    Tombstone,  // Was occupied, now deleted
}

After deleting “cat”:

[2]: ("dog", 20)
[3]: Tombstone    ← Keep probing past this!
[4]: ("fox", 50)

Now searching for “fox”:

// Check [2]: "dog" != "fox", continue
// Check [3]: Tombstone → keep probing!
// Check [4]: "fox" ← Found it!
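That lookup rule can be sketched as code. This is a toy, monomorphic stand-in for the chapter's generic table; `ideal` is `hash % capacity`, precomputed by the caller to keep the sketch hashing-free:

```rust
// Toy, monomorphic stand-in for the chapter's generic Slot<K, V>
#[derive(Clone, Debug, PartialEq)]
enum Slot {
    Empty,
    Occupied(String, i32),
    Tombstone,
}

// Lookup that probes past tombstones; `ideal` = hash % capacity
fn probe_get(slots: &[Slot], ideal: usize, key: &str) -> Option<i32> {
    let cap = slots.len();
    let mut i = ideal;
    for _ in 0..cap {
        // visit each slot at most once
        match &slots[i] {
            Slot::Empty => return None,              // chain ends: not present
            Slot::Tombstone => {}                    // deleted: keep probing
            Slot::Occupied(k, v) if k == key => return Some(*v),
            Slot::Occupied(..) => {}                 // other key: keep probing
        }
        i = (i + 1) % cap;
    }
    None
}
```

With "dog" at `[2]`, a tombstone at `[3]`, and "fox" at `[4]`, the probe for "fox" steps straight through the tombstone and succeeds.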

🔍 4.3. Implementing Remove with Tombstones

pub fn remove(&mut self, key: &K) -> Option<V> {
    let hash = self.hash_key(key);
    let mut index = self.ideal_index(hash);
    let mut distance = 0;
    
    loop {
        // A table with no Empty slots (everything occupied or tombstoned)
        // would otherwise probe forever; stop after one full pass.
        if distance >= self.capacity() {
            return None;
        }
        
        match &self.entries[index] {
            Empty => {
                // Hit a never-used slot: the key cannot be further along
                return None;
            }
            
            Tombstone => {
                // Deleted slot: the key may still be past it, keep probing
                index = (index + 1) % self.capacity();
                distance += 1;
            }
            
            Occupied(entry) if &entry.key == key => {
                // Found it! Swap in a tombstone and take the value out
                let value = match std::mem::replace(&mut self.entries[index], Tombstone) {
                    Occupied(entry) => entry.value,
                    _ => unreachable!(),
                };
                
                self.len -= 1;
                self.tombstone_count += 1;
                
                return Some(value);
            }
            
            Occupied(_) => {
                // Different key, keep probing
                index = (index + 1) % self.capacity();
                distance += 1;
            }
        }
    }
}

🧹 4.4. The Tombstone Problem

Tombstones accumulate:

After many inserts and deletes:
[T, _, T, X, T, T, X, _]  // 4 tombstones!

Issues:

  1. Slower lookups — Must probe past tombstones
  2. Wasted space — Tombstones count toward load factor
  3. Never reclaimed — Table grows but doesn’t shrink

Solution: Rebuild When Tombstone Count is High

fn should_rebuild(&self) -> bool {
    // Rebuild if more than 25% of capacity is tombstones
    self.tombstone_count * 4 > self.capacity()
}

// The probing logic from 4.3 moves into `remove_internal`;
// the public `remove` wraps it and triggers rebuilds as needed.
pub fn remove(&mut self, key: &K) -> Option<V> {
    let result = self.remove_internal(key);
    
    if self.should_rebuild() {
        self.rebuild();
    }
    
    result
}

fn rebuild(&mut self) {
    // Rehash into same capacity (just removes tombstones)
    let mut new_map = Self::with_capacity(self.capacity());
    
    for entry in self.entries.drain(..) {
        if let Occupied(e) = entry {
            new_map.insert(e.key, e.value);
        }
    }
    
    *self = new_map;
}
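The effect of a rebuild is easy to see on the toy slot array from earlier. This sketch reinserts only the live entries into a fresh table of the same capacity; the fixed `toy_hash` (everything maps to slot 2) and the monomorphic `Slot` are illustration-only assumptions:

```rust
// Toy, monomorphic stand-in for the chapter's generic Slot<K, V>
#[derive(Clone, Debug, PartialEq)]
enum Slot {
    Empty,
    Occupied(String, i32),
    Tombstone,
}

// Illustration-only hash: every key maps to slot 2
fn toy_hash(_key: &str) -> usize {
    2
}

// Rebuild: reinsert live entries into a fresh table; tombstones are dropped
fn rebuild(old: Vec<Slot>) -> Vec<Slot> {
    let cap = old.len();
    let mut fresh = vec![Slot::Empty; cap];
    for slot in old {
        if let Slot::Occupied(key, value) = slot {
            // Plain linear-probing reinsert
            let mut i = toy_hash(&key) % cap;
            while fresh[i] != Slot::Empty {
                i = (i + 1) % cap;
            }
            fresh[i] = Slot::Occupied(key, value);
        }
    }
    fresh
}
```

A welcome side effect: entries that were pushed out by now-deleted keys slide back toward their ideal slots, so probe chains shorten as well.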

🔄 4.5. Solution 2: Backward Shift Deletion

Instead of tombstones, shift entries backward to fill the gap:

Before delete "cat":
[2]: ("dog", 20)  dist: 0
[3]: ("cat", 10)  dist: 1  ← delete this
[4]: ("fox", 50)  dist: 2

After delete (with backward shift):
[2]: ("dog", 20)  dist: 0
[3]: ("fox", 50)  dist: 1  ← shifted back!
[4]: Empty

The Algorithm

fn remove_with_shift(&mut self, key: &K) -> Option<V> {
    // Find the entry
    let index = self.find_index(key)?;
    
    // Extract the value
    let value = match std::mem::replace(&mut self.entries[index], Empty) {
        Occupied(entry) => entry.value,
        _ => unreachable!(), // find_index only returns occupied slots
    };
    
    self.len -= 1;
    
    // Shift followers backward until the chain ends
    let mut current = index;
    loop {
        let next = (current + 1) % self.capacity();
        
        match &self.entries[next] {
            Empty | Tombstone => {
                // Nothing more to shift
                break;
            }
            
            Occupied(entry) => {
                // An entry may move back only if it is displaced from its
                // ideal slot; probe distance 0 means it is already home.
                // (A plain `ideal <= current` comparison breaks when the
                // chain wraps around the end of the table.)
                let ideal = self.ideal_index(entry.hash);
                let probe_distance = (next + self.capacity() - ideal) % self.capacity();
                
                if probe_distance > 0 {
                    // Displaced: shift it one slot back
                    self.entries[current] = std::mem::replace(&mut self.entries[next], Empty);
                    current = next;
                } else {
                    // At its ideal slot: the chain ends here
                    break;
                }
            }
        }
    }
    
    Some(value)
}
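The subtle part of backward shift is deciding whether the entry in the next slot may move. A wraparound-safe way to express "can this entry shift back?" is via probe distance: how far a slot is from the entry's ideal slot, going forward around the ring. Distance 0 means the entry sits in its ideal slot and must never move back:

```rust
// Wraparound-safe probe distance: how far `index` is past `ideal`,
// walking forward around the table.
fn probe_distance(ideal: usize, index: usize, capacity: usize) -> usize {
    (index + capacity - ideal) % capacity
}
```

Note the wraparound case: an entry at slot 1 whose ideal slot is 7 (capacity 8) has distance 2, not a negative number, so it still shifts correctly when the probe chain crosses the end of the table.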

⚖️ 4.6. Tombstones vs Backward Shift

| Approach       | Pros                          | Cons                           |
|----------------|-------------------------------|--------------------------------|
| Tombstones     | Simple, fast delete           | Needs rebuilds, slower lookups |
| Backward shift | No tombstones, faster lookups | More complex, slower delete    |

When to Use Which?

  • Many deletes: Backward shift
  • Few deletes: Tombstones (simpler)
  • Read-heavy: Backward shift (better lookup)
  • Write-heavy: Tombstones (faster insert/delete)

We’ll use tombstones for our implementation (simpler, easier to understand).


🧪 4.7. Testing Deletion

Critical test cases:

#[test]
fn test_delete_doesnt_break_probe_chain() {
    let mut map = MyHashMap::new();
    
    // Insert keys that collide (illustrative: assume all three hash to
    // the same ideal slot; a deterministic test hasher makes this reliable)
    map.insert("a", 1);  // lands in its ideal slot
    map.insert("k", 2);  // probes one slot past "a"
    map.insert("u", 3);  // probes one slot past "k"
    
    // Delete the middle one
    map.remove("k");
    
    // Must still find "u"!
    assert_eq!(map.get("u"), Some(&3));
}

🧠 4.8. Chapter Takeaways

| Concept               | Meaning                                                   |
|-----------------------|-----------------------------------------------------------|
| Probe chain           | Sequence of slots examined during a lookup                |
| Tombstone             | Marker for a deleted entry; probing continues past it     |
| Backward shift        | Alternative to tombstones; shifts entries back into gaps  |
| Rebuild               | Rehash to remove accumulated tombstones                   |
| Load factor threshold | When to trigger a grow or rebuild                         |

🚧 4.9. What’s Next

We can now:

  • Insert entries with Robin Hood
  • Delete entries without breaking lookups
  • Grow the table when needed

Next: Chapter 5 — Iteration and Views

Learn how to:

  • Iterate over all entries safely
  • Create Keys/Values iterators
  • Handle iterator invalidation

💡 Exercises

  1. Implement both tombstone and backward-shift deletion
  2. Measure performance of each under heavy delete workload
  3. Test edge case: Delete all entries, verify table is empty
  4. Implement contains_key using the same probing logic
  5. Add rebuild triggered by tombstone threshold
