Allocator and Layout

Reinventing From Scratch — Box<T>

Chapter 2 — The Heap, the Allocator, and Layout

2.1 The allocator API

Rust’s global allocator is exposed via std::alloc:

  • alloc(layout) → raw, uninitialized memory
  • dealloc(ptr, layout) → free memory
  • Layout::new::<T>() → size + alignment of T
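
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates uninitialized storage for one `T`.
/// Safety: `T` must not be zero-sized; `alloc` with a zero-size layout is UB.
unsafe fn alloc_one<T>() -> *mut T {
    let layout = Layout::new::<T>();
    debug_assert!(layout.size() != 0);
    // SAFETY: the caller guarantees a non-zero-size layout.
    let p = unsafe { alloc(layout) } as *mut T;
    if p.is_null() {
        // Standard way to report allocation failure.
        handle_alloc_error(layout);
    }
    p
}

/// Frees storage obtained from `alloc_one::<T>`.
/// Safety: `ptr` must come from `alloc_one::<T>` and must not be freed twice.
unsafe fn dealloc_one<T>(ptr: *mut T) {
    // SAFETY: same layout as the allocation, per the caller's contract.
    unsafe { dealloc(ptr.cast(), Layout::new::<T>()) };
}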
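
To make the "size + alignment" description of Layout concrete, here is a small sketch of what a Layout records. The helper names describe and layout_facts are hypothetical and exist only for this illustration; the checks compare against size_of and align_of rather than hard-coding platform-specific numbers.

```rust
use std::alloc::Layout;
use std::mem::{align_of, size_of};

/// Hypothetical helper: the (size, alignment) pair recorded in the Layout of `T`.
fn describe<T>() -> (usize, usize) {
    let layout = Layout::new::<T>();
    (layout.size(), layout.align())
}

/// Hypothetical demo: checks a few facts about layouts.
fn layout_facts() {
    // Layout::new::<T>() agrees with size_of / align_of for the same type.
    assert_eq!(describe::<u32>(), (size_of::<u32>(), align_of::<u32>()));

    // For any Rust type, alignment is a power of two and size is a multiple of it.
    let (size, align) = describe::<(u8, u32)>();
    assert!(align.is_power_of_two());
    assert_eq!(size % align, 0);

    // Layout::array describes `n` contiguous `T`s and reports overflow as an error.
    let arr = Layout::array::<u16>(3).expect("layout overflow");
    assert_eq!(arr.size(), 3 * size_of::<u16>());
    assert_eq!(arr.align(), align_of::<u16>());

    // Zero-sized types have size 0; such a layout must never be passed to `alloc`.
    assert_eq!(describe::<()>().0, 0);
}

fn main() {
    layout_facts();
}
```

The last check is why a helper like alloc_one needs either a "not zero-sized" precondition or a special case for zero-sized types.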

2.2 The four commandments

  1. Initialize before read. Memory from alloc is garbage until you write.
  2. Drop before dealloc. Call the destructor first, then free bytes.
  3. Same layout both ways. Dealloc with the exact layout you allocated with.
  4. Alignment is real. Layout encodes alignment; ignore it → UB.

⚠️ UB trap: Reading an uninitialized T (even via safe references derived from bad pointers) is undefined behavior.

2.3 Visualizing the lifecycle

alloc ──▶ (uninit bytes) ──ptr::write(value)──▶ (initialized T)
   ▲                                              │
   └────────────── dealloc ◀── drop_in_place ◀────┘
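
The diagram can be followed step by step in code. The following is a minimal sketch, not a reusable abstraction: the function name lifecycle_demo is hypothetical, and String is chosen only because it has a destructor and is not zero-sized.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr;

/// Walks one `String` through the whole lifecycle and returns its length.
fn lifecycle_demo() -> usize {
    let layout = Layout::new::<String>();
    unsafe {
        // 1. alloc: uninitialized bytes (String is not zero-sized, so this is allowed).
        let p = alloc(layout) as *mut String;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        // 2. ptr::write: initialize without reading or dropping the old bytes.
        ptr::write(p, String::from("hello"));
        // 3. Only now is it sound to create a reference and read through it.
        let s: &String = &*p;
        let len = s.len();
        // 4. drop_in_place: run String's destructor (it frees its own buffer).
        ptr::drop_in_place(p);
        // 5. dealloc: return the bytes using the same layout as the allocation.
        dealloc(p.cast(), layout);
        len
    }
}

fn main() {
    assert_eq!(lifecycle_demo(), 5);
}
```

Swapping steps 4 and 5 would run the destructor on freed memory, and skipping step 4 would leak the String's internal buffer.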

Exercises

  1. Why must the deallocation Layout match the allocation Layout?
  2. What happens if you drop after dealloc? Describe the failure mode.

Deep Dive: Ownership Proofs, Drop Order, and DST Considerations

A. Formal Invariants for MyBox<T> (Sized)

  • B1 (Pointer Validity): ptr is either a valid, properly aligned pointer to an initialized T, or null; it becomes null only after into_raw.
  • B2 (Single Drop): The destructor of T is invoked exactly once if and only if ptr is non-null at Drop time.
  • B3 (Dealloc after Drop): dealloc(ptr, Layout::new::<T>()) is called at most once, only when ptr is non-null at Drop time, and only after drop_in_place.
  • B4 (From/Into Raw Consistency): from_raw only accepts pointers produced by into_raw of the same type/allocator; mixing allocators is UB.
  • B5 (No References to Uninit): No &/&mut references are created before ptr::write initializes the allocation.
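
One way these invariants might translate into code is sketched below. This is an illustrative reading of B1 to B5, not the definitive MyBox: the field name ptr and the method names follow the text, while the zero-sized-type branch (a dangling pointer instead of an allocation) is an assumption added so that the sketch never calls alloc with a zero-size layout.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem;
use std::ops::Deref;
use std::ptr::{self, NonNull};

/// Sketch of a sized-only box. A null `ptr` means "consumed by into_raw" (B1).
pub struct MyBox<T> {
    ptr: *mut T,
}

impl<T> MyBox<T> {
    pub fn new(value: T) -> Self {
        let layout = Layout::new::<T>();
        let ptr = if layout.size() == 0 {
            // Assumption: zero-sized types get a dangling, aligned pointer.
            NonNull::<T>::dangling().as_ptr()
        } else {
            // SAFETY: the layout has non-zero size.
            let p = unsafe { alloc(layout) } as *mut T;
            if p.is_null() {
                handle_alloc_error(layout);
            }
            p
        };
        // SAFETY: ptr is aligned and valid for writes; no reference exists yet (B5).
        unsafe { ptr::write(ptr, value) };
        MyBox { ptr }
    }

    /// Gives up ownership; the caller must eventually call `from_raw` (B4).
    pub fn into_raw(mut self) -> *mut T {
        let raw = mem::replace(&mut self.ptr, ptr::null_mut());
        mem::forget(self); // Drop never runs for the consumed handle.
        raw
    }

    /// # Safety
    /// `raw` must come from `MyBox::<T>::into_raw` and be passed here only once.
    pub unsafe fn from_raw(raw: *mut T) -> Self {
        MyBox { ptr: raw }
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: a live MyBox always points at an initialized T (B1).
        unsafe { &*self.ptr }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        if self.ptr.is_null() {
            return; // consumed sentinel: nothing to drop or free (B2)
        }
        let layout = Layout::new::<T>();
        // SAFETY: ptr is valid and initialized; destructor first, bytes second (B3).
        unsafe {
            ptr::drop_in_place(self.ptr);
            if layout.size() != 0 {
                dealloc(self.ptr.cast(), layout);
            }
        }
    }
}

fn main() {
    let b = MyBox::new(String::from("boxed"));
    assert_eq!(b.len(), 5);
    let raw = b.into_raw();
    let again = unsafe { MyBox::from_raw(raw) };
    assert_eq!(again.as_str(), "boxed");
}
```

Nulling ptr and calling mem::forget are redundant with each other here; the sketch keeps both because the text describes into_raw that way.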

B. Proof Sketches

B.1 Single Drop: Drop checks for null and calls drop_in_place at most once; into_raw nulls out ptr and forgets self, so Drop never runs on a value whose ownership has been handed off.
B.2 No Use-After-Free: Deallocation happens only after the destructor; references returned by Deref are derived from a live ptr and borrow the box, so the borrow checker keeps them from outliving it.
B.3 Panic Safety: If the constructor panics before publishing, no ownership is established. If T's destructor panics inside Drop (it should not), the panic unwinds like any other, so Drop should still free the bytes, for example with a guard; a second panic during that unwind aborts the process instead of unwinding twice.
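
The panic-safety argument can be made concrete with a small guard. The sketch below is one possible approach, under the assumption that T is not zero-sized; the names DeallocGuard, drop_then_free, Noisy, and panic_demo are all hypothetical.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::ptr;

/// Hypothetical guard: frees the allocation when it goes out of scope,
/// including while a panic is unwinding.
struct DeallocGuard {
    ptr: *mut u8,
    layout: Layout,
}

impl Drop for DeallocGuard {
    fn drop(&mut self) {
        // SAFETY: ptr was allocated with `layout` and is freed only here.
        unsafe { dealloc(self.ptr, self.layout) };
    }
}

/// # Safety
/// `ptr` must point at an initialized, non-zero-sized `T` obtained from the
/// global allocator with `Layout::new::<T>()`, and must not be used afterwards.
unsafe fn drop_then_free<T>(ptr: *mut T) {
    let _guard = DeallocGuard { ptr: ptr.cast(), layout: Layout::new::<T>() };
    // If this destructor panics, `_guard` is still dropped during unwinding.
    unsafe { ptr::drop_in_place(ptr) };
}

/// A type whose destructor panics, used only to exercise the guard.
struct Noisy(u8);

impl Drop for Noisy {
    fn drop(&mut self) {
        panic!("destructor panicked (id {})", self.0);
    }
}

/// Returns true if the destructor's panic propagated to the caller.
fn panic_demo() -> bool {
    let layout = Layout::new::<Noisy>();
    let p = unsafe { alloc(layout) } as *mut Noisy;
    if p.is_null() {
        handle_alloc_error(layout);
    }
    unsafe { ptr::write(p, Noisy(1)) };
    catch_unwind(AssertUnwindSafe(|| unsafe { drop_then_free(p) })).is_err()
}

fn main() {
    assert!(panic_demo());
}
```

The panic message is still printed to stderr; the point is only that the bytes are returned to the allocator on both the normal and the unwinding path.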

C. DST Box Notes

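  • Slices (Box<[T]>): the fat pointer (data, len) stores the length, so Drop can drop every element and rebuild the allocation layout (Layout::array::<T>(len)) for deallocation.
  • Box<str>: same representation as Box<[u8]>, plus the UTF-8 invariant; the metadata is the length in bytes.
  • Box<dyn Trait>: fat pointer (data, vtable); the vtable holds the drop function plus the size and alignment of the concrete type. std relies on compiler support to build and read this layout.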
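
The notes above can be checked against the standard Box. The sketch below assumes the two-word fat pointer representation used by current Rust; the function name dst_demo is hypothetical.

```rust
use std::alloc::Layout;
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

/// Hypothetical demo: inspects handle sizes and run-time layouts of DST boxes.
fn dst_demo() {
    // Thin vs. fat handles: a fat pointer is two machine words on current Rust.
    assert_eq!(size_of::<Box<u32>>(), size_of::<usize>());
    assert_eq!(size_of::<Box<[u32]>>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<str>>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<dyn Debug>>(), 2 * size_of::<usize>());

    // Slice: the length metadata is enough to rebuild the allocation layout.
    let slice: Box<[u32]> = vec![1, 2, 3].into_boxed_slice();
    assert_eq!(
        Layout::for_value(&*slice),
        Layout::array::<u32>(slice.len()).unwrap()
    );

    // str: the length is a byte count, not a character count.
    let s: Box<str> = "h\u{e9}llo".into(); // U+00E9 takes 2 bytes in UTF-8
    assert_eq!(s.len(), 6);

    // dyn Trait: size and alignment are read from the vtable at run time.
    let d: Box<dyn Debug> = Box::new(7u64);
    assert_eq!(size_of_val(&*d), size_of::<u64>());
    assert_eq!(Layout::for_value(&*d), Layout::new::<u64>());
}

fn main() {
    dst_demo();
}
```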

D. Interop Patterns

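  • FFI Ownership Transfer: into_raw hands ownership to C. To release the value, C must pass the pointer back to Rust, which rebuilds the box with from_raw (typically inside an exported free function). C must never pass the pointer to its own free().
  • Leaking Globals: leak returns a 'static reference; this is acceptable for process-lifetime singletons, but document the intent.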
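
Both patterns can be sketched with the standard Box, whose into_raw, from_raw, and leak mirror the MyBox operations discussed here. The Counter type and every function name below are hypothetical; a real export would also need a no_mangle attribute and a C header, which are omitted.

```rust
/// Hypothetical type handed to C as an opaque handle.
pub struct Counter {
    value: u32,
}

/// Transfers ownership to the caller; Rust no longer frees it automatically.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0 }))
}

/// # Safety
/// `c` must be a live pointer returned by `counter_new`.
pub unsafe extern "C" fn counter_increment(c: *mut Counter) -> u32 {
    let counter = unsafe { &mut *c };
    counter.value += 1;
    counter.value
}

/// The "custom free": C calls this instead of its own free().
///
/// # Safety
/// `c` must be null or a pointer from `counter_new` that was not freed yet.
pub unsafe extern "C" fn counter_free(c: *mut Counter) {
    if !c.is_null() {
        // Rebuild the box so the destructor and deallocation run exactly once.
        drop(unsafe { Box::from_raw(c) });
    }
}

/// Intentional leak: a process-lifetime value that is never freed.
fn leaked_config() -> &'static mut Vec<u32> {
    Box::leak(Box::new(vec![1, 2, 3]))
}

/// Plays the role of the C caller: create, use twice, free.
fn ffi_roundtrip() -> u32 {
    let c = counter_new();
    unsafe {
        counter_increment(c);
        let n = counter_increment(c);
        counter_free(c);
        n
    }
}

fn main() {
    assert_eq!(ffi_roundtrip(), 2);
    let cfg = leaked_config();
    cfg.push(4);
    assert_eq!(cfg.len(), 4);
}
```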

E. Debugging

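  • Double Drop: look for a ptr::read copy whose original is also dropped, or from_raw called twice on the same pointer. A related bug is *p = value on uninitialized memory, which drops garbage; use ptr::write there.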
  • Mismatched Layout: using dealloc with wrong Layout causes heap corruption; keep Layout::new::<T>() paired.

F. Exercises

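  1. Implement try_new returning Result<MyBox<T>, AllocError>. (std's AllocError is still unstable, so define your own error type on stable Rust.)
  2. Add into_inner(self) -> T using ptr::read. Is it correct to skip dealloc? Explain why you must still deallocate after moving T out, and why you must not run T's destructor.
  3. Implement MyVec::into_boxed_slice so that it hands the RawVec buffer to a Box<[T]> safely. Hint: the buffer's capacity must equal its length first, so that the deallocation layout matches.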

FAQ (Extended)

Q: Does Box<T> guarantee a stable address? A: Yes, the pointee’s address is stable for the life of the box; moving the box moves only the handle.
Q: Why ptr::write and not *p = value? A: The assignment first drops the previous contents, which are uninitialized; running a destructor on garbage is UB. ptr::write overwrites without reading or dropping.
Q: Can Box<T> be null? A: By design, standard Box<T> is non-null; our MyBox may set ptr = null only as a consumed sentinel post-into_raw.
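Q: Is Pin<Box<T>> needed for a stable address? A: Not for stability; the heap address is already stable. Pin prevents safe code from moving the pointee out of the box (for example with mem::swap or mem::replace).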
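
The answers about address stability can be observed directly with the standard Box. The following is a small sketch; the function name addresses is hypothetical.

```rust
use std::pin::Pin;

/// Returns the pointee's address before a move, after a move, and after pinning.
fn addresses() -> (usize, usize, usize) {
    let b = Box::new([0u8; 32]);
    let before = &*b as *const [u8; 32] as usize;

    // Moving the box copies only the handle (one pointer); the heap data stays put.
    let moved = b;
    let after_move = &*moved as *const [u8; 32] as usize;

    // Pinning does not relocate the data either; it only restricts the API.
    let pinned: Pin<Box<[u8; 32]>> = Box::into_pin(moved);
    let after_pin = &*pinned as *const [u8; 32] as usize;

    (before, after_move, after_pin)
}

fn main() {
    let (a, b, c) = addresses();
    assert_eq!(a, b);
    assert_eq!(b, c);
}
```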