lazy-bigrams - LA CTF 2025 Crypto Writeup
Category: Crypto
Difficulty: Medium
Flag: lactf{n0t_r34lly_4_b1gr4m_su8st1tu7ion_bu7_1_w1ll_tak3_1t_f0r_n0w}
Solved by: Smothy @ 0xN1umb

"When you NATO phonetic alphabet so hard you NATO phonetic alphabet your NATO phonetic alphabet"
Challenge Description
A simple bigram substitution cipher... with a twist.
We get chall.py (the encryption script) and ct.txt (2158 characters of uppercase ciphertext). The encryption pipeline is delightfully cursed: take the flag, convert each character to its NATO phonetic word, then do it AGAIN on the result, then apply a random bigram substitution cipher. Double phonetic encoding go brrr.
TL;DR
Double phonetic mapping + bigram substitution. The known flag prefix lactf{ expands to 215 characters of known plaintext through the double-phonetic pipeline, giving us 61 substitution table entries for free. Backtracking through possible flag characters with constraint propagation cracks the rest in under a second. Crypto? More like known-plaintext-attack speedrun.
Initial Recon
Two files: the encryption script and the ciphertext output.
The ciphertext is 2158 uppercase characters - all A-Z, no spaces, no numbers. Clean bigram substitution output. Let's read the source.
Step 1: Understanding the Encryption Pipeline
The encryption has three stages:
def phonetic_mapping(ptext):
cleanptext = re.sub(r'[^a-zA-Z0-9_{}]', '', ptext).upper()
mapped = "".join([phonetic_map[c] for c in cleanptext])
if (len(mapped) % 2 == 1):
mapped += "X" # Pad to even length
return mapped
pt = phonetic_mapping(phonetic_mapping(flag)) # DOUBLE phonetic!
ct = encryption(pt) # Random bigram substitutionSo each flag character goes through:
- First phonetic:
a->ALPHA,_->UNDERSCORE,{->OPENCURLYBRACE, etc. - Second phonetic: Each letter of the first result gets phonetic'd again:
A->ALPHA,L->LIMA, etc. - Bigram substitution: Pairs of characters are randomly permuted (676 possible bigrams AA-ZZ, shuffled).
The key observation: this is a substitution cipher with massive known plaintext from the flag format.
Step 2: Quantifying the Known Plaintext
Each flag character expands to a LOT of characters through double-phonetic:
The expansion sizes range from 15 chars (6 -> SIX -> 15) to 56 chars (_ -> UNDERSCORE -> 56). For the homies who haven't done the math: a 6-character prefix lactf{ blows up to 215 characters of known plaintext. That's nearly 10% of the entire ciphertext, for free!
From just the prefix, we recover 61 unique bigram substitution entries. Out of 676 total possible bigrams, that's already a solid foundation.
Step 3: The Carry Problem
Here's where it gets spicy. The prefix expansion is 215 characters - that's odd. Bigram substitution works on pairs (positions 0-1, 2-3, 4-5...), so the 215th character of the prefix is the first half of a bigram whose second half comes from the first unknown flag character.
PT: ...ECHO | ??? (first unknown char expansion)
^---^
This bigram crosses the boundary!
We need to track this "carry" character as we try different flag characters. If a character's expansion starts with A and our carry is O, then the boundary bigram is OA, which must be consistent with the ciphertext at that position.
Step 4: Padding Cases
The phonetic_mapping function pads with X if the result is odd length. This happens at both levels:
- First-level padding: Adds
Xat end of first phonetic result, which then expands toXRAY->XRAYROMEOALPHAYANKEE(20 chars) at second level - Second-level padding: Adds a single
Xat the very end
Four combinations to try. The winning case turned out to be no first-level padding, yes second-level padding, giving us a middle target of 1861 characters.
Step 5: Backtracking Solver
The solver is elegant in its brutality:
def bt_with_carry(ct, pt_pos_start, remaining, carry, sub_map, rev_sub, path, depth):
if remaining == 0:
return path
if remaining < 0 or remaining < 15: # Min expansion is 15 (for '6')
return None
for c in "abcdefghijklmnopqrstuvwxyz0123456789_":
expansion = char_to_double[c]
full_seq = carry + expansion
# Check every bigram pair against known substitution entries
# Both forward (PT->CT) and reverse (CT->PT) must be consistent
consistent = check_consistency(full_seq, ct, pt_pos_start, sub_map, rev_sub)
if consistent:
# Add new substitution entries, recurse
result = bt_with_carry(ct, pt_pos_start + len(expansion),
remaining - len(expansion), new_carry, ...)
if result:
return [c] + result
# Rollback on failureThe magic is constraint propagation: each correctly placed flag character reveals new substitution table entries that constrain future characters. By the time we're a few characters in, most wrong guesses are eliminated instantly. The whole thing runs in under a second.
The Flag
lactf{n0t_r34lly_4_b1gr4m_su8st1tu7ion_bu7_1_w1ll_tak3_1t_f0r_n0w}
Decoded: "not really a bigram substitution but I will take it for now"
Fair point, challenge author. A bigram substitution on double-phonetic-encoded text is indeed not really a traditional bigram cipher. But we'll take the flag, thanks.
The Solve Script
"""
lazy-bigrams solver - LA CTF 2025
Smothy @ 0xN1umb
"""
import re
phonetic_map = {
"A":"ALPHA","B":"BRAVO","C":"CHARLIE","D":"DELTA","E":"ECHO",
"F":"FOXTROT","G":"GOLF","H":"HOTEL","I":"INDIA","J":"JULIETT",
"K":"KILO","L":"LIMA","M":"MIKE","N":"NOVEMBER","O":"OSCAR",
"P":"PAPA","Q":"QUEBEC","R":"ROMEO","S":"SIERRA","T":"TANGO",
"U":"UNIFORM","V":"VICTOR","W":"WHISKEY","X":"XRAY","Y":"YANKEE",
"Z":"ZULU","_":"UNDERSCORE","{":"OPENCURLYBRACE","}":"CLOSECURLYBRACE",
"0":"ZERO","1":"ONE","2":"TWO","3":"THREE","4":"FOUR","5":"FIVE",
"6":"SIX","7":"SEVEN","8":"EIGHT","9":"NINE"
}
# Precompute double-phonetic expansion for each flag character
possible = "abcdefghijklmnopqrstuvwxyz0123456789_"
char_to_double = {}
for c in list(possible) + ["{", "}"]:
first = phonetic_map[c.upper()]
second = "".join(phonetic_map[ch] for ch in first)
char_to_double[c] = second
with open("ct.txt") as f:
ct = f.read().strip()
# Build substitution table from known prefix "lactf{"
prefix_exp = "".join(char_to_double[c] for c in "lactf{")
suffix_exp = char_to_double["}"]
sub_map, rev_sub = {}, {}
for i in range(0, len(prefix_exp) - 1, 2):
pb, cb = prefix_exp[i:i+2], ct[i:i+2]
sub_map[pb] = cb
rev_sub[cb] = pb
# Add suffix mappings (second-level padding case)
suffix_start = len(ct) - len(suffix_exp) - 1 # -1 for second-level X pad
for i in range(0, len(suffix_exp) - 1, 2):
pb, cb = suffix_exp[i:i+2], ct[suffix_start+i:suffix_start+i+2]
sub_map.setdefault(pb, cb)
rev_sub.setdefault(cb, pb)
def bt(pt_pos, remaining, carry, path):
if remaining == 0: return path
if remaining < 15: return None
for c in possible:
exp = char_to_double[c]
if len(exp) > remaining: continue
if remaining - len(exp) > 0 and remaining - len(exp) < 15: continue
full = carry + exp
ok, pairs = True, []
for j in range(0, len(full)-1, 2):
pb, cb = full[j:j+2], ct[pt_pos-len(carry)+j:pt_pos-len(carry)+j+2]
if pb in sub_map and sub_map[pb] != cb: ok = False; break
if cb in rev_sub and rev_sub[cb] != pb: ok = False; break
if pb not in sub_map and cb not in rev_sub: pairs.append((pb,cb))
if not ok: continue
added = []
for pb, cb in pairs:
if pb in sub_map or cb in rev_sub: continue
sub_map[pb]=cb; rev_sub[cb]=pb; added.append((pb,cb))
nc = full[-1] if len(full)%2 else ""
r = bt(pt_pos+len(exp), remaining-len(exp), nc, path+[c])
if r: return r
for pb,cb in added: del sub_map[pb]; del rev_sub[cb]
return None
middle = len(ct) - len(prefix_exp) - len(suffix_exp) - 1
carry = prefix_exp[-1] if len(prefix_exp)%2 else ""
result = bt(len(prefix_exp), middle, carry, [])
print(f"lactf{{''.join(result)}}")The Graveyard of Failed Attempts
-
First attempt: forgot about carry characters. The prefix expansion is 215 chars (odd!), so the boundary bigram between prefix and middle crosses the known/unknown boundary. Assertion errors everywhere until I realized I needed to track partial bigrams.
-
Wrong padding assumption. Initially tried "no padding at all" - the suffix mapping came back with conflicts. Had to systematically try all 4 padding combinations (first-level pad x second-level pad).
-
Greedy solver that tried to parse phonetic words. Tried to parse the partially-decrypted text into phonetic words and use structure to fill gaps. Way overengineered. The backtracking approach was cleaner and faster.
Key Takeaways
-
Known plaintext is king. Even 6 known characters of flag format gave us 215 chars (9.96%) of known plaintext. Always compute how much you can derive from format strings.
-
Double encoding amplifies known plaintext. Each layer of phonetic mapping multiplies the known prefix length by roughly 5x. Two layers = 25x amplification. Beautiful.
-
Bigram substitution is NOT strong. With ~60 known bigram pairs out of 676, constraint propagation eliminates most possibilities. The search space collapses fast.
-
Handle boundary conditions carefully. The odd-length prefix creating a "carry" character was the trickiest part. Always check for off-by-one issues in cipher implementations.
-
Try all padding combinations. When there are multiple possible states (padding/no padding), enumerate them systematically rather than guessing.
Tools Used
- Python 3 (solver + matplotlib for screenshots)
- Brain cells (like 3 of them)
- Way too much caffeine
Writeup by Smothy from 0xN1umb team. Double phonetic? More like double trouble. GG.