Quantum Message

Eh4x CTF by smothy

Quantum Message — Forensics (304 pts)

Challenge Description

my quantum physics professor was teaching us about 1-D harmonic oscillator, he gave me a problem in which the angular frequency was 57 × 10³⁵ (weird number) and it asked to calculate the energy eigenvalues, then he called someone, who did he call?

Hint 1: he types very fast like 20ms

Attachment: challenge.wav

Solution

Step 1: Initial Audio Analysis

The WAV file is mono, 44100 Hz, ~81.6 seconds long. Examining the structure reveals 80 distinct 1-second tones, each separated by a brief ~20ms silence.

python
from scipy.io import wavfile
import numpy as np

sr, audio = wavfile.read('challenge.wav')
# 3,598,560 samples, ~81.6 seconds, mono float32
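As a quick sanity check on those numbers, the sketch below recomputes the duration from the sample count and compares it with a layout of 80 one-second tones. It assumes each tone is paired with exactly one 20 ms gap; that pairing is an assumption made here for the arithmetic (the file could equally have leading or trailing silence), and the constant names are illustrative.

```python
# Values quoted in this write-up; the constant names are illustrative.
SAMPLE_RATE = 44_100        # Hz
TOTAL_SAMPLES = 3_598_560   # samples reported for challenge.wav
N_TONES = 80
TONE_SEC = 1.0
GAP_SEC = 0.020             # the "20ms" from the hint

# Duration implied by the sample count.
duration = TOTAL_SAMPLES / SAMPLE_RATE

# Assumption: every tone is accompanied by one 20 ms gap.
expected = N_TONES * (TONE_SEC + GAP_SEC)

print(f"duration from sample count: {duration:.2f} s")
print(f"duration from tone layout:  {expected:.2f} s")
```

If the two figures agree, splitting the audio on the short silences should yield 80 segments.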

Step 2: Identifying the Encoding — Quantum Energy Eigenvalues

The challenge references the 1-D quantum harmonic oscillator, whose energy eigenvalues are:

$$E_n = \hbar\omega\left(n + \frac{1}{2}\right)$$

With ω = 57 × 10³⁵ rad/s and ℏ = 1.0546 × 10⁻³⁴ J·s:

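$$E_0 = \hbar\omega \times \frac{1}{2} \approx 300.6 \text{ (numerically, in joules)}$$

The levels are evenly spaced by ℏω ≈ 601.1, giving approximately 300.6, 901.7, 1502.8, 2103.9, 2705.0, 3306.2, and 3907.3 for n = 0 through n = 6. Read as frequencies in Hz and rounded, these are 301, 902, 1503, 2104, 2705, 3306, and 3907 Hz.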
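The arithmetic can be checked directly. This is a minimal sketch using the ℏ and ω values quoted above; the function and constant names are illustrative, not taken from the challenge.

```python
HBAR = 1.0546e-34   # J*s, the value used in this write-up
OMEGA = 57e35       # rad/s, from the challenge description

def energy(n: int) -> float:
    """E_n = hbar * omega * (n + 1/2), in joules."""
    return HBAR * OMEGA * (n + 0.5)

# Levels n = 0..6, rounded to the nearest integer.
levels = [energy(n) for n in range(7)]
rounded = [round(e) for e in levels]

print(rounded)
# [301, 902, 1503, 2104, 2705, 3306, 3907]
```

Interpreted as Hz, the rounded values are the seven tone frequencies that appear in the audio.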

An FFT on each 1-second segment reveals that every tone contains two simultaneous frequencies from this set:

| Segment | Low Freq (Hz) | High Freq (Hz) |
|---|---|---|
| 0 | 902 | 3907 |
| 1 | 1503 | 3907 |
| 2 | 1503 | 2705 |
| ... | ... | ... |
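To illustrate the peak-picking step without the challenge file, the sketch below runs the same kind of FFT on a synthetic one-second segment. The two sine waves at 902 Hz and 3907 Hz stand in for segment 0; the real audio may differ in amplitude and noise, so treat this as an assumption-laden demonstration rather than the exact analysis.

```python
import numpy as np

SR = 44_100                     # sample rate of the challenge file
t = np.arange(SR) / SR          # one second of sample times

# Synthetic stand-in for one segment: two equal-amplitude sines.
segment = np.sin(2 * np.pi * 902 * t) + np.sin(2 * np.pi * 3907 * t)

spectrum = np.abs(np.fft.rfft(segment))
freqs = np.fft.rfftfreq(len(segment), d=1 / SR)

# Walk bins from strongest to weakest, keeping peaks more than 500 Hz apart.
peaks = []
for idx in np.argsort(spectrum)[::-1]:
    f = int(round(freqs[idx]))
    if all(abs(f - p) > 500 for p in peaks):
        peaks.append(f)
    if len(peaks) == 2:
        break

peaks.sort()
print(peaks)
# [902, 3907]
```

A one-second window gives 1 Hz bin spacing, which is fine enough to separate tones that sit roughly 601 Hz apart.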

Step 3: DTMF Phone Keypad Mapping

The dual-tone structure mirrors DTMF (Dual-Tone Multi-Frequency) signaling used in telephone keypads. The 7 frequencies split into:

  • Low group (rows): 301, 902, 1503, 2104 Hz → quantum numbers n = 0, 1, 2, 3
  • High group (columns): 2705, 3306, 3907 Hz → quantum numbers n = 4, 5, 6

This maps to the standard phone keypad:

| | 2705 Hz | 3306 Hz | 3907 Hz |
|---|---|---|---|
| 301 Hz | 1 | 2 | 3 |
| 902 Hz | 4 | 5 | 6 |
| 1503 Hz | 7 | 8 | 9 |
| 2104 Hz | * | 0 | # |
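The grid lookup can be sketched as a nearest-frequency match on each axis. The helper names `nearest` and `key_for` are hypothetical, and this version has no tolerance check, so it would happily map a badly off-target frequency to the closest row or column.

```python
ROW_FREQS = [301, 902, 1503, 2104]    # low group, n = 0..3
COL_FREQS = [2705, 3306, 3907]        # high group, n = 4..6
KEYS = ["123", "456", "789", "*0#"]   # standard phone keypad layout

def nearest(value, candidates):
    """Return the index of the candidate frequency closest to value."""
    return min(range(len(candidates)), key=lambda i: abs(candidates[i] - value))

def key_for(low_hz, high_hz):
    """Map a measured (low, high) frequency pair to a keypad character."""
    return KEYS[nearest(low_hz, ROW_FREQS)][nearest(high_hz, COL_FREQS)]

# The first three segments from the FFT table above.
print(key_for(902, 3907), key_for(1503, 3907), key_for(1503, 2705))
# 6 9 7
```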

Step 4: Extracting Phone Digits

python
keypad = {
    (0,4): '1', (0,5): '2', (0,6): '3',
    (1,4): '4', (1,5): '5', (1,6): '6',
    (2,4): '7', (2,5): '8', (2,6): '9',
    (3,4): '*', (3,5): '0', (3,6): '#',
}

Decoding all 80 segments produces the digit string:

69725288123113117521101161171099511210412111549995395495395534895539952114121125

Step 5: Variable-Length ASCII Decoding

The digits encode ASCII character codes directly using a variable-length scheme:

  • 2 digits for ASCII 32–99 (digits, uppercase letters, common symbols, and lowercase a–c)
  • 3 digits for ASCII 100–125 (lowercase d–z, {, |, })

Decoding is unambiguous: a code that starts with 1 must be 3 digits long, because the 2-digit values 10–19 are non-printable control characters. Any other leading digit starts a 2-digit code.

69 → E, 72 → H, 52 → 4, 88 → X, 123 → {
113 → q, 117 → u, 52 → 4, 110 → n, 116 → t, 117 → u, 109 → m, 95 → _
112 → p, 104 → h, 121 → y, 115 → s, 49 → 1, 99 → c, 53 → 5, 95 → _
49 → 1, 53 → 5, 95 → _
53 → 5, 48 → 0, 95 → _
53 → 5, 99 → c, 52 → 4, 114 → r, 121 → y, 125 → }
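The leading-digit rule fits in a few lines. The sketch below decodes the digit string from Step 4 and then re-encodes the result as a round-trip check; `decode` and `encode` are illustrative names, and the encoder is simply the inverse assumed here, not something given by the challenge.

```python
DIGITS = "69725288123113117521101161171099511210412111549995395495395534895539952114121125"

def decode(digits: str) -> str:
    """Read 3 digits when the code starts with '1', otherwise 2."""
    out, pos = [], 0
    while pos < len(digits):
        width = 3 if digits[pos] == "1" else 2
        out.append(chr(int(digits[pos:pos + width])))
        pos += width
    return "".join(out)

def encode(text: str) -> str:
    """Assumed inverse: concatenate the decimal ASCII code of each character."""
    return "".join(str(ord(ch)) for ch in text)

flag = decode(DIGITS)
print(flag)
# EH4X{qu4ntum_phys1c5_15_50_5c4ry}

# Round trip: re-encoding the flag reproduces the 80-digit string.
assert encode(flag) == DIGITS
```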

Flag

EH4X{qu4ntum_phys1c5_15_50_5c4ry}

Solve Script

python
from scipy.io import wavfile
from scipy.signal import medfilt
import numpy as np

sr, audio = wavfile.read('challenge.wav')

# Detect 1-second tone segments via amplitude envelope
envelope = medfilt(np.abs(audio), 441)
active = envelope > 0.01
transitions = np.diff(active.astype(int))
starts = np.where(transitions == 1)[0] + 1
ends = np.where(transitions == -1)[0] + 1
if active[0]: starts = np.insert(starts, 0, 0)
if active[-1]: ends = np.append(ends, len(audio))
segments = list(zip(starts, ends))

# Frequency-to-quantum-number lookup (accepts peaks within 10 Hz of a target)
def freq_to_n(f):
    for target, n in [(301,0),(902,1),(1503,2),(2104,3),(2705,4),(3306,5),(3907,6)]:
        if abs(f - target) < 10:
            return n
    return -1

# Phone keypad: (low-group n, high-group n) -> key
keypad = {
    (0,4):'1',(0,5):'2',(0,6):'3',
    (1,4):'4',(1,5):'5',(1,6):'6',
    (2,4):'7',(2,5):'8',(2,6):'9',
    (3,4):'*',(3,5):'0',(3,6):'#',
}

# Extract dual-tone pairs and map to phone digits
digits = []
for s, e in segments:
    chunk = audio[s:e]
    fft = np.abs(np.fft.rfft(chunk))
    freqs = np.fft.rfftfreq(len(chunk), 1/sr)
    sorted_idx = np.argsort(fft)[::-1]
    peaks = []
    for idx in sorted_idx:
        f = round(freqs[idx])
        if f < 100: continue
        if all(abs(f - p) > 500 for p in peaks):
            peaks.append(f)
        if len(peaks) == 2: break
    peaks.sort()
    low_n = freq_to_n(peaks[0])
    high_n = freq_to_n(peaks[1])
    digits.append(keypad.get((low_n, high_n), '?'))

digit_str = ''.join(digits)

# Variable-length ASCII decode: leading '1' → 3 digits, else 2 digits
result, pos = '', 0
while pos < len(digit_str):
    if digit_str[pos] == '1' and pos + 2 < len(digit_str):
        val = int(digit_str[pos:pos+3])
        if 100 <= val <= 127:
            result += chr(val)
            pos += 3
            continue
    val = int(digit_str[pos:pos+2])
    result += chr(val)
    pos += 2

print(result)
# EH4X{qu4ntum_phys1c5_15_50_5c4ry}

Key Takeaways

  • The "angular frequency 57 × 10³⁵" wasn't random: it gives E₀ = ½ℏω ≈ 300.6, so the eigenvalues, read as frequencies in Hz, match the tones in the audio (301, 902, ..., 3907 Hz) to within rounding.
  • The "20ms" hint pointed to the gap duration between tones, confirming the segmentation approach.
  • Recognizing the dual-tone structure (two simultaneous frequencies per segment) was crucial — a single-frequency analysis would miss half the data and produce gibberish.