
Coder

The Coder is the implementation workhorse. It writes code, runs experiments, debugs errors, and produces results — all within the boundaries set by the Planner's design.

Identity

| Property | Value |
| --- | --- |
| LLM | Codex (GPT) |
| Invocation | `omc team 1:codex:coder "task"` |
| Lifecycle | Persistent — stays alive across multiple tasks |
| Session | Named tmux session (`{prefix}-coder`) |

Why persistent?

Unlike the Writer (fresh session per section) or the Judge (stateless per invocation), the Coder keeps its tmux session alive across tasks. This preserves the development environment — conda env, working directory, running processes, terminal history. The Coder can resume exactly where it left off.
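A minimal shell sketch of how such a persistent session might be kept alive. The `omc` prefix is an assumption for illustration, not necessarily the system's actual prefix:

```shell
# Ensure the Coder's persistent tmux session exists.
# Session name follows the documented {prefix}-coder pattern; "omc" is an assumed prefix.
SESSION="omc-coder"
if command -v tmux >/dev/null 2>&1; then
  # Reuse the session if it is already alive; otherwise create it detached,
  # so environment, working directory, and history survive between tasks.
  tmux has-session -t "$SESSION" 2>/dev/null || tmux new-session -d -s "$SESSION"
  STATUS="ready"
else
  STATUS="tmux-missing"   # fall back gracefully on machines without tmux
fi
echo "$STATUS"
```

Because the session is created detached (`-d`), the Coder can be re-invoked later and reattach to exactly the same shell state.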

Responsibilities

| Responsibility | Description |
| --- | --- |
| Code implementation | Translate Planner's design into working code |
| Environment setup | Install dependencies, configure GPU, prepare data |
| Training execution | Launch and monitor training runs |
| Debugging | Fix errors in code, data pipeline, or training |
| Testing | Run tests to verify implementation correctness |
| Result extraction | Parse logs and produce structured results files |
| Figure generation | Create plots from Scout's figure descriptions |
| Experiment management | Track configs, logs, and checkpoints per experiment |

Task Flow

```mermaid
graph TD
    O[Orchestrator] -->|"task + spec"| C[Coder]
    C --> I{Task Type}
    I -->|implement| Code[Write Code]
    I -->|train| Train[Launch Training]
    I -->|debug| Debug[Fix Error]
    I -->|extract| Extract[Parse Results]

    Code --> T[Run Tests]
    T -->|pass| R[Report to Orchestrator]
    T -->|fail| Debug

    Train --> M[Monitor Start]
    M -->|healthy| R
    M -->|error| Debug

    Debug --> T2{Fixed?}
    T2 -->|yes| R
    T2 -->|no, retry < 3| Debug
    T2 -->|no, retry >= 3| E[Escalate to Orchestrator]

    Extract --> R

    style O fill:#f9f0ff,stroke:#7c3aed
    style C fill:#fef3c7,stroke:#d97706
    style E fill:#fee2e2,stroke:#dc2626
    style R fill:#dcfce7,stroke:#16a34a
```
Error Handling: The Ralph Self-Fix Loop

When the Coder encounters an error, it enters a ralph loop — a tight fix-test-retry cycle.

```text
Step 1: Coder runs code → Error: "RuntimeError: CUDA out of memory"
Step 2: Coder analyzes error → Reduces batch size in config
Step 3: Coder runs code → Error: "AssertionError: unexpected shape [64, 512]"
Step 4: Coder analyzes error → Fixes tensor reshape
Step 5: Coder runs code → Tests pass ✓
Step 6: Coder reports success to Orchestrator
```

Ralph Rules

| Rule | Detail |
| --- | --- |
| Max retries | 3 attempts per error type |
| Scope | Fix the immediate error only — no refactoring |
| Escalation | After 3 failures, stop and report to Orchestrator |
| Logging | Each attempt logged to `logs/errors.log` |

Ralph fixes, it doesn't redesign

The ralph loop is for fixing implementation errors — bugs, shape mismatches, OOM. If the Coder encounters a design problem (e.g., "this algorithm can't work because X"), it must escalate to the Orchestrator immediately. The Coder does not redesign experiments.
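The loop and its escalation rules can be sketched as a small retry driver. The function names and return values below are illustrative, not the system's actual API:

```python
import logging

MAX_RETRIES = 3  # per the ralph rules: 3 attempts per error type


class DesignProblem(Exception):
    """Raised when the failure is a design issue, not an implementation bug."""


def ralph_loop(run, fix, max_retries=MAX_RETRIES):
    """Tight fix-test-retry cycle (a sketch).

    `run` executes the code and raises on error; `fix` takes the exception
    and patches the immediate cause only — no refactoring, no redesign.
    Returns "success" or "escalate" (illustrative names).
    """
    for attempt in range(1, max_retries + 1):
        try:
            run()
            return "success"
        except DesignProblem:
            return "escalate"  # design issues go straight to the Orchestrator
        except Exception as err:
            logging.error("attempt %d failed: %s", attempt, err)
            fix(err)  # scoped fix of the immediate error only
    return "escalate"  # retry budget exhausted after max_retries attempts
```

Note the asymmetry: an ordinary exception buys another fix attempt, while a `DesignProblem` short-circuits the loop on the first occurrence, matching the rule that the Coder never redesigns experiments.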

Does Not Make Design Decisions

This is the Coder's most important constraint:

| The Coder Does | The Coder Does NOT |
| --- | --- |
| Implement the specified architecture | Choose which architecture to implement |
| Set hyperparameters per spec | Decide which hyperparameters to try |
| Run the specified experiments | Decide which experiments to run |
| Fix bugs in code | Change the experimental design |
| Report unexpected results | Interpret what unexpected results mean |
| Optimize code for speed | Change the algorithm for speed |

The Coder is an expert implementer, not a researcher

Think of the Coder as a highly skilled research engineer. You hand it a specification and it builds exactly that — efficiently, correctly, and reliably. It doesn't second-guess the research direction. If something seems wrong with the design, it reports the observation and lets the Orchestrator (the PI) decide.
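One way to picture this constraint: the spec arrives as read-only data. A hypothetical sketch, with field names invented for illustration:

```python
from dataclasses import dataclass, FrozenInstanceError

# Hypothetical shape of the spec the Planner hands down.
# Field names here are illustrative, not the system's actual schema.
@dataclass(frozen=True)
class TaskSpec:
    experiment_id: str
    architecture: str     # chosen by the Planner, never by the Coder
    learning_rate: float  # fixed per spec
    batch_size: int       # fixed per spec

spec = TaskSpec("exp-003", "transformer", 3e-4, 64)
# frozen=True makes every field read-only: the Coder implements the spec,
# it cannot edit it. Any change to these values is a design decision.
```

The frozen dataclass mirrors the table above: implementation reads the spec; changing it requires going back up to the Orchestrator.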

Example: Design vs. Implementation

```text
# This is an implementation fix (Coder handles it):
"RuntimeError: shape mismatch in attention projection"
→ Fix the projection dimensions

# This is a design problem (Coder escalates):
"Training converges but perplexity is 30% worse than baseline"
→ Report to Orchestrator: "Results significantly below expected.
   Perplexity: 25.1 vs expected 18.5. May need design revision."
```

Output

The Coder writes all output to disk:

| Output | Location | Format |
| --- | --- | --- |
| Code | `src/` | Python files |
| Configs | `experiments/exp-*/config.yaml` | YAML |
| Training logs | `experiments/exp-*/log.jsonl` | Structured JSONL |
| Results | `experiments/exp-*/results.yaml` | YAML |
| Figures | `papers/figures/*.pdf` | PDF plots |
| Error logs | `logs/errors.log` | Timestamped text |
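A minimal sketch of the result-extraction step, turning the JSONL training log into a flat results file. The log field names (`step`, `ppl`) are assumptions for illustration:

```python
import json

def extract_results(log_path, out_path):
    """Parse a JSONL training log and write a flat results file (a sketch).

    Assumes one JSON object per line with `step` and `ppl` keys —
    these field names are hypothetical, not the system's actual schema.
    """
    last = None
    with open(log_path) as f:
        for line in f:
            line = line.strip()
            if line:
                last = json.loads(line)  # keep only the final record
    results = {"final_step": last["step"], "final_perplexity": last["ppl"]}
    # Flat "key: value" pairs are already valid YAML,
    # so no extra dependency is needed to write results.yaml.
    with open(out_path, "w") as f:
        for key, value in results.items():
            f.write(f"{key}: {value}\n")
    return results
```

Writing results as plain YAML key-value pairs keeps them both machine-parseable and readable in the one-line summary the Orchestrator receives.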

The Orchestrator receives a one-line summary:

```text
"Experiment exp-003 training complete. Final perplexity: 18.7.
 Results in experiments/exp-003/results.yaml"
```

Next

AutoResearch — Multi-agent Deep Learning Research System