Context Management
The Problem
Large language models have finite context windows. A deep learning research project generates enormous amounts of information — training logs, paper text, code, literature notes, experiment results. A long orchestration session will inevitably fill its context window.
When context fills up, the session cannot process new input. This is not a bug — it's an expected operating condition that AutoResearch is designed to handle gracefully.
The Solution: Don't Rely on Context for State
AutoResearch's approach is simple and robust:
Context is working memory. Disk is long-term memory. Never store critical state only in context.
Every decision, every result, every plan is written to .omc/research/ on disk. Context holds only what's needed for the current task plus enough background to make good decisions.
Prevention: Compact Agent Output
Agents are designed to return minimal information to the Orchestrator's context.
One-Line Summaries
When an agent finishes a task, it returns a one-line summary to the Orchestrator — not the full output.
```
# What Coder returns to Orchestrator context:
"Implementation complete. 3 files modified, all tests pass. Details in experiments/exp-001/config.yaml"

# NOT this:
"I implemented the linear attention mechanism by first creating a new module in models/attention.py
that uses the RetNet recurrence formulation. The key insight was to decompose the attention matrix
into... [500 lines of reasoning]"
```
The full output goes to disk. The Orchestrator gets just enough to decide what to do next.
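A minimal sketch of this pattern in Python. The `finish_task` helper and the `experiments/<task_id>/output.md` layout are illustrative assumptions, not AutoResearch's actual API:

```python
from pathlib import Path

def finish_task(task_id: str, full_output: str, summary: str,
                root: str = ".omc/research") -> str:
    """Persist the agent's full output to disk; return only a one-line
    summary for the Orchestrator's context.

    Note: `finish_task` and the output.md layout are hypothetical,
    chosen to illustrate the summary-to-context / detail-to-disk split.
    """
    out_dir = Path(root) / "experiments" / task_id
    out_dir.mkdir(parents=True, exist_ok=True)

    # The expensive, verbose part never enters the Orchestrator's context.
    (out_dir / "output.md").write_text(full_output)

    # Only this single line does, with a pointer to the details on disk.
    return f"{summary} Details in {out_dir / 'output.md'}"
```

The Orchestrator keeps only the returned string; any later session can recover the full reasoning by reading the referenced file.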
Structured Disk Output
All detailed information is written as structured files:
```yaml
# experiments/exp-001/results.yaml (on disk, not in context)
experiment: exp-001
method: linear_retnet_attention
metrics:
  perplexity: 18.7
  throughput: 12400 tok/s
  memory_peak_gb: 14.2
baselines:
  vanilla_attention:
    perplexity: 17.9
    throughput: 8200 tok/s
comparison: "+0.8 ppl, +51% throughput"
verdict: "Promising. Perplexity gap is small, throughput gain is significant."
```
Recovery: Rebuild from Disk
When a session ends (context full, crash, or intentional restart), the new session rebuilds its working context from disk.
```mermaid
graph TD
    A[New Session Starts] --> B[Read pipeline.yaml]
    B --> C{Current Stage?}
    C -->|Ideation| D[Load ideas/selected.yaml]
    C -->|Training| E[Load experiments/exp-*/config.yaml<br/>+ latest log tail]
    C -->|Writing| F[Load papers/outline.md<br/>+ current section status]
    D --> G[Resume Work]
    E --> G
    F --> G
    style A fill:#dbeafe,stroke:#2563eb
    style G fill:#dcfce7,stroke:#16a34a
```
The rebuild process loads only what's relevant to the current stage, not the entire project history. This keeps the fresh context lean.
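The stage dispatch in the diagram can be sketched as follows. The stage keys, file paths, and the dependency-free `stage:` line parsing are assumptions for illustration; a real implementation would parse `pipeline.yaml` properly (e.g. with `yaml.safe_load`):

```python
from pathlib import Path

# Per-stage file sets, mirroring the diagram above (illustrative).
STAGE_FILES = {
    "ideation": ["ideas/selected.yaml"],
    "training": ["experiments/exp-*/config.yaml"],  # plus latest log tail
    "writing":  ["papers/outline.md"],              # plus section status
}

def current_stage(root: str = ".omc/research") -> str:
    """Read the current stage from pipeline.yaml (naive line scan)."""
    for line in (Path(root) / "pipeline.yaml").read_text().splitlines():
        if line.startswith("stage:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("pipeline.yaml has no 'stage:' field")

def rebuild_context(root: str = ".omc/research") -> list[str]:
    """Return only the files relevant to the current stage."""
    return STAGE_FILES[current_stage(root)]
```

Because the lookup is keyed by stage, a session resumed during writing never pays the context cost of old ideation or training artifacts.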
How to restart
Just start a new Claude Code session in the same project directory. The Orchestrator reads .omc/research/pipeline.yaml and picks up where it left off. No special recovery command needed.
Sub-Agent Context Isolation
Each agent runs in its own context, isolated from others. This is a feature, not a limitation.
| Agent | Context Contains | Context Does NOT Contain |
|---|---|---|
| Orchestrator | Current stage, task summaries, pipeline state | Full training logs, full paper text |
| Planner | Research question, constraints, selected idea | Code, training results |
| Writer | Paper outline, current section, curated references | Code, raw experiment logs, orchestration history |
| Scout | Search query, relevance criteria | Code, experiment details |
| Coder | Design spec, current task, error messages | Paper text, literature notes |
| Judge | Artifact to evaluate, evaluation criteria | Creation history, agent conversations |
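One way to enforce the table above is a per-agent allowlist of file patterns. The agent names and globs below are assumptions mirroring the table, not AutoResearch's real configuration:

```python
from fnmatch import fnmatch

# Illustrative allowlists: each agent's context is assembled only from
# files matching its patterns (keys and globs are assumptions).
AGENT_CONTEXT = {
    "orchestrator": ["pipeline.yaml", "tasks/*.summary"],
    "writer":       ["papers/outline.md", "experiments/summary.yaml"],
    "judge":        ["artifacts/*", "criteria.md"],
}

def visible_to(agent: str, path: str) -> bool:
    """True if this file may enter the agent's context."""
    return any(fnmatch(path, pat) for pat in AGENT_CONTEXT.get(agent, []))
```

With this filter, a raw training log simply never matches the Writer's or Judge's patterns, so contamination is impossible by construction rather than by convention.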
Why isolation matters
Context isolation prevents information contamination. The Judge doesn't know what the Coder struggled with — it evaluates the code on its merits. The Writer doesn't see messy experiment debugging — it gets clean, curated results.
Writer's Clean Context
The Writer agent deserves special attention because paper quality depends on it receiving the right information in the right form.
What the Writer Receives
- Paper outline with section assignments
- Curated experiment results (cleaned, formatted)
- Selected related work summaries (pre-digested by Scout)
- Figure descriptions and data
- Style guidelines and venue requirements
What the Writer Never Receives
- Raw training logs
- Debugging transcripts
- Failed experiment details (unless relevant to discussion)
- Orchestrator decision history
- Other agents' internal reasoning
```
# The Orchestrator curates context for the Writer:
Writer, please draft Section 4.2 (Ablation Study).

Context:
- Outline: papers/outline.md (Section 4.2 spec)
- Results: experiments/summary.yaml (ablation rows only)
- Baseline comparison: experiments/exp-001/analysis.md

Do NOT include information about implementation difficulties
or failed configurations unless they inform the analysis.
```
One session per major section
The Writer starts a clean session for each major section. This prevents context from filling up mid-paper and ensures each section gets maximum context budget for quality writing.
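A sketch of that loop, with a hypothetical `spawn_writer` callable standing in for launching a fresh Writer session (section names would come from parsing papers/outline.md):

```python
def draft_paper(sections: list[str], spawn_writer) -> dict[str, str]:
    """Draft each major section in its own fresh Writer session."""
    drafts = {}
    for section in sections:
        # Each call starts from an empty context: the full context
        # budget goes to this one section, and nothing carries over.
        drafts[section] = spawn_writer(section)
    return drafts
```

The orchestration cost is one extra session launch per section; the payoff is that Section 7 is written with the same context headroom as Section 1.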
Strategies Summary
| Strategy | Purpose | Mechanism |
|---|---|---|
| One-line summaries | Prevent context bloat | Agents return minimal info to Orchestrator |
| Structured disk output | Enable recovery | All details persisted as YAML/Markdown |
| Stage-aware rebuild | Fast context init | New sessions load only current-stage data |
| Context isolation | Prevent contamination | Each agent sees only what it needs |
| Clean Writer sessions | Maximize writing quality | Fresh context per section |
Next
- Monitoring — how training is watched without filling context
- Research State — the disk structures that enable all of this