Context Management
The Problem
Large language models have finite context windows. A deep learning research project generates enormous amounts of information — training logs, paper text, code, literature notes, experiment results. A long orchestration session will inevitably fill its context window.
When context fills up, the session cannot process new input. This is not a bug — it's an expected operating condition that AutoResearch is designed to handle gracefully.
The Solution: Don't Rely on Context for State
AutoResearch's approach is simple and robust:
Context is working memory. Disk is long-term memory. Never store critical state only in context.
Every decision, every result, every plan is written to .omc/research/ on disk. Context holds only what's needed for the current task plus enough background to make good decisions.
Prevention: Compact Agent Output
Agents are designed to return minimal information to the Orchestrator's context.
One-Line Summaries
When an agent finishes a task, it returns a one-line summary to the Orchestrator — not the full output.
```
# What Coder returns to Orchestrator context:
"Implementation complete. 3 files modified, all tests pass. Details in experiments/exp-001/config.yaml"

# NOT this:
"I implemented the linear attention mechanism by first creating a new module in models/attention.py
that uses the RetNet recurrence formulation. The key insight was to decompose the attention matrix
into... [500 lines of reasoning]"
```
The full output goes to disk. The Orchestrator gets just enough to decide what to do next.
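A minimal sketch of this pattern in Python. The `finish_task` helper and the `experiments/<task_id>/output.md` layout are illustrative assumptions, not AutoResearch's actual API:

```python
from pathlib import Path

def finish_task(task_id: str, full_output: str, summary: str,
                root: str = ".omc/research") -> str:
    """Persist the agent's full output to disk; return only a one-line
    summary for the Orchestrator's context.

    Note: `finish_task` and the output.md layout are hypothetical,
    chosen to illustrate the summary-to-context / detail-to-disk split.
    """
    out_dir = Path(root) / "experiments" / task_id
    out_dir.mkdir(parents=True, exist_ok=True)

    # The expensive, verbose part never enters the Orchestrator's context.
    (out_dir / "output.md").write_text(full_output)

    # Only this single line does, with a pointer to the details on disk.
    return f"{summary} Details in {out_dir / 'output.md'}"
```

The Orchestrator keeps only the returned string; any later session can recover the full reasoning by reading the referenced file.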
Structured Disk Output
All detailed information is written as structured files:
```yaml
# experiments/exp-001/results.yaml (on disk, not in context)
experiment: exp-001
method: linear_retnet_attention
metrics:
  perplexity: 18.7
  throughput: 12400 tok/s
  memory_peak_gb: 14.2
baselines:
  vanilla_attention:
    perplexity: 17.9
    throughput: 8200 tok/s
comparison: "+0.8 ppl, +51% throughput"
verdict: "Promising. Perplexity gap is small, throughput gain is significant."
```
Recovery: Rebuild from Disk
When a session ends (context full, crash, or intentional restart), the new session rebuilds its working context from disk.
```mermaid
graph TD
    A[New Session Starts] --> B[Read pipeline.yaml]
    B --> C{Current Stage?}
    C -->|Ideation| D[Load ideas/selected.yaml]
    C -->|Training| E[Load experiments/exp-*/config.yaml<br/>+ latest log tail]
    C -->|Writing| F[Load papers/outline.md<br/>+ current section status]
    D --> G[Resume Work]
    E --> G
    F --> G
    style A fill:#dbeafe,stroke:#2563eb
    style G fill:#dcfce7,stroke:#16a34a
```
The rebuild process loads only what's relevant to the current stage, not the entire project history. This keeps the fresh context lean.
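The stage dispatch in the diagram can be sketched as follows. The stage keys, file paths, and the dependency-free `stage:` line parsing are assumptions for illustration; a real implementation would parse `pipeline.yaml` properly (e.g. with `yaml.safe_load`):

```python
from pathlib import Path

# Per-stage file sets, mirroring the diagram above (illustrative).
STAGE_FILES = {
    "ideation": ["ideas/selected.yaml"],
    "training": ["experiments/exp-*/config.yaml"],  # plus latest log tail
    "writing":  ["papers/outline.md"],              # plus section status
}

def current_stage(root: str = ".omc/research") -> str:
    """Read the current stage from pipeline.yaml (naive line scan)."""
    for line in (Path(root) / "pipeline.yaml").read_text().splitlines():
        if line.startswith("stage:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("pipeline.yaml has no 'stage:' field")

def rebuild_context(root: str = ".omc/research") -> list[str]:
    """Return only the files relevant to the current stage."""
    return STAGE_FILES[current_stage(root)]
```

Because the lookup is keyed by stage, a session resumed during writing never pays the context cost of old ideation or training artifacts.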
How to restart
Just start a new Claude Code session in the same project directory. The Orchestrator reads .omc/research/pipeline.yaml and picks up where it left off. No special recovery command needed.
Sub-Agent Context Isolation
Each agent runs in its own context, isolated from others. This is a feature, not a limitation.
| Agent | Context Contains | Context Does NOT Contain |
|---|---|---|
| Orchestrator | Current stage, task summaries, pipeline state | Full training logs, full paper text |
| Planner | Research question, constraints, selected idea | Code, training results |
| Writer | Paper outline, current section, curated references | Code, raw experiment logs, orchestration history |
| Scout | Search query, relevance criteria | Code, experiment details |
| Coder | Design spec, current task, error messages | Paper text, literature notes |
| Judge | Artifact to evaluate, evaluation criteria | Creation history, agent conversations |
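One way to enforce the table above is a per-agent allowlist of file patterns. The agent names and globs below are assumptions mirroring the table, not AutoResearch's real configuration:

```python
from fnmatch import fnmatch

# Illustrative allowlists: each agent's context is assembled only from
# files matching its patterns (keys and globs are assumptions).
AGENT_CONTEXT = {
    "orchestrator": ["pipeline.yaml", "tasks/*.summary"],
    "writer":       ["papers/outline.md", "experiments/summary.yaml"],
    "judge":        ["artifacts/*", "criteria.md"],
}

def visible_to(agent: str, path: str) -> bool:
    """True if this file may enter the agent's context."""
    return any(fnmatch(path, pat) for pat in AGENT_CONTEXT.get(agent, []))
```

With this filter, a raw training log simply never matches the Writer's or Judge's patterns, so contamination is impossible by construction rather than by convention.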
Why isolation matters
Context isolation prevents information contamination. The Judge doesn't know what the Coder struggled with — it evaluates the code on its merits. The Writer doesn't see messy experiment debugging — it gets clean, curated results.
Writer's Clean Context
The Writer agent deserves special attention because paper quality depends on it receiving the right information in the right form.
What the Writer Receives
- Paper outline with section assignments
- Curated experiment results (cleaned, formatted)
- Selected related work summaries (pre-digested by Scout)
- Figure descriptions and data
- Style guidelines and venue requirements
What the Writer Never Receives
- Raw training logs
- Debugging transcripts
- Failed experiment details (unless relevant to discussion)
- Orchestrator decision history
- Other agents' internal reasoning
```
# The Orchestrator curates context for the Writer:
Writer, please draft Section 4.2 (Ablation Study).

Context:
- Outline: papers/outline.md (Section 4.2 spec)
- Results: experiments/summary.yaml (ablation rows only)
- Baseline comparison: experiments/exp-001/analysis.md

Do NOT include information about implementation difficulties
or failed configurations unless they inform the analysis.
```
One session per major section
The Writer starts a clean session for each major section. This prevents context from filling up mid-paper and ensures each section gets maximum context budget for quality writing.
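A sketch of that loop, with a hypothetical `spawn_writer` callable standing in for launching a fresh Writer session (section names would come from parsing papers/outline.md):

```python
def draft_paper(sections: list[str], spawn_writer) -> dict[str, str]:
    """Draft each major section in its own fresh Writer session."""
    drafts = {}
    for section in sections:
        # Each call starts from an empty context: the full context
        # budget goes to this one section, and nothing carries over.
        drafts[section] = spawn_writer(section)
    return drafts
```

The orchestration cost is one extra session launch per section; the payoff is that Section 7 is written with the same context headroom as Section 1.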
Strategies Summary
| Strategy | Purpose | Mechanism |
|---|---|---|
| One-line summaries | Prevent context bloat | Agents return minimal info to Orchestrator |
| Structured disk output | Enable recovery | All details persisted as YAML/Markdown |
| Stage-aware rebuild | Fast context init | New sessions load only current-stage data |
| Context isolation | Prevent contamination | Each agent sees only what it needs |
| Clean Writer sessions | Maximize writing quality | Fresh context per section |
Next
- Monitoring — how training is watched without filling context
- Research State — the disk structures that enable all of this