Context Management

The Problem

Large language models have finite context windows. A deep learning research project generates enormous amounts of information — training logs, paper text, code, literature notes, experiment results. A long orchestration session will inevitably fill its context window.

When context fills up, the session cannot process new input. This is not a bug — it's an expected operating condition that AutoResearch is designed to handle gracefully.

The Solution: Don't Rely on Context for State

AutoResearch's approach is simple and robust:

Context is working memory. Disk is long-term memory. Never store critical state only in context.

Every decision, every result, every plan is written to .omc/research/ on disk. Context holds only what's needed for the current task plus enough background to make good decisions.

Prevention: Compact Agent Output

Agents are designed to return minimal information to the Orchestrator's context.

One-Line Summaries

When an agent finishes a task, it returns a one-line summary to the Orchestrator — not the full output.

```text
# What Coder returns to Orchestrator context:
"Implementation complete. 3 files modified, all tests pass. Details in experiments/exp-001/config.yaml"

# NOT this:
"I implemented the linear attention mechanism by first creating a new module in models/attention.py
that uses the RetNet recurrence formulation. The key insight was to decompose the attention matrix
into... [500 lines of reasoning]"
```

The full output goes to disk. The Orchestrator gets just enough to decide what to do next.
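In code, this pattern might look like the following minimal sketch (the `finish_task` helper and its signature are hypothetical illustrations, not part of AutoResearch):

```python
from pathlib import Path

def finish_task(full_output: str, summary: str, artifact_path: str) -> str:
    """Write the agent's full output to disk; return only a one-liner."""
    path = Path(artifact_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(full_output)
    # Only this short string ever enters the Orchestrator's context.
    return f"{summary} Details in {artifact_path}"
```

The long reasoning trace never touches the Orchestrator; it can always be re-read from disk if needed.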

Structured Disk Output

All detailed information is written as structured files:

```yaml
# experiments/exp-001/results.yaml (on disk, not in context)
experiment: exp-001
method: linear_retnet_attention
metrics:
  perplexity: 18.7
  throughput: 12400 tok/s
  memory_peak_gb: 14.2
baselines:
  vanilla_attention:
    perplexity: 17.9
    throughput: 8200 tok/s
comparison: "+0.8 ppl, +51% throughput"
verdict: "Promising. Perplexity gap is small, throughput gain is significant."
```
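A results file like this gives the Orchestrator just the few numbers it needs to decide. One possible decision rule, sketched below with hypothetical thresholds (the `verdict` helper is an illustration, not AutoResearch's actual logic):

```python
def verdict(exp_ppl: float, base_ppl: float, exp_tput: float, base_tput: float,
            max_ppl_gap: float = 1.5, min_speedup: float = 1.2) -> str:
    """Accept a method if its quality cost is small and its speedup is large."""
    ppl_gap = exp_ppl - base_ppl            # lower perplexity is better
    speedup = exp_tput / base_tput          # >1.0 means faster than baseline
    status = "promising" if ppl_gap <= max_ppl_gap and speedup >= min_speedup else "rejected"
    return f"{status}: {ppl_gap:+.1f} ppl, {(speedup - 1) * 100:+.0f}% throughput"
```

Fed the exp-001 numbers above, this reproduces the one-line comparison string that goes back into context.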

Recovery: Rebuild from Disk

When a session ends (context full, crash, or intentional restart), the new session rebuilds its working context from disk.

```mermaid
graph TD
    A[New Session Starts] --> B[Read pipeline.yaml]
    B --> C{Current Stage?}
    C -->|Ideation| D[Load ideas/selected.yaml]
    C -->|Training| E[Load experiments/exp-*/config.yaml<br/>+ latest log tail]
    C -->|Writing| F[Load papers/outline.md<br/>+ current section status]
    D --> G[Resume Work]
    E --> G
    F --> G

    style A fill:#dbeafe,stroke:#2563eb
    style G fill:#dcfce7,stroke:#16a34a
```

The rebuild process loads only what's relevant to the current stage — not the entire project history. This keeps the fresh context lean.
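A stage-aware rebuild might be sketched as follows (the `STAGE_GLOBS` map and `rebuild_context` helper are hypothetical, loosely mirroring the diagram above):

```python
from pathlib import Path

# Hypothetical stage -> file-pattern map; a real map would track pipeline.yaml.
STAGE_GLOBS = {
    "ideation": ["ideas/selected.yaml"],
    "training": ["experiments/exp-*/config.yaml"],
    "writing": ["papers/outline.md"],
}

def rebuild_context(root: str, stage: str) -> dict:
    """Load only the files relevant to the current stage into fresh context."""
    base = Path(root)
    ctx = {}
    for pattern in STAGE_GLOBS.get(stage, []):
        for path in sorted(base.glob(pattern)):
            ctx[str(path.relative_to(base))] = path.read_text()
    return ctx
```

Files for other stages stay on disk, so the fresh session starts with the smallest context that lets it resume.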

How to restart

Just start a new Claude Code session in the same project directory. The Orchestrator reads .omc/research/pipeline.yaml and picks up where it left off. No special recovery command needed.

Sub-Agent Context Isolation

Each agent runs in its own context, isolated from others. This is a feature, not a limitation.

| Agent | Context Contains | Context Does NOT Contain |
|---|---|---|
| Orchestrator | Current stage, task summaries, pipeline state | Full training logs, full paper text |
| Planner | Research question, constraints, selected idea | Code, training results |
| Writer | Paper outline, current section, curated references | Code, raw experiment logs, orchestration history |
| Scout | Search query, relevance criteria | Code, experiment details |
| Coder | Design spec, current task, error messages | Paper text, literature notes |
| Judge | Artifact to evaluate, evaluation criteria | Creation history, agent conversations |

Why isolation matters

Context isolation prevents information contamination. The Judge doesn't know what the Coder struggled with — it evaluates the code on its merits. The Writer doesn't see messy experiment debugging — it gets clean, curated results.
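One simple way to enforce this is a per-agent whitelist over shared state. The sketch below is illustrative only (the `AGENT_VIEW` map and `build_context` helper are assumptions, not AutoResearch internals):

```python
# Hypothetical per-agent whitelist: each agent sees only its own keys.
AGENT_VIEW = {
    "judge": {"artifact", "criteria"},
    "writer": {"outline", "results_summary", "references"},
    "coder": {"design_spec", "task", "errors"},
}

def build_context(agent: str, project_state: dict) -> dict:
    """Filter the shared project state down to what this agent may see."""
    allowed = AGENT_VIEW.get(agent, set())
    return {k: v for k, v in project_state.items() if k in allowed}
```

Because filtering happens before an agent is invoked, contamination such as debugging history leaking into the Judge's evaluation is ruled out by construction.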

Writer's Clean Context

The Writer agent deserves special attention because paper quality depends on it receiving the right information in the right form.

What the Writer Receives

  • Paper outline with section assignments
  • Curated experiment results (cleaned, formatted)
  • Selected related work summaries (pre-digested by Scout)
  • Figure descriptions and data
  • Style guidelines and venue requirements

What the Writer Never Receives

  • Raw training logs
  • Debugging transcripts
  • Failed experiment details (unless relevant to discussion)
  • Orchestrator decision history
  • Other agents' internal reasoning

```text
# The Orchestrator curates context for the Writer:

Writer, please draft Section 4.2 (Ablation Study).

Context:
- Outline: papers/outline.md (Section 4.2 spec)
- Results: experiments/summary.yaml (ablation rows only)
- Baseline comparison: experiments/exp-001/analysis.md

Do NOT include information about implementation difficulties
or failed configurations unless they inform the analysis.
```

One session per major section

The Writer starts a clean session for each major section. This prevents context from filling up mid-paper and ensures each section gets maximum context budget for quality writing.

Strategies Summary

| Strategy | Purpose | Mechanism |
|---|---|---|
| One-line summaries | Prevent context bloat | Agents return minimal info to Orchestrator |
| Structured disk output | Enable recovery | All details persisted as YAML/Markdown |
| Stage-aware rebuild | Fast context init | New sessions load only current-stage data |
| Context isolation | Prevent contamination | Each agent sees only what it needs |
| Clean Writer sessions | Maximize writing quality | Fresh context per section |

Next

  • Monitoring — how training is watched without filling context
  • Research State — the disk structures that enable all of this

AutoResearch — Multi-agent Deep Learning Research System