
Planner

The Planner takes a research direction and turns it into a concrete, actionable experiment design. It is a thinking agent — it reasons about experimental methodology but never writes code or runs experiments.

Identity

| Property | Value |
| --- | --- |
| LLM | Claude Opus |
| Invocation | Sub-agent (spawned by Orchestrator) |
| Lifecycle | Per-task: created for a planning task, terminated when done |
| Context | Inherits partial context from Orchestrator |

When Invoked

The Planner is called at specific points in the pipeline:

| Stage | Trigger | Planning Task |
| --- | --- | --- |
| Design | Idea selected | Full experiment plan: baselines, ablations, metrics |
| Implementation | Plan approved | Task decomposition for Coder |
| Analysis | Results ready | Analysis strategy: what comparisons, what visualizations |
| Writing | Analysis approved | Paper outline and section structure |
| Review | Reviews received | Revision plan: what to address, what to rebut |

The Planner is invoked multiple times

It's not a one-shot agent. Each pipeline stage may invoke the Planner to decompose the next phase of work. The Planner adapts its output format to the task.

What It Receives Per Task

The Orchestrator curates the context for each Planner invocation. Three of the five stages are shown here:

Design Stage

```yaml
inputs:
  - selected_idea: ideas/selected.yaml
  - related_work: papers/related_work/summaries.yaml
  - constraints: infrastructure.yaml  # GPU count, memory, time budget
  - venue_target: "ICML 2025"
```

Implementation Stage

```yaml
inputs:
  - experiment_plan: design/plan.md
  - baselines: design/baselines.yaml
  - codebase_summary: "PyTorch, 3 existing modules, ~2k LOC"
```

Analysis Stage

```yaml
inputs:
  - experiment_results: experiments/summary.yaml
  - original_hypotheses: design/plan.md  # hypothesis section
  - metrics: design/metrics.yaml
```

What It Outputs

All output goes to disk in .omc/research/. The Orchestrator receives a one-line summary.

| Output | File | Contents |
| --- | --- | --- |
| Experiment plan | design/plan.md | Hypotheses, method description, experiment structure |
| Baselines | design/baselines.yaml | Name, paper, code link, expected performance |
| Ablations | design/ablations.yaml | What to ablate, expected impact, priority |
| Metrics | design/metrics.yaml | Primary/secondary metrics, statistical tests |
| Task list | design/tasks.yaml | Numbered tasks for Coder, with dependencies |
| Paper outline | papers/outline.md | Section structure, target lengths, key claims |
| Revision plan | reviews/revision_plan.md | Review-by-review response strategy |
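
As a rough illustration of the task list output, design/tasks.yaml might look something like the sketch below. The field names (id, title, depends_on) and the task titles are assumptions made for this example; this page only states that tasks are numbered and carry dependencies.

```yaml
# design/tasks.yaml
# Illustrative sketch only: field names and task titles are hypothetical.
tasks:
  - id: 1
    title: "Set up data loading and tokenization"
    depends_on: []

  - id: 2
    title: "Implement the baselines listed in design/baselines.yaml"
    depends_on: [1]

  - id: 3
    title: "Implement the proposed method"
    depends_on: [1]

  - id: 4
    title: "Run the comparison defined in design/plan.md"
    depends_on: [2, 3]
```

Listing dependencies explicitly would let the Coder, or the Orchestrator on its behalf, see which tasks can proceed independently (here 2 and 3) and which must wait.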

Output is structured, not prose

The Planner writes structured YAML and focused Markdown — not open-ended essays. This makes outputs machine-readable for other agents and scannable for humans.

Example: Baselines Output

```yaml
# design/baselines.yaml
baselines:
  - name: "Vanilla Transformer"
    paper: "Vaswani et al., 2017"
    code: "https://github.com/..."
    expected_ppl: 18.0
    purpose: "Standard attention baseline"

  - name: "Linear Transformer"
    paper: "Katharopoulos et al., 2020"
    code: "https://github.com/..."
    expected_ppl: 20.5
    purpose: "Linear attention baseline; our method should match or beat it"

  - name: "RetNet"
    paper: "Sun et al., 2023"
    code: "https://github.com/..."
    expected_ppl: 18.5
    purpose: "Recurrent baseline; our method combines this with flash attention"
```
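
By analogy with the baselines file, design/ablations.yaml could take a shape like the following. This is a sketch only: the field names and both entries are hypothetical, chosen to mirror the three properties listed for that file (what to ablate, expected impact, priority). The expected impacts are placeholders, not results.

```yaml
# design/ablations.yaml
# Illustrative sketch only: field names and entries are hypothetical.
ablations:
  - name: "No flash attention"
    ablate: "Replace flash attention with a standard attention implementation"
    expected_impact: "Placeholder: state the expected change in the primary metric"
    priority: high

  - name: "No recurrent component"
    ablate: "Remove the recurrent path and keep attention only"
    expected_impact: "Placeholder: state the expected change in the primary metric"
    priority: medium
```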

What It Does NOT Do

Clear boundaries

The Planner has strict boundaries. Violating these is an architectural error caught by the omc-orchestrator hook.

| The Planner Does NOT | Why |
| --- | --- |
| Write code | That's the Coder's job |
| Run experiments | That's the Coder's job |
| Evaluate results | That's the Judge's job |
| Search literature | That's the Scout's job |
| Write paper text | That's the Writer's job |
| Make strategic decisions | That's the Orchestrator's job |
| Decide which experiment to prioritize | That's the Orchestrator's job |

The Planner designs. It says "we should run experiment X with config Y and expect result Z." It never executes that design.

What if the Planner needs literature?

If the Planner needs information about a baseline paper, it flags this to the Orchestrator in its output: "needs_info: [baseline paper X details]". The Orchestrator then dispatches the Scout to fetch the information and re-invokes the Planner with the additional context. The Planner never calls the Scout itself.
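
One way such a flag might appear in a Planner output file is sketched below. Only the needs_info key comes from the text above; the status and reason fields are assumptions added for illustration.

```yaml
# Hypothetical excerpt from a Planner output file.
# Only `needs_info` is named in this document; `status` and `reason` are assumed.
status: incomplete
needs_info:
  - "baseline paper X details"
reason: "One baseline entry cannot be completed without details from the paper"
```

Keeping the request as data rather than prose would let the Orchestrator detect it mechanically before deciding whether to dispatch the Scout.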

Next

  • Writer — the other Claude Opus agent
  • Coder — who implements the Planner's designs
  • Design Stage — where the Planner does most of its work

AutoResearch — Multi-agent Deep Learning Research System