
Pipeline Overview

The AutoResearch pipeline is a seven-stage state machine that drives a research project from initial idea to finished paper. Each stage has defined inputs, outputs, active agents, and a gate that controls how the transition to the next stage is handled.

Stage State Machine

mermaid
stateDiagram-v2
    [*] --> Ideation
    Ideation --> Design: Gate passes
    Design --> Implementation: Gate passes
    Implementation --> Training: Gate passes
    Training --> Analysis: Training complete
    Analysis --> Writing: Gate passes
    Writing --> Review: Gate passes
    Review --> Writing: Revisions needed
    Review --> [*]: Accepted

    note right of Ideation: Scout + Judge
    note right of Design: Planner
    note right of Implementation: Coder + Judge
    note right of Training: Coder + Monitoring
    note right of Analysis: Orchestrator + Judge
    note right of Writing: Writer + Scout
    note right of Review: Judge (3-model panel)

Stages are sequential with one exception

The pipeline flows forward through seven stages. The only backward transition is Review → Writing, taken when revisions are needed. Nothing loops back to earlier stages automatically: if fundamental issues are found, the Orchestrator (or a human) decides how to handle them.
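The state diagram can be read as a lookup table from (stage, event) to next stage. The sketch below is only an illustration of that reading, not the project's implementation: the `Stage` enum, the event strings (`"gate_passes"`, `"training_complete"`, and so on), and `next_stage` are hypothetical names derived from the diagram's labels.

```python
from enum import Enum


class Stage(Enum):
    IDEATION = "ideation"
    DESIGN = "design"
    IMPLEMENTATION = "implementation"
    TRAINING = "training"
    ANALYSIS = "analysis"
    WRITING = "writing"
    REVIEW = "review"


# Hypothetical event names mirroring the edge labels in the diagram.
# A target of None stands for the terminal state ([*]).
TRANSITIONS = {
    (Stage.IDEATION, "gate_passes"): Stage.DESIGN,
    (Stage.DESIGN, "gate_passes"): Stage.IMPLEMENTATION,
    (Stage.IMPLEMENTATION, "gate_passes"): Stage.TRAINING,
    (Stage.TRAINING, "training_complete"): Stage.ANALYSIS,
    (Stage.ANALYSIS, "gate_passes"): Stage.WRITING,
    (Stage.WRITING, "gate_passes"): Stage.REVIEW,
    (Stage.REVIEW, "revisions_needed"): Stage.WRITING,  # the only backward edge
    (Stage.REVIEW, "accepted"): None,
}


def next_stage(stage, event):
    """Return the stage that follows `stage` on `event`, or None when finished."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(
            f"no transition from {stage.value!r} on {event!r}"
        ) from None


# Collect every edge that moves to an earlier stage in declaration order.
_order = list(Stage)
BACKWARD_EDGES = [
    (src, event)
    for (src, event), dst in TRANSITIONS.items()
    if dst is not None and _order.index(dst) < _order.index(src)
]
```

Keeping the edges in one dictionary makes the "one exception" easy to check: `BACKWARD_EDGES` contains only the Review → Writing edge.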

pipeline.yaml

The pipeline state is tracked in a single YAML file:

yaml
# .omc/research/pipeline.yaml
project: "flash-recurrent-attention"
current_stage: "training"
started_at: "2025-06-01T09:00:00Z"

stages:
  ideation:
    status: complete
    gate: human
    started: "2025-06-01T09:00:00Z"
    completed: "2025-06-01T14:30:00Z"
    
  design:
    status: complete
    gate: human
    started: "2025-06-01T14:30:00Z"
    completed: "2025-06-02T11:00:00Z"
    
  implementation:
    status: complete
    gate: auto-judge
    started: "2025-06-02T11:00:00Z"
    completed: "2025-06-03T16:00:00Z"
    
  training:
    status: active
    gate: auto
    started: "2025-06-03T16:00:00Z"
    completed: null
    
  analysis:
    status: pending
    gate: auto-judge
    
  writing:
    status: pending
    gate: human
    
  review:
    status: pending
    gate: human

history:
  - event: "stage_advance"
    from: "implementation"
    to: "training"
    timestamp: "2025-06-03T16:00:00Z"
    trigger: "auto-judge:pass"
    details: "Judge verdict: PASS. Code quality 8/10, tests all pass."
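To make the file's structure concrete, here is a minimal sketch of how a stage advance might update this state, assuming the YAML has already been loaded into a plain dictionary. The `advance` function and `STAGE_ORDER` list are hypothetical helpers, not part of AutoResearch; the field names and the example event are taken from the file above.

```python
from datetime import datetime, timezone

STAGE_ORDER = [
    "ideation", "design", "implementation", "training",
    "analysis", "writing", "review",
]


def advance(state, trigger, details, now=None):
    """Complete the current stage, activate the next one, and log the event.

    `state` is a dict shaped like pipeline.yaml. `now` is an ISO-8601 UTC
    timestamp string; it defaults to the current time.
    """
    if now is None:
        now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    current = state["current_stage"]
    idx = STAGE_ORDER.index(current)
    if idx + 1 >= len(STAGE_ORDER):
        raise ValueError("review is the last stage; nothing to advance to")
    nxt = STAGE_ORDER[idx + 1]

    state["stages"][current].update(status="complete", completed=now)
    state["stages"][nxt].update(status="active", started=now, completed=None)
    state["current_stage"] = nxt
    state.setdefault("history", []).append({
        "event": "stage_advance",
        "from": current,
        "to": nxt,
        "timestamp": now,
        "trigger": trigger,
        "details": details,
    })
    return state


# Reproduce the implementation -> training event recorded in the history above.
state = {
    "project": "flash-recurrent-attention",
    "current_stage": "implementation",
    "stages": {
        "implementation": {
            "status": "active",
            "gate": "auto-judge",
            "started": "2025-06-02T11:00:00Z",
            "completed": None,
        },
        "training": {"status": "pending", "gate": "auto"},
    },
    "history": [],
}
advance(
    state,
    trigger="auto-judge:pass",
    details="Judge verdict: PASS. Code quality 8/10, tests all pass.",
    now="2025-06-03T16:00:00Z",
)
```

This sketch covers forward advances only; the Review → Writing revision loop would need its own event type, which the source does not specify.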

Three Gate Types

Gates control the transition between stages. Each stage has its own gate type.

| Gate Type | Symbol | Behavior | Context Cost |
|---|---|---|---|
| human | 🛑 | Pauses pipeline, presents summary, waits for human approval | Zero (waits) |
| auto-judge | 🤖 | Dispatches Judge to evaluate, proceeds if PASS | Medium (Judge invocation) |
| auto | 🚀 | Proceeds immediately without review | Zero |

mermaid
graph LR
    S1[Stage Complete] --> G{Gate Type?}
    G -->|human| H[Wait for Human]
    G -->|auto-judge| J[Judge Evaluates]
    G -->|auto| A[Auto-Advance]
    
    H -->|approved| N[Next Stage]
    J -->|PASS| N
    J -->|REVISE| R[Retry Stage]
    J -->|FAIL| E[Escalate to Human]
    A --> N

    style H fill:#fee2e2,stroke:#dc2626
    style J fill:#fef3c7,stroke:#d97706
    style A fill:#dcfce7,stroke:#16a34a
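The gate flowchart above amounts to a small decision function. The following is a sketch under stated assumptions: `resolve_gate` and the action names (`"advance"`, `"wait"`, `"retry"`, `"escalate"`) are hypothetical, and treating an unapproved human gate as `"wait"` is a guess, since the flowchart only shows the approved branch.

```python
# Judge verdicts and the action each one leads to in the flowchart.
JUDGE_ACTIONS = {
    "PASS": "advance",     # proceed to the next stage
    "REVISE": "retry",     # retry the current stage
    "FAIL": "escalate",    # escalate to a human
}


def resolve_gate(gate, human_approved=None, verdict=None):
    """Map a gate type plus its input to the action the pipeline takes."""
    if gate == "auto":
        return "advance"
    if gate == "human":
        # Assumption: anything other than explicit approval keeps waiting.
        return "advance" if human_approved else "wait"
    if gate == "auto-judge":
        if verdict not in JUDGE_ACTIONS:
            raise ValueError(f"unknown Judge verdict: {verdict!r}")
        return JUDGE_ACTIONS[verdict]
    raise ValueError(f"unknown gate type: {gate!r}")
```

For example, `resolve_gate("auto-judge", verdict="REVISE")` returns `"retry"`, while `resolve_gate("human")` returns `"wait"` until approval arrives.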

Start restrictive, open gradually

Begin with all gates on human. As you build confidence in the system's judgment, move lower-risk stages to auto-judge, then auto. Ideation should almost always stay on human — your research direction is too important to automate.

Three Orthogonal Modes

Modes control execution behavior. They are independent axes that can be combined.

| Mode | Axis | What It Controls |
|---|---|---|
| autopilot | Between stages | Automatically advances through stages (respecting gates) |
| ralph | Within a stage | Tight error-fix-retry loops without human intervention |
| ultrawork | Across agents | Dispatches multiple independent tasks in parallel |

Combinations

| autopilot | ralph | ultrawork | Behavior |
|---|---|---|---|
| off | off | off | Fully manual — Orchestrator asks before every action |
| on | off | off | Auto-advances stages, stops on errors |
| off | on | off | Manual stage control, auto-fixes errors |
| off | off | on | Manual control, parallel execution |
| on | on | off | Auto-advances + auto-fixes |
| on | off | on | Auto-advances + parallel execution |
| off | on | on | Manual control + auto-fixes + parallel |
| on | on | on | Maximum automation |
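Because the three modes are independent on/off switches, the eight rows of the combinations table are simply every setting of three booleans. A minimal sketch of that idea follows; the `Modes` class and its `describe` wording are hypothetical and only approximate the table's descriptions.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Modes:
    autopilot: bool = False   # between stages
    ralph: bool = False       # within a stage
    ultrawork: bool = False   # across agents

    def describe(self):
        """Compose a short description from the enabled modes."""
        parts = [
            "auto-advances stages" if self.autopilot else "manual stage control"
        ]
        if self.ralph:
            parts.append("auto-fixes errors")
        if self.ultrawork:
            parts.append("parallel execution")
        return " + ".join(parts)


# Every combination of the three independent switches: 2 ** 3 = 8.
ALL_COMBINATIONS = [
    Modes(*flags) for flags in product([False, True], repeat=3)
]
```

Each mode contributes its own clause to the description without affecting the others, which is what makes the axes orthogonal.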

Maximum automation is not always best

autopilot + ralph + ultrawork is powerful but consumes resources aggressively and may miss subtle issues that benefit from human review. Use it for well-understood tasks (re-running baselines, standard training) and keep human gates for novel work.

Stage Summary

| # | Stage | Primary Agents | Key Output | Typical Gate |
|---|---|---|---|---|
| 1 | Ideation | Scout, Judge | Selected idea | human |
| 2 | Design | Planner | Experiment plan | human |
| 3 | Implementation | Coder, Judge | Working code | auto-judge |
| 4 | Training | Coder, Monitoring | Trained model | auto |
| 5 | Analysis | Orchestrator, Judge | Result interpretation | auto-judge |
| 6 | Writing | Writer, Scout | Paper draft | human |
| 7 | Review | Judge (3-model) | Review + revision | human |

Next

Explore each stage in detail:

  1. Ideation
  2. Design
  3. Implementation
  4. Training
  5. Analysis
  6. Writing
  7. Review

AutoResearch — Multi-agent Deep Learning Research System