Pipeline Overview

The AutoResearch pipeline is a seven-stage state machine that drives a research project from initial idea to finished paper. Each stage has defined inputs, outputs, active agents, and a gate that determines how the pipeline moves on to the next stage.

Stage State Machine

mermaid

stateDiagram-v2
    [*] --> Ideation
    Ideation --> Design: Gate passes
    Design --> Implementation: Gate passes
    Implementation --> Training: Gate passes
    Training --> Analysis: Training complete
    Analysis --> Writing: Gate passes
    Writing --> Review: Gate passes
    Review --> Writing: Revisions needed
    Review --> [*]: Accepted

    note right of Ideation: Scout + Judge
    note right of Design: Planner
    note right of Implementation: Coder + Judge
    note right of Training: Coder + Monitoring
    note right of Analysis: Orchestrator + Judge
    note right of Writing: Writer + Scout
    note right of Review: Judge (3-model panel)

Stages are sequential with one exception

The pipeline flows forward through seven stages. The only backward transition is Review → Writing when revisions are needed. No earlier stage is re-entered automatically: if fundamental issues are found, the Orchestrator (or a human) decides how to handle them.
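The transition rules above can be sketched as a small lookup. This is an illustrative sketch only, not the project's implementation: the function name `allowed_next` and the terminal marker `"done"` (standing in for the diagram's final `[*]` state) are assumptions.

```python
# Hypothetical transition table mirroring the state diagram.
# Stage names follow the keys used in pipeline.yaml.
STAGES = [
    "ideation", "design", "implementation", "training",
    "analysis", "writing", "review",
]


def allowed_next(stage: str) -> list:
    """Return the states reachable from `stage` in one transition."""
    index = STAGES.index(stage)  # raises ValueError for an unknown stage
    if stage == "review":
        # Review is the only stage with two exits: back to Writing
        # (revisions needed) or out of the pipeline (accepted).
        return ["writing", "done"]
    return [STAGES[index + 1]]
```

Under these assumptions, `allowed_next("training")` yields only `"analysis"`, and Review is the only stage whose result contains an earlier stage.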

pipeline.yaml

The pipeline state is tracked in a single YAML file:

yaml

# .omc/research/pipeline.yaml
project: "flash-recurrent-attention"
current_stage: "training"
started_at: "2025-06-01T09:00:00Z"

stages:
  ideation:
    status: complete
    gate: human
    started: "2025-06-01T09:00:00Z"
    completed: "2025-06-01T14:30:00Z"
    
  design:
    status: complete
    gate: human
    started: "2025-06-01T14:30:00Z"
    completed: "2025-06-02T11:00:00Z"
    
  implementation:
    status: complete
    gate: auto-judge
    started: "2025-06-02T11:00:00Z"
    completed: "2025-06-03T16:00:00Z"
    
  training:
    status: active
    gate: auto
    started: "2025-06-03T16:00:00Z"
    completed: null
    
  analysis:
    status: pending
    gate: auto-judge
    
  writing:
    status: pending
    gate: human
    
  review:
    status: pending
    gate: human

history:
  - event: "stage_advance"
    from: "implementation"
    to: "training"
    timestamp: "2025-06-03T16:00:00Z"
    trigger: "auto-judge:pass"
    details: "Judge verdict: PASS. Code quality 8/10, tests all pass."
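As a rough illustration of how a stage advance might be recorded, the sketch below updates an in-memory dictionary shaped like the YAML above. The helper name `advance_stage` and its parameters are hypothetical, and reading or writing the file itself is left out. Only the field names come from the example file.

```python
from datetime import datetime, timezone
from typing import Optional


def advance_stage(state: dict, to_stage: str, trigger: str,
                  details: str = "", now: Optional[str] = None) -> dict:
    """Close the current stage, activate `to_stage`, and append a history event."""
    # Use the same UTC timestamp format as the example file.
    now = now or datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    from_stage = state["current_stage"]

    state["stages"][from_stage].update(status="complete", completed=now)
    state["stages"][to_stage].update(status="active", started=now, completed=None)
    state["current_stage"] = to_stage

    # Mirror the structure of the `history` entry shown in pipeline.yaml.
    state.setdefault("history", []).append({
        "event": "stage_advance",
        "from": from_stage,
        "to": to_stage,
        "timestamp": now,
        "trigger": trigger,
        "details": details,
    })
    return state
```

Calling `advance_stage(state, "analysis", trigger="auto")` on the state above would mark `training` complete and `analysis` active. The trigger string `"auto"` is a guess, since the source shows only the `auto-judge:pass` form.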

Three Gate Types

Gates control the transition between stages. Each stage has its own gate type.

Gate Type	Symbol	Behavior	Context Cost
human	🛑	Pauses pipeline, presents summary, waits for human approval	Zero (waits)
auto-judge	🤖	Dispatches Judge to evaluate; proceeds on PASS, retries the stage on REVISE, escalates to a human on FAIL	Medium (Judge invocation)
auto	🚀	Proceeds immediately without review	Zero

mermaid

graph LR
    S1[Stage Complete] --> G{Gate Type?}
    G -->|human| H[Wait for Human]
    G -->|auto-judge| J[Judge Evaluates]
    G -->|auto| A[Auto-Advance]
    
    H -->|approved| N[Next Stage]
    J -->|PASS| N
    J -->|REVISE| R[Retry Stage]
    J -->|FAIL| E[Escalate to Human]
    A --> N

    style H fill:#fee2e2,stroke:#dc2626
    style J fill:#fef3c7,stroke:#d97706
    style A fill:#dcfce7,stroke:#16a34a
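The gate flow in the diagram can be read as a small decision function. The sketch below is one possible reading, not the actual dispatcher. The function name `resolve_gate` and the action strings are hypothetical, and returning `"wait"` when a human has not approved is an assumption, since the diagram shows only the approved path.

```python
from typing import Optional


def resolve_gate(gate: str, verdict: Optional[str] = None,
                 approved: Optional[bool] = None) -> str:
    """Map a gate type and its input to an action, following the diagram."""
    if gate == "auto":
        # No review at all: move straight to the next stage.
        return "advance"
    if gate == "human":
        # Assumption: anything short of explicit approval keeps the pipeline paused.
        return "advance" if approved else "wait"
    if gate == "auto-judge":
        actions = {"PASS": "advance", "REVISE": "retry", "FAIL": "escalate"}
        if verdict not in actions:
            raise ValueError(f"unknown Judge verdict: {verdict!r}")
        return actions[verdict]
    raise ValueError(f"unknown gate type: {gate!r}")
```

For example, `resolve_gate("auto-judge", verdict="REVISE")` returns `"retry"`, while `resolve_gate("human")` with no approval returns `"wait"`.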

Start restrictive, open gradually

Begin with all gates on human. As you build confidence in the system's judgment, move lower-risk stages to auto-judge, then auto. Ideation should almost always stay on human — your research direction is too important to automate.

Three Orthogonal Modes

Modes control execution behavior. They are independent axes that can be combined.

Mode	Axis	What It Controls
autopilot	Between stages	Automatically advances through stages (respecting gates)
ralph	Within a stage	Tight error-fix-retry loops without human intervention
ultrawork	Across agents	Dispatches multiple independent tasks in parallel

Combinations

autopilot	ralph	ultrawork	Behavior
off	off	off	Fully manual — Orchestrator asks before every action
on	off	off	Auto-advances stages, stops on errors
off	on	off	Manual stage control, auto-fixes errors
off	off	on	Manual control, parallel execution
on	on	off	Auto-advances + auto-fixes
on	off	on	Auto-advances + parallel execution
off	on	on	Manual control + auto-fixes + parallel
on	on	on	Maximum automation
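Because the three modes are independent, they can be modeled as three booleans, which gives exactly the eight rows in the table. The sketch below assumes this representation. The `Modes` class and its `describe` method are illustrative names, not part of the tool.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Modes:
    """Hypothetical container for the three independent mode switches."""
    autopilot: bool = False   # between stages
    ralph: bool = False       # within a stage
    ultrawork: bool = False   # across agents

    def describe(self) -> str:
        parts = ["auto-advance stages" if self.autopilot else "manual stage control"]
        if self.ralph:
            parts.append("auto-fix errors")
        if self.ultrawork:
            parts.append("parallel execution")
        return " + ".join(parts)


# Every on/off combination of the three axes: 2 * 2 * 2 = 8.
ALL_COMBINATIONS = [Modes(*flags) for flags in product([False, True], repeat=3)]
```

Keeping the switches as separate fields, rather than a single "automation level", reflects the point that each axis can be toggled without affecting the others.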

Maximum automation is not always best

autopilot + ralph + ultrawork is powerful but consumes resources aggressively and may miss subtle issues that benefit from human review. Use it for well-understood tasks (re-running baselines, standard training) and keep human gates for novel work.

Stage Summary

#	Stage	Primary Agents	Key Output	Typical Gate
1	Ideation	Scout, Judge	Selected idea	`human`
2	Design	Planner	Experiment plan	`human`
3	Implementation	Coder, Judge	Working code	`auto-judge`
4	Training	Coder, Monitoring	Trained model	`auto`
5	Analysis	Orchestrator, Judge	Result interpretation	`auto-judge`
6	Writing	Writer, Scout	Paper draft	`human`
7	Review	Judge (3-model)	Review + revision	`human`

Explore each stage in detail:
