Pipeline Overview
The AutoResearch pipeline is a seven-stage state machine that drives a research project from initial idea to finished paper. Each stage has defined inputs, outputs, active agents, and a gate that controls how the transition to the next stage is handled.
Stage State Machine
stateDiagram-v2
[*] --> Ideation
Ideation --> Design: Gate passes
Design --> Implementation: Gate passes
Implementation --> Training: Gate passes
Training --> Analysis: Training complete
Analysis --> Writing: Gate passes
Writing --> Review: Gate passes
Review --> Writing: Revisions needed
Review --> [*]: Accepted
note right of Ideation: Scout + Judge
note right of Design: Planner
note right of Implementation: Coder + Judge
note right of Training: Coder + Monitoring
note right of Analysis: Orchestrator + Judge
note right of Writing: Writer + Scout
note right of Review: Judge (3-model panel)
Stages are sequential with one exception
The pipeline flows forward through seven stages. The only backward transition is Review → Writing, taken when revisions are needed. The pipeline never loops back to earlier stages automatically; if fundamental issues are found, the Orchestrator (or a human) decides how to handle them.
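The transitions in the diagram can be written as a small lookup table. The following is a minimal Python sketch, not the project's implementation: the names `STAGES`, `TRANSITIONS`, and `next_stage`, and the outcome strings `"pass"`, `"revise"`, and `"accepted"`, are assumptions chosen for illustration.

```python
from typing import Optional

# Forward order of the seven stages, taken from the diagram above.
STAGES = ["ideation", "design", "implementation", "training",
          "analysis", "writing", "review"]

# (stage, outcome) -> next stage; None marks the end of the pipeline.
# The outcome labels are hypothetical stand-ins for the diagram's edge
# labels ("Gate passes", "Training complete", "Revisions needed", "Accepted").
TRANSITIONS: dict[tuple[str, str], Optional[str]] = {
    (stage, "pass"): following
    for stage, following in zip(STAGES[:-1], STAGES[1:])
}
TRANSITIONS[("review", "revise")] = "writing"   # the only backward edge
TRANSITIONS[("review", "accepted")] = None      # terminal state


def next_stage(current: str, outcome: str) -> Optional[str]:
    """Return the stage that follows `current`, or None when the paper is accepted."""
    try:
        return TRANSITIONS[(current, outcome)]
    except KeyError:
        raise ValueError(f"no transition from {current!r} on {outcome!r}") from None
```

Because every legal edge is listed explicitly, a request such as `next_stage("design", "revise")` raises an error instead of silently looping back, which matches the rule that only Review may return to Writing.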
pipeline.yaml
The pipeline state is tracked in a single YAML file:
# .omc/research/pipeline.yaml
project: "flash-recurrent-attention"
current_stage: "training"
started_at: "2025-06-01T09:00:00Z"
stages:
  ideation:
    status: complete
    gate: human
    started: "2025-06-01T09:00:00Z"
    completed: "2025-06-01T14:30:00Z"
  design:
    status: complete
    gate: human
    started: "2025-06-01T14:30:00Z"
    completed: "2025-06-02T11:00:00Z"
  implementation:
    status: complete
    gate: auto-judge
    started: "2025-06-02T11:00:00Z"
    completed: "2025-06-03T16:00:00Z"
  training:
    status: active
    gate: auto
    started: "2025-06-03T16:00:00Z"
    completed: null
  analysis:
    status: pending
    gate: auto-judge
  writing:
    status: pending
    gate: human
  review:
    status: pending
    gate: human
history:
  - event: "stage_advance"
    from: "implementation"
    to: "training"
    timestamp: "2025-06-03T16:00:00Z"
    trigger: "auto-judge:pass"
    details: "Judge verdict: PASS. Code quality 8/10, tests all pass."
Three Gate Types
Gates control the transition between stages. Each stage has its own gate type.
| Gate Type | Symbol | Behavior | Context Cost |
|---|---|---|---|
| human | 🛑 | Pauses pipeline, presents summary, waits for human approval | Zero (waits) |
| auto-judge | 🤖 | Dispatches Judge to evaluate; proceeds on PASS, retries the stage on REVISE, escalates to a human on FAIL | Medium (Judge invocation) |
| auto | 🚀 | Proceeds immediately without review | Zero |
graph LR
S1[Stage Complete] --> G{Gate Type?}
G -->|human| H[Wait for Human]
G -->|auto-judge| J[Judge Evaluates]
G -->|auto| A[Auto-Advance]
H -->|approved| N[Next Stage]
J -->|PASS| N
J -->|REVISE| R[Retry Stage]
J -->|FAIL| E[Escalate to Human]
A --> N
style H fill:#fee2e2,stroke:#dc2626
style J fill:#fef3c7,stroke:#d97706
style A fill:#dcfce7,stroke:#16a34a
Start restrictive, open gradually
Begin with all gates on human. As you build confidence in the system's judgment, move lower-risk stages to auto-judge, then auto. Ideation should almost always stay on human — your research direction is too important to automate.
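The gate behavior in the table and flowchart above can be sketched as a single decision function. This is an illustrative Python sketch under stated assumptions: the `Gate` and `Action` enums, the `resolve_gate` function, and its `verdict` and `approved` parameters are hypothetical names, while the gate types and the PASS / REVISE / FAIL verdicts come from the flowchart.

```python
from enum import Enum
from typing import Optional


class Gate(Enum):
    HUMAN = "human"
    AUTO_JUDGE = "auto-judge"
    AUTO = "auto"


class Action(Enum):
    ADVANCE = "advance"          # move to the next stage
    WAIT_FOR_HUMAN = "wait"      # pause until a human approves
    RETRY_STAGE = "retry"        # Judge returned REVISE
    ESCALATE = "escalate"        # Judge returned FAIL


def resolve_gate(gate: Gate, verdict: Optional[str] = None,
                 approved: bool = False) -> Action:
    """Map a finished stage's gate (plus Judge verdict or human approval) to an action."""
    if gate is Gate.AUTO:
        return Action.ADVANCE
    if gate is Gate.HUMAN:
        return Action.ADVANCE if approved else Action.WAIT_FOR_HUMAN
    # Remaining case: auto-judge, which needs a Judge verdict.
    if verdict == "PASS":
        return Action.ADVANCE
    if verdict == "REVISE":
        return Action.RETRY_STAGE
    if verdict == "FAIL":
        return Action.ESCALATE
    raise ValueError(f"auto-judge gate needs a verdict, got {verdict!r}")
```

For example, `resolve_gate(Gate("auto-judge"), verdict="REVISE")` yields `Action.RETRY_STAGE`, and a `human` gate keeps returning `Action.WAIT_FOR_HUMAN` until `approved=True` is passed.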
Three Orthogonal Modes
Modes control execution behavior. They are independent axes that can be combined.
| Mode | Axis | What It Controls |
|---|---|---|
| autopilot | Between stages | Automatically advances through stages (respecting gates) |
| ralph | Within a stage | Tight error-fix-retry loops without human intervention |
| ultrawork | Across agents | Dispatches multiple independent tasks in parallel |
Combinations
| autopilot | ralph | ultrawork | Behavior |
|---|---|---|---|
| off | off | off | Fully manual — Orchestrator asks before every action |
| on | off | off | Auto-advances stages, stops on errors |
| off | on | off | Manual stage control, auto-fixes errors |
| off | off | on | Manual control, parallel execution |
| on | on | off | Auto-advances + auto-fixes |
| on | off | on | Auto-advances + parallel execution |
| off | on | on | Manual control + auto-fixes + parallel |
| on | on | on | Maximum automation |
Maximum automation is not always best
autopilot + ralph + ultrawork is powerful but consumes resources aggressively and may miss subtle issues that benefit from human review. Use it for well-understood tasks (re-running baselines, standard training) and keep human gates for novel work.
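Since the three modes are independent on/off switches, the eight rows of the combinations table fall out of three booleans. The sketch below is illustrative only: the `Modes` dataclass, its `describe` method, and the exact wording of the returned strings are assumptions, not part of the tool.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Modes:
    autopilot: bool = False   # between stages: auto-advance, respecting gates
    ralph: bool = False       # within a stage: error-fix-retry loops
    ultrawork: bool = False   # across agents: parallel dispatch

    def describe(self) -> str:
        """Summarize the combined behavior, mirroring the table above."""
        if not (self.autopilot or self.ralph or self.ultrawork):
            return "fully manual"
        parts = ["auto-advance" if self.autopilot else "manual stage control"]
        if self.ralph:
            parts.append("auto-fix")
        if self.ultrawork:
            parts.append("parallel execution")
        return " + ".join(parts)


# Every on/off combination of the three axes: 2 * 2 * 2 = 8 rows.
ALL_COMBINATIONS = [Modes(*flags) for flags in product((False, True), repeat=3)]
```

Each flag contributes its own phrase without consulting the others, which is what "orthogonal" means here: enabling `ralph` changes behavior within a stage regardless of how stages are advanced or how tasks are dispatched.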
Stage Summary
| # | Stage | Primary Agents | Key Output | Typical Gate |
|---|---|---|---|---|
| 1 | Ideation | Scout, Judge | Selected idea | human |
| 2 | Design | Planner | Experiment plan | human |
| 3 | Implementation | Coder, Judge | Working code | auto-judge |
| 4 | Training | Coder, Monitoring | Trained model | auto |
| 5 | Analysis | Orchestrator, Judge | Result interpretation | auto-judge |
| 6 | Writing | Writer, Scout | Paper draft | human |
| 7 | Review | Judge (3-model) | Review + revision | human |
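The Typical Gate column can serve as defaults when a new pipeline state is created. The following Python sketch builds an in-memory dictionary shaped like the pipeline.yaml example and records a stage advance in `history`. The function names `new_pipeline` and `advance`, and the choice to mark the first stage `active` on creation, are assumptions; writing the dictionary to disk is left out because it would need a YAML library.

```python
from datetime import datetime, timezone

# Typical gate per stage, copied from the Stage Summary table.
TYPICAL_GATES = {
    "ideation": "human",
    "design": "human",
    "implementation": "auto-judge",
    "training": "auto",
    "analysis": "auto-judge",
    "writing": "human",
    "review": "human",
}


def _stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime like the timestamps in pipeline.yaml."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def new_pipeline(project: str, now: datetime) -> dict:
    """Create a fresh state with every stage pending except the first (assumed active)."""
    stages = {name: {"status": "pending", "gate": gate}
              for name, gate in TYPICAL_GATES.items()}
    first = next(iter(stages))
    stages[first].update(status="active", started=_stamp(now), completed=None)
    return {"project": project, "current_stage": first,
            "started_at": _stamp(now), "stages": stages, "history": []}


def advance(state: dict, to: str, trigger: str, details: str, now: datetime) -> None:
    """Close the current stage, open `to`, and append a stage_advance event."""
    src = state["current_stage"]
    state["stages"][src].update(status="complete", completed=_stamp(now))
    state["stages"][to].update(status="active", started=_stamp(now), completed=None)
    state["current_stage"] = to
    state["history"].append({
        "event": "stage_advance", "from": src, "to": to,
        "timestamp": _stamp(now), "trigger": trigger, "details": details,
    })
```

A caller would pass timezone-aware datetimes, for example `new_pipeline("flash-recurrent-attention", datetime.now(timezone.utc))`, so that the stored timestamps are unambiguous UTC values.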
Next
Explore each stage in detail: