Commands & Interaction

AutoResearch is designed around natural conversation. You can use slash commands for precise control, magic keywords to activate execution modes, or just talk in plain language. The Orchestrator understands all three and dispatches the right agents.

Slash Commands

Slash commands give you explicit, predictable control over the system.

Command                        Action
/research init <name> <desc>   Create a new research project with the given name and description
/status                        Show current project stage, gate configuration, and recent tasks
/gate                          View current gate configuration for all stages
/gate auto                     Set all stages to fully automatic
/gate human                    Set all stages to require human approval
/gate default                  Restore the default gate configuration
/gate <stage> <type>           Change a specific stage's gate (e.g. /gate training auto)

Commands run in the Orchestrator session

All commands are issued to the Orchestrator (the main Claude Code session). The Orchestrator parses them and dispatches work to the appropriate agents. You never need to talk to agents directly.
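
As a rough illustration of what "parsing" a slash command could involve, here is a minimal sketch in Python. It is not the Orchestrator's actual implementation; the `SlashCommand` type and `parse_slash_command` function are hypothetical names introduced only for this example.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SlashCommand:
    name: str        # "research init", "status", or "gate"
    args: List[str]  # remaining tokens, with shell-style quotes removed


def parse_slash_command(line: str) -> Optional[SlashCommand]:
    """Return the parsed command, or None if the line is not a slash command."""
    line = line.strip()
    if not line.startswith("/"):
        return None  # plain language or a magic keyword, handled elsewhere
    tokens = shlex.split(line[1:])
    if not tokens:
        return None
    # "/research init" is the only two-word command in the table above.
    if tokens[:2] == ["research", "init"]:
        return SlashCommand("research init", tokens[2:])
    return SlashCommand(tokens[0], tokens[1:])


cmd = parse_slash_command('/research init sparse-moe "Sparse MoE training efficiency"')
print(cmd.name, cmd.args)
```

Using `shlex.split` keeps a quoted description together as a single argument, which matches how `<desc>` is written in the examples.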

/research init

Creates the project directory structure under ~/Claude/Harness/, initializes pipeline.yaml, and enters the idle state. The Orchestrator is ready to receive your first research direction.

/research init sparse-moe "Sparse MoE training efficiency"

/status

Prints a snapshot of the current project. Useful any time you want to know where things stand.

> /status
Project: sparse-moe
Stage: training
Gate config: ideation=human, design=human, training=auto, ...
Active: exp-003 training on ic2 (step 45000/100000, loss 2.31)
Last event: CronCreate patrol — all healthy (12 min ago)

/gate

Gates control how much human involvement each stage requires. You can change them at any time, even mid-pipeline.

/gate                        # Show all gates
/gate auto                   # Everything automatic
/gate human                  # Everything needs approval
/gate default                # Restore defaults
/gate implementation auto-judge   # Just this stage
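
The effect of each /gate form can be modeled as a pure function over a stage-to-gate mapping. The sketch below is an assumption about the semantics, not the real implementation: `apply_gate_command` is a hypothetical helper, and the defaults are copied from the Default Gate Configuration section of this page.

```python
DEFAULT_GATES = {
    "ideation": "human",
    "baseline-digestion": "auto",
    "design": "human",
    "implementation": "auto-judge",
    "training": "auto-judge",
    "analysis": "human",
    "writing": "auto-judge",
    "review": "human",
}
GATE_TYPES = {"human", "auto-judge", "auto"}


def apply_gate_command(gates: dict, args: list) -> dict:
    """Return a new gate mapping after applying the arguments of a /gate command."""
    if not args:  # "/gate" only shows the configuration
        return dict(gates)
    if args == ["default"]:
        return dict(DEFAULT_GATES)
    if len(args) == 1 and args[0] in ("auto", "human"):
        return {stage: args[0] for stage in gates}  # every stage gets the same gate
    if len(args) == 2 and args[0] in gates and args[1] in GATE_TYPES:
        return {**gates, args[0]: args[1]}  # change one stage only
    raise ValueError(f"unrecognized /gate arguments: {args}")


gates = apply_gate_command(DEFAULT_GATES, ["implementation", "auto"])
print(gates["implementation"], gates["ideation"])
```

Returning a new mapping instead of mutating the old one makes it easy to compare the configuration before and after a change.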

Magic Keywords

Type these keywords directly in the conversation to activate or stop execution modes. The three modes (autopilot, ralph, ultrawork) are orthogonal, so you can combine them freely; cancelomc stops whatever is running.

Keyword     Effect
autopilot   Auto-advance through pipeline stages, pausing only at gates
ralph       Loop within the current stage until the task is done
ultrawork   Execute multiple independent tasks in parallel
cancelomc   Stop whatever active mode is currently running

Adding a description after the keyword

You can append a description to help the system understand what "done" means. This is especially useful for ralph and autopilot.

autopilot: go from ideation through a first paper draft
ralph: train ResNet-50 until top-1 exceeds 76%
ultrawork: run ablations A/B/C in parallel
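
The "keyword, optional colon, optional description" shape can be split with a few lines of Python. This is only a sketch under the assumption that the keyword must start the message; how the Orchestrator actually detects keywords is not specified here, and `parse_keyword` is a hypothetical helper.

```python
KEYWORDS = ("autopilot", "ralph", "ultrawork", "cancelomc")


def parse_keyword(message: str):
    """Return (keyword, description) for a keyword message, else None."""
    head, _, tail = message.strip().partition(":")
    keyword = head.strip().lower()
    if keyword not in KEYWORDS:
        return None  # not a magic keyword; treat as natural language
    return keyword, (tail.strip() or None)  # description is None when omitted


print(parse_keyword("ralph: train ResNet-50 until top-1 exceeds 76%"))
print(parse_keyword("autopilot"))
```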

autopilot

Drives the pipeline forward across stages. At each stage boundary, the Orchestrator checks the gate:

  • human gate → pauses and waits for you
  • auto-judge gate → Judge evaluates and decides
  • auto gate → proceeds immediately

Autopilot does not skip gates. It simply removes the need for you to type "next" after every stage.
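
The gate check at each stage boundary can be sketched as follows. This is an illustrative model, not the system's code: `run_autopilot`, `gate_allows_advance`, and the three callbacks (`run_stage`, `ask_human`, `ask_judge`) are hypothetical names standing in for the real agents and the human prompt.

```python
from typing import Callable, Dict, List


def gate_allows_advance(gate: str, ask_human: Callable[[], bool],
                        ask_judge: Callable[[], bool]) -> bool:
    """Decide whether the pipeline may move past a stage boundary."""
    if gate == "auto":
        return True            # proceed immediately
    if gate == "auto-judge":
        return ask_judge()     # Judge evaluates and decides
    if gate == "human":
        return ask_human()     # blocks until the human answers
    raise ValueError(f"unknown gate type: {gate}")


def run_autopilot(stages: List[str], gates: Dict[str, str],
                  run_stage: Callable[[str], None],
                  ask_human: Callable[[], bool],
                  ask_judge: Callable[[], bool]) -> List[str]:
    """Run stages in order; stop at the first gate that does not approve."""
    completed = []
    for stage in stages:
        run_stage(stage)
        completed.append(stage)
        if not gate_allows_advance(gates[stage], ask_human, ask_judge):
            break  # every gate is still consulted; none is skipped
    return completed
```

Note that the loop consults the gate after every stage, including `auto` ones. Autopilot changes who triggers the next stage, not whether the gate is checked.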

ralph

Named after the "do it again" philosophy. Ralph enters a tight loop within the current stage:

  1. Execute the task
  2. Evaluate the result (via Judge or self-check)
  3. If not done, fix issues and repeat from step 1
  4. If done, exit the loop

Ralph is perfect for iterative tasks: training until a metric is hit, revising a paper until reviewers pass, fixing code until tests are green.
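
The four steps above amount to an execute, evaluate, fix loop. A minimal sketch, assuming hypothetical callbacks `execute`, `is_done`, and `fix`; the `max_iterations` cap is also an assumption added here so the example cannot loop forever.

```python
def ralph_loop(execute, is_done, fix, max_iterations=10):
    """Repeat execute -> evaluate -> fix until is_done(result) is true.

    Returns (result, attempts). max_iterations is an illustrative safety cap.
    """
    for attempt in range(1, max_iterations + 1):
        result = execute()          # step 1: execute the task
        if is_done(result):         # step 2: evaluate the result
            return result, attempt  # step 4: done, exit the loop
        fix(result)                 # step 3: fix issues, then repeat
    raise RuntimeError(f"not done after {max_iterations} iterations")


# Toy usage: a made-up "accuracy" improves by 3 points after each fix.
state = {"top1": 70}
result, attempts = ralph_loop(
    execute=lambda: state["top1"],
    is_done=lambda acc: acc >= 76,
    fix=lambda acc: state.update(top1=acc + 3),
)
print(result, attempts)
```

The "done" condition is passed in as a function, which mirrors how the description after the keyword defines when the loop should stop.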

ultrawork

Spawns multiple parallel agents for independent tasks. The Orchestrator splits the work, assigns each piece to a separate agent, and collects results when all finish.

Ideal for ablation studies, parallel literature searches, or running multiple experiments simultaneously.
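
The split, run, and collect pattern can be sketched with Python's standard `concurrent.futures`. This is an analogy for the behavior described above, not how agents are actually spawned; `run_parallel` and `fake_ablation` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(tasks: dict, worker) -> dict:
    """Run worker(spec) for every named task concurrently and collect all results."""
    with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
        futures = {name: pool.submit(worker, spec) for name, spec in tasks.items()}
        # .result() blocks, so this returns only after every task has finished.
        return {name: future.result() for name, future in futures.items()}


def fake_ablation(removed_component: str) -> str:
    # Stand-in for a real experiment run.
    return f"ran without {removed_component}"


results = run_parallel(
    {"A": "attention", "B": "routing", "C": "loss term"},
    fake_ablation,
)
print(results)
```

This only works cleanly when the tasks are independent, which is why ultrawork is described as suited to ablations and parallel searches.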

cancelomc

Emergency brake. Stops whichever mode is currently active (autopilot, ralph, or ultrawork) and returns you to manual control. In-flight agent tasks are allowed to finish their current step gracefully.
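
"Finish the current step gracefully" is the usual cooperative-cancellation pattern: the stop request is only checked between steps. A minimal sketch using `threading.Event`, with `run_steps` as a hypothetical helper; the real cancellation mechanism is not documented here.

```python
import threading


def run_steps(steps, stop_requested: threading.Event) -> list:
    """Run steps in order, checking for a stop request only between steps."""
    completed = []
    for step in steps:
        if stop_requested.is_set():
            break                   # stop before starting a new step
        completed.append(step())    # a step that has started always finishes
    return completed


stop = threading.Event()


def second_step():
    stop.set()  # simulates cancelomc arriving while this step is running
    return "step 2"


print(run_steps([lambda: "step 1", second_step, lambda: "step 3"], stop))
```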

Natural Language

You don't need to memorize commands. Just describe what you want in plain language — the Orchestrator maps your intent to the right actions.

What you say                                                What happens
"Create a new research project on sparse MoE"               Creates project, enters IDEATION stage
"Search for the latest papers on MoE training efficiency"   Dispatches Scout (Gemini) for literature search
"Pick the third idea"                                       Selects idea, advances stage
"Start training"                                            Dispatches Coder to launch training
"Write the paper"                                           Dispatches Writer with clean context
"Hold on, the learning rate is too high"                    Interrupts current task, you take over

Mixed languages are fine

The Orchestrator handles Chinese, English, and mixed input. Use whatever is natural for you.

Usage Scenarios

Scenario 1: Morning hands-on ideation

You have a vague direction and want to explore it interactively. Keep gates on human so you stay in the loop.

> cd ~/Claude/Harness && mkdir sparse-moe && cd sparse-moe
> /research init sparse-moe "Sparse MoE training efficiency"
  → Project initialized. Stage: idle

> Search for the latest work on MoE training efficiency
  → scout (Gemini) searching...
  → 5 candidate ideas proposed
  → judge (Codex) evaluating in parallel...
  → Results ready. Here are the 5 ideas with scores...

> The third one looks good, go with that
  → idea-003 selected, scout finding baseline...
  → Baseline found: "EfficientMoE" (NeurIPS 2025, code: github.com/xxx)

> /gate default
  → Gates set to default configuration

> Confirm the baseline and start designing the experiment
  → scout digesting baseline paper...
  → planner designing experiment...
  → Experiment plan ready. Please review...

Morning sessions are collaborative

During ideation, your domain expertise combined with the system's breadth produces better ideas than either alone. Stay engaged.

Scenario 2: Evening autopilot overnight

Your experiment design is approved. Set gates to auto and let the system work while you sleep.

> /gate auto
  → All gates set to auto

> autopilot: implement the code, run training, analyze the results
  → Entering autopilot mode...
  → [coder implementing...]
  → [coder launching training...]
  → [Phase 1 active watch... stable after 15 min]
  → [Phase 2 CronCreate patrol...]
  → Telegram: "Training complete. Results: 78.3% top-1 (baseline 76.0%)"
  → Telegram: "Judge verdict: PASS. Claims supported."

Check in the morning

Autopilot pauses at human gates and when errors exceed retry limits. Run /status when you wake up to see where things stand.

Scenario 3: Intervene anytime

You can always step in, even during autopilot. Human input takes priority over all automated decisions.

> Hold on, change the learning rate to 1e-4 and rerun
  → Interrupting... coder adjusting params...

> /gate human
  → All gates set to human. Full manual control.

> /status
  → Project: sparse-moe
  → Stage: training
  → Gate config: all human
  → Active: exp-003 training on ic2

You are always the highest authority

The system pauses, incorporates your input, and continues. No work is lost. You can switch between manual and automatic control at any granularity, at any time.

Scenario 4: Ralph for paper revision

Use ralph to loop on paper revisions until the three-model review panel is satisfied.

> ralph: revise the paper until all three model reviewers pass
  → [writer revising based on review comments...]
  → [three-model re-review...]
  → [2 issues remain, writer fixing...]
  → [three-model re-review...]
  → [All three reviewers: PASS]
  → Ralph complete.

Ralph automatically decides when the loop is done based on the description you provided. In this case, "done" means all three model reviewers return PASS.

Scenario 5: Ultrawork for parallel ablations

Use ultrawork to run independent experiments simultaneously.

> ultrawork: run three ablations in parallel, A (remove attention), B (remove routing), C (remove the loss term)
  → Spawning 3 parallel coder agents...
  → [coder-A running ablation A...]
  → [coder-B running ablation B...]
  → [coder-C running ablation C...]
  → All 3 ablations complete. Results collected.

Each ablation runs in its own isolated workspace. Results are aggregated when all agents finish.

Default Gate Configuration

The default configuration balances safety with automation. Critical decision points require human input; routine execution is automated.

ideation: human          # Must confirm idea selection
baseline-digestion: auto # Scout digests, no approval needed
design: human            # Must review experiment design
implementation: auto-judge
training: auto-judge
analysis: human          # Must confirm claims
writing: auto-judge
review: human            # Must approve submission

Recommended progression

As you gain confidence in the system, gradually open gates:

  1. First project: all human. Learn how the system works.
  2. Second project: auto-judge for implementation and analysis.
  3. Established workflow: auto for training, auto-judge for implementation/analysis.
  4. Never set ideation to auto: your research taste is the most valuable input.

Combining Modes

The three execution modes are orthogonal. Combine them for different levels of automation:

Combination             Behavior
autopilot alone         Advances stages, single agent per task
autopilot + ralph       Advances stages, auto-fixes errors within stages
autopilot + ultrawork   Advances stages, parallel agents where possible
All three               Maximum automation: parallel execution, auto-fix, auto-advance
None                    Fully manual: you direct every step
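
Because the modes are independent switches, any combination can be represented as a set of flags. The sketch below uses Python's `enum.Flag` to reproduce the rows of the table; the `Mode` enum and `describe` function are hypothetical and only illustrate the idea of orthogonal modes.

```python
from enum import Flag, auto


class Mode(Flag):
    NONE = 0
    AUTOPILOT = auto()
    RALPH = auto()
    ULTRAWORK = auto()


def describe(mode: Mode) -> list:
    """List the behaviors enabled by a combination of modes."""
    behaviors = []
    if Mode.AUTOPILOT in mode:
        behaviors.append("advances stages")
    if Mode.RALPH in mode:
        behaviors.append("auto-fixes errors within stages")
    if Mode.ULTRAWORK in mode:
        behaviors.append("parallel agents where possible")
    return behaviors or ["fully manual"]


print(describe(Mode.AUTOPILOT | Mode.RALPH))
print(describe(Mode.NONE))
```

Each flag contributes its own behavior regardless of the others, which is what makes the combinations in the table predictable.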

Start conservative

Begin with ralph only (loop within a stage, but you manually advance). Once comfortable, add autopilot. Add ultrawork last, when you trust the system to manage parallel workloads.

Quick Reference Card

┌─────────────────────────────────────────────────────┐
│  SLASH COMMANDS                                     │
│  /research init <name> <desc>   Create project      │
│  /status                        Current state       │
│  /gate [auto|human|default]     Gate control        │
│  /gate <stage> <type>           Per-stage gate      │
│                                                     │
│  MAGIC KEYWORDS                                     │
│  autopilot [: desc]    Auto-advance stages          │
│  ralph [: desc]        Loop until done              │
│  ultrawork [: desc]    Parallel execution           │
│  cancelomc             Stop active mode             │
│                                                     │
│  NATURAL LANGUAGE                                   │
│  Just describe what you want. The system figures    │
│  out the rest. Works in any language.               │
└─────────────────────────────────────────────────────┘

AutoResearch — Multi-agent Deep Learning Research System