
Stage 1: Ideation

The ideation stage transforms a broad research topic into a concrete, evaluated research idea ready for experiment design.

Entering This Stage

What you have:

  • A research topic or area of interest (from the user)
  • Optionally: a specific angle, constraint, or inspiration

What you don't have yet:

  • Specific method design
  • Baselines or comparisons
  • Experiment plan

Steps

```mermaid
graph TD
    A[1. Topic Scoping] --> B[2. Literature Survey]
    B --> C[3. Idea Generation]
    C --> D[4. Idea Refinement]
    D --> E[5. Idea Evaluation]
    E --> F{Judge Verdict}
    F -->|PASS| G[6. Idea Selection]
    F -->|REVISE| D
    F -->|FAIL| C
    G --> H[Gate: Advance to Design]

    style A fill:#dbeafe,stroke:#2563eb
    style E fill:#fef3c7,stroke:#d97706
    style G fill:#dcfce7,stroke:#16a34a
```
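
The verdict branch in the diagram can be read as a small lookup table. The sketch below is illustrative only: the `Step` enum and the function name are hypothetical and not part of the system. Only the three verdict labels and their targets come from the diagram.

```python
from enum import Enum


class Step(Enum):
    """Ideation steps, numbered as in the diagram (names are illustrative)."""
    TOPIC_SCOPING = 1
    LITERATURE_SURVEY = 2
    IDEA_GENERATION = 3
    IDEA_REFINEMENT = 4
    IDEA_EVALUATION = 5
    IDEA_SELECTION = 6


# Verdict labels and targets as drawn in the diagram.
VERDICT_ROUTES = {
    "PASS": Step.IDEA_SELECTION,     # continue to step 6
    "REVISE": Step.IDEA_REFINEMENT,  # loop back to step 4
    "FAIL": Step.IDEA_GENERATION,    # loop back to step 3
}


def next_step_after_evaluation(verdict: str) -> Step:
    """Return the step that follows Idea Evaluation for a given Judge verdict."""
    try:
        return VERDICT_ROUTES[verdict]
    except KeyError:
        raise ValueError(f"unknown Judge verdict: {verdict!r}") from None
```

Keeping the routes in one table makes the two feedback loops (REVISE and FAIL) easy to audit against the diagram.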

1. Topic Scoping

Agent: Orchestrator (with human)

The Orchestrator discusses the research topic with the user to narrow the scope:

  • What area? (e.g., "efficient attention mechanisms")
  • What constraints? (e.g., "must work for language modeling, 4x A100")
  • What venue? (e.g., "ICML 2025")
  • Any specific angles? (e.g., "interested in combining recurrence with hardware-aware methods")

2. Literature Survey

Agent: Scout (Gemini)

The Scout searches for recent work in the scoped area:

  • Top-venue papers from the last 2 years
  • Key baselines and their results
  • Open problems and research gaps
  • Concurrent work that might overlap

Output: papers/related_work/summaries.yaml (initial survey)
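
As a rough illustration of the "last 2 years" criterion, the sketch below filters survey entries by publication year. The `PaperSummary` fields and the function name are assumptions made for this example; the actual schema of summaries.yaml is not specified here. The window is interpreted as inclusive of the current year.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaperSummary:
    """Hypothetical survey entry; the real summaries.yaml schema may differ."""
    title: str
    venue: str
    year: int


def within_survey_window(papers, current_year, window_years=2):
    """Keep papers published in [current_year - window_years, current_year]."""
    earliest = current_year - window_years
    return [p for p in papers if earliest <= p.year <= current_year]


# Placeholder entries, not real papers.
papers = [
    PaperSummary("Paper A", "ICML", 2025),
    PaperSummary("Paper B", "NeurIPS", 2022),
    PaperSummary("Paper C", "ICLR", 2023),
]
recent = within_survey_window(papers, current_year=2025)
```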

3. Idea Generation

Agent: Scout (Gemini)

Based on the literature survey, the Scout generates 3-5 candidate ideas:

  • Each idea has a novelty angle and expected benefit
  • Ideas are creative and expansive — filtering happens later
  • No code feasibility requirement at this stage

Output: ideas/brainstorm.md

4. Idea Refinement

Agent: Orchestrator (with human)

The Orchestrator and user discuss the candidate ideas:

  • Which ideas are most promising?
  • Can ideas be combined?
  • Does the user have domain insights to add?
  • What are the obvious risks?

This is an interactive step. The Orchestrator synthesizes user input with the Scout's suggestions.

Output: ideas/candidates.yaml (ranked, refined ideas)

This is where human expertise matters most

The ideation stage is where your domain knowledge has the highest leverage. A good idea with a clear novelty angle saves weeks of work. Take time here.

5. Idea Evaluation

Agent: Judge (Codex, codex exec)

The Judge independently evaluates the top 1-2 candidate ideas across five dimensions:

| Dimension | What's Assessed |
|---|---|
| Novelty | Has this been done? How different is it from prior work? |
| Feasibility | Can it be implemented within constraints (time, compute)? |
| Verifiability | Can the claims be empirically validated? |
| Attack Surface | What will reviewers criticize? |
| Impact | If it works, how significant is the contribution? |

Output: reviews/idea_review.yaml

The Judge is independent

The Judge doesn't know which idea the user prefers or what the Orchestrator recommended. It evaluates purely on the idea description and constraints.
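
One way to enforce this independence is to build the Judge's input from an allowlist of fields, so preference signals cannot leak in by accident. The sketch below assumes candidates are plain dictionaries. The field names (`user_preference`, `orchestrator_recommendation`, and the visible fields) are hypothetical, since the candidate schema is not defined in this section.

```python
# Fields the Judge may see. Hypothetical names; adjust to the real schema.
JUDGE_VISIBLE_FIELDS = ("title", "description", "novelty")


def build_judge_input(candidate: dict, constraints: dict) -> dict:
    """Return only the idea description and constraints for the Judge.

    Anything not on the allowlist (user preference, Orchestrator
    recommendation, ranking) is dropped.
    """
    idea = {k: candidate[k] for k in JUDGE_VISIBLE_FIELDS if k in candidate}
    return {"idea": idea, "constraints": dict(constraints)}


candidate = {
    "title": "Flash-Recurrent Attention",
    "description": "Combine IO-aware tiling with a recurrent formulation.",
    "user_preference": "favorite",          # must not reach the Judge
    "orchestrator_recommendation": "rank 1",  # must not reach the Judge
}
judge_input = build_judge_input(candidate, {"compute": "4x A100"})
```

An allowlist is used rather than a blocklist so that fields added later are hidden from the Judge by default.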

6. Idea Selection

Agent: Orchestrator (with the human, at the human gate)

The Orchestrator presents:

  • The refined ideas
  • The Judge's evaluation for each
  • A recommendation

The user makes the final selection.

Output: ideas/selected.yaml

```yaml
# ideas/selected.yaml
selected_idea:
  title: "Flash-Recurrent Attention"
  description: |
    Combine flash attention's IO-aware tiling strategy with RetNet's
    recurrent formulation to achieve O(n) memory with hardware-efficient
    compute patterns.
  novelty: "Integration of flash attention tiling with recurrent attention is unexplored"
  judge_verdict: PASS
  judge_scores:
    novelty: 7
    feasibility: 8
    verifiability: 9
    impact: 6
  selection_rationale: |
    Best balance of novelty and feasibility. Judge's attack surface
    analysis identified manageable risks. User has relevant domain expertise.
```
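
Before handing this file to the next stage, a light structural check can catch missing fields. The sketch below works on the already-parsed `selected_idea` mapping (parsing the YAML is left out to avoid assuming a particular library). The key lists mirror the example above, and the function name is hypothetical.

```python
# Keys and score dimensions taken from the example file above.
REQUIRED_KEYS = (
    "title", "description", "novelty",
    "judge_verdict", "judge_scores", "selection_rationale",
)
SCORED_DIMENSIONS = ("novelty", "feasibility", "verifiability", "impact")


def check_selected_idea(selected: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in selected]
    if selected.get("judge_verdict") != "PASS":
        problems.append("judge_verdict is not PASS")
    scores = selected.get("judge_scores") or {}
    for dim in SCORED_DIMENSIONS:
        if not isinstance(scores.get(dim), (int, float)):
            problems.append(f"missing or non-numeric score: {dim}")
    return problems


selected = {
    "title": "Flash-Recurrent Attention",
    "description": "Combine IO-aware tiling with a recurrent formulation.",
    "novelty": "Integration is unexplored",
    "judge_verdict": "PASS",
    "judge_scores": {"novelty": 7, "feasibility": 8, "verifiability": 9, "impact": 6},
    "selection_rationale": "Best balance of novelty and feasibility.",
}
problems = check_selected_idea(selected)
```

The check does not validate score ranges, because the scoring scale is not stated in this section.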

Gate

| Gate Type | Recommended | Behavior |
|---|---|---|
| human | Yes | User reviews selected idea before advancing |
| auto-judge | Possible | Judge verdict determines advancement |
| auto | Not recommended | Research direction should not be automated |
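
The three gate types can be sketched as a single decision function. This is a minimal illustration, not the system's implementation: the function name and the `human_approved` flag are hypothetical, and treating `auto` as "advance unconditionally" is an assumption, since the table only says it is not recommended.

```python
def gate_allows_advance(gate_type: str, judge_verdict: str,
                        human_approved: bool = False) -> bool:
    """Decide whether the pipeline may advance from Ideation to Design."""
    if gate_type == "human":
        # The user reviews the selected idea and must approve explicitly.
        return human_approved
    if gate_type == "auto-judge":
        # The Judge verdict alone determines advancement.
        return judge_verdict == "PASS"
    if gate_type == "auto":
        # Assumption: no check at all. Not recommended for ideation.
        return True
    raise ValueError(f"unknown gate type: {gate_type!r}")
```

With the `human` gate, a PASS verdict is not sufficient on its own; the pipeline waits for the user's approval.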

Always use the human gate for ideation

Your research direction is the most important decision in the entire project. Automating this saves minutes but risks weeks of wasted work on a bad idea.

Error Handling

| Error | Recovery |
|---|---|
| Scout returns no relevant papers | Broaden search terms, try adjacent topics |
| All ideas score low on novelty | Scout searches for more recent work, try different angles |
| Judge gives FAIL verdict | Return to step 3 with Judge's feedback as additional context |
| User rejects all ideas | Return to step 1 with refined topic scope |
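
The recovery table can also be expressed as data, which keeps the error handling in one place. In the sketch below the error keys are hypothetical identifiers. The restart steps for the first two rows are assumptions (the table names no step for them; step 2 is used because the Scout runs the survey there), while the last two rows follow the table directly.

```python
# error key -> (step to restart from, recovery action)
RECOVERY_PLAN = {
    "no_relevant_papers": (2, "Broaden search terms, try adjacent topics"),      # step assumed
    "low_novelty": (2, "Search for more recent work, try different angles"),     # step assumed
    "judge_fail": (3, "Regenerate ideas with the Judge's feedback as context"),
    "user_rejects_all": (1, "Refine the topic scope"),
}


def plan_recovery(error: str) -> tuple:
    """Look up the restart step and the recovery action for an error key."""
    if error not in RECOVERY_PLAN:
        raise ValueError(f"no recovery defined for: {error!r}")
    return RECOVERY_PLAN[error]


step, action = plan_recovery("user_rejects_all")
```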

Next Stage

When the gate passes, the pipeline advances to Design with the selected idea in ideas/selected.yaml.

AutoResearch — Multi-agent Deep Learning Research System