# Stage 1: Ideation
The ideation stage transforms a broad research topic into a concrete, evaluated research idea ready for experiment design.
## Entering This Stage

**What you have:**

- A research topic or area of interest (from the user)
- Optionally: a specific angle, constraint, or inspiration

**What you don't have yet:**

- Specific method design
- Baselines or comparisons
- Experiment plan
## Steps

```mermaid
graph TD
    A[1. Topic Scoping] --> B[2. Literature Survey]
    B --> C[3. Idea Generation]
    C --> D[4. Idea Refinement]
    D --> E[5. Idea Evaluation]
    E --> F{Judge Verdict}
    F -->|PASS| G[6. Idea Selection]
    F -->|REVISE| D
    F -->|FAIL| C
    G --> H[Gate: Advance to Design]
    style A fill:#dbeafe,stroke:#2563eb
    style E fill:#fef3c7,stroke:#d97706
    style G fill:#dcfce7,stroke:#16a34a
```

### 1. Topic Scoping
Agent: Orchestrator (with human)
The Orchestrator discusses the research topic with the user to narrow the scope:
- What area? (e.g., "efficient attention mechanisms")
- What constraints? (e.g., "must work for language modeling, 4x A100")
- What venue? (e.g., "ICML 2025")
- Any specific angles? (e.g., "interested in combining recurrence with hardware-aware methods")
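The agreed scope can be captured in a small artifact before the survey begins. A minimal sketch, assuming a hypothetical `ideas/scope.yaml` file — the filename and every field name here are illustrative, not part of the pipeline spec:

```yaml
# ideas/scope.yaml (illustrative; filename and schema are assumptions)
topic: "efficient attention mechanisms"
constraints:
  task: "language modeling"
  compute: "4x A100"
target_venue: "ICML 2025"
angles:
  - "combining recurrence with hardware-aware methods"
```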
### 2. Literature Survey
Agent: Scout (Gemini)
The Scout searches for recent work in the scoped area:
- Top-venue papers from the last 2 years
- Key baselines and their results
- Open problems and research gaps
- Concurrent work that might overlap
Output: `papers/related_work/summaries.yaml` (initial survey)
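The survey artifact might be structured as below. The document only names the path, so the schema is an assumption; the two papers are real ones that fit the running example:

```yaml
# papers/related_work/summaries.yaml (schema is illustrative)
papers:
  - title: "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
    venue: "NeurIPS 2022"
    relevance: "IO-aware tiling baseline"
  - title: "Retentive Network: A Successor to Transformer for Large Language Models"
    venue: "arXiv 2023"
    relevance: "recurrent attention formulation"
gaps:
  - "no published work combines IO-aware tiling with recurrent formulations"
```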
### 3. Idea Generation
Agent: Scout (Gemini)
Based on the literature survey, the Scout generates 3-5 candidate ideas:
- Each idea has a novelty angle and expected benefit
- Ideas are creative and expansive — filtering happens later
- No code feasibility requirement at this stage
Output: `ideas/brainstorm.md`
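A single brainstorm entry might look like the following; the structure is illustrative, since only the file path is specified:

```markdown
## Idea A: Flash-Recurrent Attention
- Novelty angle: apply IO-aware tiling to a recurrent attention formulation
- Expected benefit: O(n) memory with GPU-friendly compute patterns
- Open question: does tiling interact cleanly with the recurrent state update?
```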
### 4. Idea Refinement
Agent: Orchestrator (with human)
The Orchestrator and user discuss the candidate ideas:
- Which ideas are most promising?
- Can ideas be combined?
- Does the user have domain insights to add?
- What are the obvious risks?
This is an interactive step. The Orchestrator synthesizes user input with the Scout's suggestions.
Output: `ideas/candidates.yaml` (ranked, refined ideas)
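The ranked candidates file could be shaped as follows; field names are an assumption based on what this step produces:

```yaml
# ideas/candidates.yaml (schema is illustrative)
candidates:
  - rank: 1
    title: "Flash-Recurrent Attention"
    novelty_angle: "integration of IO-aware tiling with recurrent attention"
    user_input: "has prior experience writing fused GPU kernels"
    risks:
      - "kernel engineering effort may exceed the compute/time budget"
```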
> **This is where human expertise matters most.** The ideation stage is where your domain knowledge has the highest leverage. A good idea with a clear novelty angle saves weeks of work. Take time here.
### 5. Idea Evaluation
Agent: Judge (Codex, `codex exec`)
The Judge independently evaluates the top 1-2 candidate ideas across five dimensions:
| Dimension | What's Assessed |
|---|---|
| Novelty | Has this been done? How different is it from prior work? |
| Feasibility | Can it be implemented within constraints (time, compute)? |
| Verifiability | Can the claims be empirically validated? |
| Attack Surface | What will reviewers criticize? |
| Impact | If it works, how significant is the contribution? |
Output: `reviews/idea_review.yaml`
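A review artifact covering the five dimensions might look like this. The schema is an assumption; the scores mirror the selection example later in this page, with attack surface recorded as findings rather than a score:

```yaml
# reviews/idea_review.yaml (schema is illustrative)
idea: "Flash-Recurrent Attention"
scores:
  novelty: 7
  feasibility: 8
  verifiability: 9
  impact: 6
attack_surface:
  - "reviewers may ask for comparisons against linear attention variants"
verdict: PASS   # one of: PASS | REVISE | FAIL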
> **The Judge is independent.** The Judge doesn't know which idea the user prefers or what the Orchestrator recommended. It evaluates purely on the idea description and constraints.
### 6. Idea Selection
Agent: Orchestrator (with human at human gate)
The Orchestrator presents:
- The refined ideas
- The Judge's evaluation for each
- A recommendation
The user makes the final selection.
Output: `ideas/selected.yaml`
```yaml
# ideas/selected.yaml
selected_idea:
  title: "Flash-Recurrent Attention"
  description: |
    Combine flash attention's IO-aware tiling strategy with RetNet's
    recurrent formulation to achieve O(n) memory with hardware-efficient
    compute patterns.
  novelty: "Integration of flash attention tiling with recurrent attention is unexplored"
  judge_verdict: PASS
  judge_scores:
    novelty: 7
    feasibility: 8
    verifiability: 9
    impact: 6
  selection_rationale: |
    Best balance of novelty and feasibility. Judge's attack surface
    analysis identified manageable risks. User has relevant domain expertise.
```

## Gate
| Gate Type | Recommended | Behavior |
|---|---|---|
| `human` | Yes | User reviews the selected idea before advancing |
| `auto-judge` | Possible | Judge verdict determines advancement |
| `auto` | Not recommended | Research direction should not be automated |
> **Always use a human gate for ideation.** Your research direction is the most important decision in the entire project. Automating this saves minutes but risks weeks of wasted work on a bad idea.
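In a pipeline configuration, the gate choice might be expressed as a per-stage setting. The syntax below is purely illustrative; the document does not specify a config format:

```yaml
# pipeline config fragment (syntax is an assumption)
stages:
  ideation:
    gate: human   # one of: human | auto-judge | auto
```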
## Error Handling
| Error | Recovery |
|---|---|
| Scout returns no relevant papers | Broaden search terms, try adjacent topics |
| All ideas score low on novelty | Scout searches for more recent work, try different angles |
| Judge gives FAIL verdict | Return to step 3 with Judge's feedback as additional context |
| User rejects all ideas | Return to step 1 with refined topic scope |
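The verdict-driven control flow from the diagram and the recovery table can be sketched as a simple loop. This is a minimal sketch, not the pipeline's actual implementation: `generate_ideas`, `refine`, and `judge` are placeholders for the Scout, Orchestrator, and Judge agent calls, each assumed to return plain dicts.

```python
def run_ideation(generate_ideas, refine, judge, max_rounds=5):
    """Drive steps 3-5 until the Judge passes an idea or rounds run out.

    generate_ideas, refine, judge stand in for the Scout, Orchestrator,
    and Judge agents; the dict shapes here are illustrative.
    """
    ideas = generate_ideas(None)  # step 3: initial brainstorm
    for _ in range(max_rounds):
        candidate = refine(ideas)             # step 4: refine with the user
        review = judge(candidate)             # step 5: independent evaluation
        if review["verdict"] == "PASS":
            return candidate, review          # step 6: present for selection
        if review["verdict"] == "REVISE":
            ideas = [candidate]               # back to step 4 with feedback
            continue
        # FAIL: regenerate ideas using the Judge's feedback (back to step 3)
        ideas = generate_ideas(review.get("feedback"))
    raise RuntimeError("no idea passed evaluation; rescope the topic (step 1)")
```

The loop mirrors the gate semantics: PASS exits forward, REVISE re-enters refinement, and FAIL regenerates ideas with the Judge's feedback as added context.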
## Next Stage

When the gate passes, the pipeline advances to Design with the selected idea in `ideas/selected.yaml`.