# Stage 1: Ideation
The ideation stage transforms a broad research topic into a concrete, evaluated research idea ready for experiment design.
## Entering This Stage

**What you have:**

- A research topic or area of interest (from the user)
- Optionally: a specific angle, constraint, or inspiration

**What you don't have yet:**

- Specific method design
- Baselines or comparisons
- Experiment plan
## Steps

```mermaid
graph TD
    A[1. Topic Scoping] --> B[2. Literature Survey]
    B --> C[3. Idea Generation]
    C --> D[4. Idea Refinement]
    D --> E[5. Idea Evaluation]
    E --> F{Judge Verdict}
    F -->|PASS| G[6. Idea Selection]
    F -->|REVISE| D
    F -->|FAIL| C
    G --> H[Gate: Advance to Design]
    style A fill:#dbeafe,stroke:#2563eb
    style E fill:#fef3c7,stroke:#d97706
    style G fill:#dcfce7,stroke:#16a34a
```

### 1. Topic Scoping
Agent: Orchestrator (with human)
The Orchestrator discusses the research topic with the user to narrow the scope:
- What area? (e.g., "efficient attention mechanisms")
- What constraints? (e.g., "must work for language modeling, 4x A100")
- What venue? (e.g., "ICML 2025")
- Any specific angles? (e.g., "interested in combining recurrence with hardware-aware methods")
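The agreed scope can be captured in a small artifact before the survey begins. A minimal sketch, assuming a hypothetical `ideas/scope.yaml` file — the filename and every field name here are illustrative, not part of the pipeline spec:

```yaml
# ideas/scope.yaml (illustrative; filename and schema are assumptions)
topic: "efficient attention mechanisms"
constraints:
  task: "language modeling"
  compute: "4x A100"
target_venue: "ICML 2025"
angles:
  - "combining recurrence with hardware-aware methods"
```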
### 2. Literature Survey
Agent: Scout (Gemini)
The Scout searches for recent work in the scoped area:
- Top-venue papers from the last 2 years
- Key baselines and their results
- Open problems and research gaps
- Concurrent work that might overlap
Output: `papers/related_work/summaries.yaml` (initial survey)
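The survey artifact might be structured as below. The document only names the path, so the schema is an assumption; the two papers are real ones that fit the running example:

```yaml
# papers/related_work/summaries.yaml (schema is illustrative)
papers:
  - title: "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness"
    venue: "NeurIPS 2022"
    relevance: "IO-aware tiling baseline"
  - title: "Retentive Network: A Successor to Transformer for Large Language Models"
    venue: "arXiv 2023"
    relevance: "recurrent attention formulation"
gaps:
  - "no published work combines IO-aware tiling with recurrent formulations"
```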
### 3. Idea Generation
Agent: Scout (Gemini)
Based on the literature survey, the Scout generates 3-5 candidate ideas:
- Each idea has a novelty angle and expected benefit
- Ideas are creative and expansive — filtering happens later
- No code feasibility requirement at this stage
Output: `ideas/brainstorm.md`
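A single brainstorm entry might look like the following; the structure is illustrative, since only the file path is specified:

```markdown
## Idea A: Flash-Recurrent Attention
- Novelty angle: apply IO-aware tiling to a recurrent attention formulation
- Expected benefit: O(n) memory with GPU-friendly compute patterns
- Open question: does tiling interact cleanly with the recurrent state update?
```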
### 4. Idea Refinement
Agent: Orchestrator (with human)
The Orchestrator and user discuss the candidate ideas:
- Which ideas are most promising?
- Can ideas be combined?
- Does the user have domain insights to add?
- What are the obvious risks?
This is an interactive step. The Orchestrator synthesizes user input with the Scout's suggestions.
Output: `ideas/candidates.yaml` (ranked, refined ideas)
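The ranked candidates file could be shaped as follows; field names are an assumption based on what this step produces:

```yaml
# ideas/candidates.yaml (schema is illustrative)
candidates:
  - rank: 1
    title: "Flash-Recurrent Attention"
    novelty_angle: "integration of IO-aware tiling with recurrent attention"
    user_input: "has prior experience writing fused GPU kernels"
    risks:
      - "kernel engineering effort may exceed the compute/time budget"
```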
> **This is where human expertise matters most.** The ideation stage is where your domain knowledge has the highest leverage. A good idea with a clear novelty angle saves weeks of work. Take time here.
### 5. Idea Evaluation
Agent: Judge (Codex, `codex exec`)
The Judge independently evaluates the top 1-2 candidate ideas across five dimensions:
| Dimension | What's Assessed |
|---|---|
| Novelty | Has this been done? How different is it from prior work? |
| Feasibility | Can it be implemented within constraints (time, compute)? |
| Verifiability | Can the claims be empirically validated? |
| Attack Surface | What will reviewers criticize? |
| Impact | If it works, how significant is the contribution? |
Output: `reviews/idea_review.yaml`
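A review artifact covering the five dimensions might look like this. The schema is an assumption; the scores mirror the selection example later in this page, with attack surface recorded as findings rather than a score:

```yaml
# reviews/idea_review.yaml (schema is illustrative)
idea: "Flash-Recurrent Attention"
scores:
  novelty: 7
  feasibility: 8
  verifiability: 9
  impact: 6
attack_surface:
  - "reviewers may ask for comparisons against linear attention variants"
verdict: PASS   # one of: PASS | REVISE | FAIL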
> **The Judge is independent.** The Judge doesn't know which idea the user prefers or what the Orchestrator recommended. It evaluates purely on the idea description and constraints.
### 6. Idea Selection
Agent: Orchestrator (with human at human gate)
The Orchestrator presents:
- The refined ideas
- The Judge's evaluation for each
- A recommendation
The user makes the final selection.
Output: `ideas/selected.yaml`
```yaml
# ideas/selected.yaml
selected_idea:
  title: "Flash-Recurrent Attention"
  description: |
    Combine flash attention's IO-aware tiling strategy with RetNet's
    recurrent formulation to achieve O(n) memory with hardware-efficient
    compute patterns.
  novelty: "Integration of flash attention tiling with recurrent attention is unexplored"
  judge_verdict: PASS
  judge_scores:
    novelty: 7
    feasibility: 8
    verifiability: 9
    impact: 6
  selection_rationale: |
    Best balance of novelty and feasibility. Judge's attack surface
    analysis identified manageable risks. User has relevant domain expertise.
```

## Gate
| Gate Type | Recommended | Behavior |
|---|---|---|
| `human` | Yes | User reviews the selected idea before advancing |
| `auto-judge` | Possible | Judge verdict determines advancement |
| `auto` | Not recommended | Research direction should not be automated |
> **Always use a human gate for ideation.** Your research direction is the most important decision in the entire project. Automating this saves minutes but risks weeks of wasted work on a bad idea.
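In a pipeline configuration, the gate choice might be expressed as a per-stage setting. The syntax below is purely illustrative; the document does not specify a config format:

```yaml
# pipeline config fragment (syntax is an assumption)
stages:
  ideation:
    gate: human   # one of: human | auto-judge | auto
```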
## Error Handling
| Error | Recovery |
|---|---|
| Scout returns no relevant papers | Broaden search terms, try adjacent topics |
| All ideas score low on novelty | Scout searches for more recent work, try different angles |
| Judge gives FAIL verdict | Return to step 3 with Judge's feedback as additional context |
| User rejects all ideas | Return to step 1 with refined topic scope |
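The verdict-driven control flow from the diagram and the recovery table can be sketched as a simple loop. This is a minimal sketch, not the pipeline's actual implementation: `generate_ideas`, `refine`, and `judge` are placeholders for the Scout, Orchestrator, and Judge agent calls, each assumed to return plain dicts.

```python
def run_ideation(generate_ideas, refine, judge, max_rounds=5):
    """Drive steps 3-5 until the Judge passes an idea or rounds run out.

    generate_ideas, refine, judge stand in for the Scout, Orchestrator,
    and Judge agents; the dict shapes here are illustrative.
    """
    ideas = generate_ideas(None)  # step 3: initial brainstorm
    for _ in range(max_rounds):
        candidate = refine(ideas)             # step 4: refine with the user
        review = judge(candidate)             # step 5: independent evaluation
        if review["verdict"] == "PASS":
            return candidate, review          # step 6: present for selection
        if review["verdict"] == "REVISE":
            ideas = [candidate]               # back to step 4 with feedback
            continue
        # FAIL: regenerate ideas using the Judge's feedback (back to step 3)
        ideas = generate_ideas(review.get("feedback"))
    raise RuntimeError("no idea passed evaluation; rescope the topic (step 1)")
```

The loop mirrors the gate semantics: PASS exits forward, REVISE re-enters refinement, and FAIL regenerates ideas with the Judge's feedback as added context.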
## Next Stage

When the gate passes, the pipeline advances to Design with the selected idea in `ideas/selected.yaml`.