Research Idea Generation on Expert Evaluation Education Domain (test)
[Chart: Novelty Winrate vs Qwen. DeepInnovator: 80 (Feb 21, 2026)]
Updated 1mo ago
Evaluation Results

Method: DeepInnovator (evaluation_mode=Human..., 2026.02)

Metric                              Value
Novelty Winrate vs Qwen             80
Novelty Winrate vs GPT-4o           60
Feasibility Winrate vs Qwen         57.1
Feasibility Winrate vs GPT-4o       0
Effectiveness Winrate vs Qwen       53.3
Effectiveness Winrate vs GPT-4o     0
Detailedness Winrate vs Qwen        75
Detailedness Winrate vs GPT-4o      36.4