Research Idea Generation on Expert Evaluation Law Domain (test)
[Chart: Novelty (Winrate vs Qwen) by date. DeepInnovator: 69.2 (Feb 21, 2026)]
Updated 4d ago
Evaluation Results
Method: DeepInnovator (evaluation_mode=Human..., 2026.02)

Metric                              Score
Novelty (Winrate vs Qwen)           69.2
Novelty (Winrate vs GPT-4o)         53.8
Feasibility (Winrate vs Qwen)       60
Feasibility (Winrate vs GPT-4o)     33.3
Effectiveness (Winrate vs Qwen)     70
Effectiveness (Winrate vs GPT-4o)   45.5
Detailedness (Winrate vs Qwen)      76.9
Detailedness (Winrate vs GPT-4o)    50