Research Idea Generation on Expert Evaluation Law Domain (test)
[Chart: Novelty (Winrate vs Qwen) for DeepInnovator: 69.2, Feb 21, 2026]
Updated 1mo ago
Evaluation Results

Method: DeepInnovator (evaluation_mode=Human..., 2026.02)

Criterion        Winrate vs Qwen    Winrate vs GPT-4o
Novelty          69.2               53.8
Feasibility      60                 33.3
Effectiveness    70                 45.5
Detailedness     76.9               50