Idea Generation Evaluation on SGI-bench
[Chart: Novelty score over time. Highlighted point: DeepInnovator, 80.53, Feb 21, 2026. Selectable metrics: Novelty, Feasibility, Effectiveness, Detailedness.]
Updated 1mo ago
Evaluation Results
| Method | Opponent Model | Date | Novelty | Feasibility | Effectiveness | Detailedness |
|---|---|---|---|---|---|---|
| DeepInnovator | Qwen-14... | 2026.02 | 80.53 | 87.61 | 92.92 | 93.81 |
| DeepInnovator | Minimax... | 2026.02 | 59.29 | 10.62 | 50.44 | 61.06 |
| DeepInnovator | GPT-4o | 2026.02 | 49.56 | 11.5 | 76.11 | 44.25 |
| DeepInnovator | Deepsee... | 2026.02 | 47.79 | 23.01 | 45.13 | 59.29 |
| DeepInnovator | Grok-4.1 | 2026.02 | 37.17 | 10.62 | 43.36 | 45.13 |
| DeepInnovator | Qwen3-max | 2026.02 | 32.74 | 19.47 | 66.37 | 91.15 |
| DeepInnovator | Gemini-... | 2026.02 | 15.93 | 18.58 | 43.36 | 68.14 |