Research Idea Ranking on D_group
[Chart: Listwise Score. Top entry: InnoEval, 76.03 (Feb 16, 2026). Metrics: Listwise Score, Accuracy.]
Updated 3mo ago
Evaluation Results
| Method | Method details | Date | Listwise Score | Accuracy |
|---|---|---|---|---|
| InnoEval | Backbone=DeepSeek-V3.2 | 2026.02 | 76.03 | 0.2209 |
| ScholarEval | Backbone=DeepSeek-V3.2 | 2026.02 | 70.13 | 0.1453 |
| InternAgent | Backbone=DeepSeek-V3.2 | 2026.02 | 68.46 | 0.1047 |
| ResearchAgent | Backbone=DeepSeek-V3.2 | 2026.02 | 66.52 | 0.0814 |
| RAG | Backbone=DeepSeek-V3.2 | 2026.02 | 57.88 | 0.0756 |
| GraphEval | Backbone=DeepSeek-V3.2 | 2026.02 | 57.7 | 0.0233 |
| CoT | Backbone=DeepSeek-V3.2 | 2026.02 | 57.59 | 0.0698 |