Binary Research Idea Classification on D_point NeurIPS25 and ICLR25
[Chart: Acc2 and F1 Score 2 over time. Best result: InnoEval, Acc2 75.58, Feb 16, 2026. Updated 3mo ago.]
Evaluation Results
Method          Backbone        Date      Acc2    F1 Score 2
InnoEval        DeepSeek-V3.2   2026.02   75.58   75.74
ScholarEval     DeepSeek-V3.2   2026.02   65.44   65.02
InternAgent     DeepSeek-V3.2   2026.02   60.37   60.12
ResearchAgent   DeepSeek-V3.2   2026.02   59.48   56.44
GraphEval       DeepSeek-V3.2   2026.02   54.83   45.71
RAG             DeepSeek-V3.2   2026.02   42.4    38.29
CoT             DeepSeek-V3.2   2026.02   40.55   35.98