Review Quality Evaluation on Scientific Papers
200 sampled papers (random sample)
[Chart: GraphReview, Technical Depth; axis ticks 58.4, 69.2, 80, 90.8, 100; May 26, 2026. Selectable metrics: Technical Depth, Evidence Grounding, Scientific Rigor, Revision Utility, Overall Preference]
Updated 7d ago
Evaluation Results
| Method | Comparison | Date | Technical Depth | Evidence Grounding | Scientific Rigor | Revision Utility | Overall Preference |
|---|---|---|---|---|---|---|---|
| GraphReview | vs Baseline=CNPE-7B | 2026.05 | 100 | 100 | 100 | 100 | 100 |
| GraphReview | vs Baseline=DeepReview-7B | 2026.05 | 99.5 | 99.5 | 99.5 | 99.5 | 99.5 |
| GraphReview | vs Baseline=DeepReview... | 2026.05 | 99 | 96 | 99 | 98 | 99 |
| GraphReview | vs Baseline=DeepSeek V3.2 | 2026.05 | 98 | 94.5 | 98 | 99 | 98 |
| GraphReview | vs Baseline=Gemini-2.5... | 2026.05 | 97.5 | 93.5 | 98 | 98.5 | 98 |
| GraphReview | vs Baseline=GPT-5-Mini | 2026.05 | 60 | 55 | 62 | 76 | 62 |