Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Anonymous Peer Review Dataset

Benchmarks

Task NameDataset NameSOTA ResultTrend
Peer Review EvaluationAnonymous Peer Review Dataset Overall Judgment
DeepReviewer 2.0 Win Rate69.77
1
Peer Review EvaluationAnonymous Peer Review Dataset Communication Clarity
DeepReviewer 2.0 Win Rate86.05
1
Peer Review EvaluationAnonymous Peer Review Dataset Analytical Depth
DeepReviewer 2.0 Win Rate58.14
1
Peer Review EvaluationAnonymous Peer Review Dataset Constructive Value
DeepReviewer 2.0 Win Rate84.5
1
Peer Review EvaluationAnonymous Peer Review Dataset Technical Accuracy
DeepReviewer 2.0 Win Rate59.69
1
Peer Review EvaluationAnonymous Peer Review Dataset All Dimensions micro
DeepReviewer 2.0 Win Rate71.63
1
Showing 6 of 6 rows