| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Quora 915 queries (test) | LogicJudge | Agreement Accuracy | 73 | 19 | 4d ago |
| Zhihu 847 queries (test) | LogicJudge | Agreement Accuracy | 0.75 | 19 | 4d ago |
| DeepResearch 1319 queries (test) | LogicJudge | Agreement Accuracy | 74.5 | 19 | 4d ago |