| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| TVIR-BENCH | TVIR-Agent | Content Score (CS) | 68.64 | 9 | 1d ago |
| DeepResearch Bench | PTAH | DLB | 3.72 | 7 | 5d ago |
| DeepConsult | PTAH | Instruction Adherence Score | 13.73 | 7 | 5d ago |
| WildSeek Text-Centric Complex Queries | CogGen (Ours) | Organization | 53.89 | 6 | 1mo ago |
| OWID High-Density Multimodal Reports | | Organization | 49.86 | 6 | 1mo ago |