| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | WebQA | CACC 46.7 | 40 |
| Uncertainty Estimation | WebQA | AUROC 73.57 | 30 |
| Multimodal Question Answering | WebQA | F1-Recall 90.92 | 22 |
| Multi-modal Retrieval (Image Query) | WebQA | Recall@20 43.55 | 21 |
| Multi-modal Retrieval (Text Query) | WebQA | Recall@20 76.52 | 21 |
| Multi-modal Retrieval (Text to Text/Image-Text) | WebQA | Recall@5 84.7 | 19 |
| Poisoned Sample Detection | WebQA (IID) | Recall 100 | 16 |
| Poisoned Sample Detection | WebQA NIID-1 | Recall 99.12 | 16 |
| Watermark Detection | WebQA | Rank 1.05 | 16 |
| Image-based Question Answering | WebQA | Accuracy 53.9 | 14 |
| Narrative Reasoning | WebQA (test) | BLEURT 0.623 | 14 |
| Visual Question Answering | WebQA image segment 1.0 (test) | Accuracy 49.8 | 12 |
| Multimodal Question Answering | WebQA k=2 | ROrig@k 64.8 | 8 |
| Multimodal Retrieval | WebQA 2 | R@5 83.15 | 6 |
| Multimodal Retrieval | WebQA 1 | Recall@5 95.19 | 6 |
| Retrieval | WebQA (test) | Recall@5 74.9 | 5 |
| Open-domain Question Answering | WebQA (test) | Accuracy 53.1 | 5 |
| Text-to-Text Retrieval (qt -> Ct) | WebQA | Recall@5 84.7 | 4 |
| Visual Question Answering | WebQA (val) | FL 53 | 4 |