| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| STL-10 | FastBUS | Accuracy 78.48 | 22 | 1mo ago | |
| CIFAR-100 | FastBUS | Accuracy 64.45 | 22 | 1mo ago | |
| CIFAR-10 | FastBUS | Accuracy 75.39 | 22 | 1mo ago | |
| AlignBench | GPT-4 | Agreement 74.69 | 18 | 1mo ago | |
| DeepfakeJudge Meta-Human | | Pairwise Accuracy 99.4 | 12 | 1mo ago | |
| DeepfakeJudge Meta | DeepfakeJudge-7B | Pairwise Accuracy 96.2 | 12 | 1mo ago | |
| LLMEval | GPT-4 | Agreement 0.5098 | 10 | 1mo ago | |
| AUTO-J Eval-P | GPT-4 | Agreement 62.28 | 10 | 1mo ago | |
| SummEval (anchor set) | GPT-4o | Accuracy 94.5 | 6 | 1mo ago | |