| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| TruthfulQA | MetaCrit + Claude-3.5-Sonnet | Truthfulness Accuracy | 97.55 | 86 | 1mo ago |
| TruthfulQA | AgentRevive | Truthfulness Accuracy | 72.36 | 51 | 14d ago |
| TruthfulQA | DPO + MaPPO | TruthfulQA | 59.2 | 32 | 21d ago |
| TruthfulQA | SEA | Reward | -1.8 | 32 | 2mo ago |
| TruthfulQA | MADS8B | TruthfulQA Score | 68.3 | 20 | 1d ago |
| TruthfulQA (test) | ART-beam | Accuracy | 46.4 | 20 | 1mo ago |
| TruthfulQA | AHD | Truthfulness Score | 41.98 | 16 | 1mo ago |
| TruthfulQA 4 options (test) | | Accuracy (ACC) | 80.1 | 14 | 20h ago |
| TruthfulQA | PROBELLM | MA | .81 | 12 | 3mo ago |
| TruthfulQA | PSFT | Truthfulness Avg.@8 | 68.69 | 10 | 1mo ago |
| TruthfulQA MC2 | Qwen2.5-7B | Truthfulness Score (MC2) | 56.4 | 5 | 1d ago |
| TruthfulQA DE | T-Free | Normalized Probability Mass | 36.2 | 4 | 2mo ago |
| TruthfulQA | T-Free | Normalized Probability Mass | 36.4 | 4 | 2mo ago |
| TruthfulQA DE 6-shot (test) | Llama | Normalized Probability Mass | 34.2 | 3 | 2mo ago |
| TruthfulQA DE | Llama-Instruct | Normalized Probability Mass | 17.4 | 2 | 2mo ago |
| TruthfulQA Māori | BYOL-mri (12B-M) | Accuracy | 49.69 | 2 | 3mo ago |