| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| TruthfulQA | MetaCrit + Claude-3.5-Sonnet | Truthfulness Accuracy | 97.55 | 86 | 4d ago |
| TruthfulQA | SEA | Reward | -1.8 | 32 | 1mo ago |
| TruthfulQA (test) | ART-beam | Accuracy | 46.4 | 20 | 9d ago |
| TruthfulQA | AHD | Truthfulness Score | 41.98 | 16 | 3d ago |
| TruthfulQA | PROBELLM | MA | 0.81 | 12 | 1mo ago |
| TruthfulQA | PSFT | Truthfulness Avg.@8 | 68.69 | 10 | 4d ago |
| TruthfulQA | Cosine | TruthfulQA | 42 | 8 | 1mo ago |
| TruthfulQA DE | T-Free | Normalized Probability Mass | 36.2 | 4 | 1mo ago |
| TruthfulQA | T-Free | Normalized Probability Mass | 36.4 | 4 | 1mo ago |
| TruthfulQA DE 6-shot (test) | Llama | Normalized Probability Mass | 34.2 | 3 | 1mo ago |
| TruthfulQA DE | Llama-Instruct | Norm. Prob. Mass | 17.4 | 2 | 1mo ago |
| TruthfulQA Māori | BYOL-mri (12B-M) | Accuracy | 49.69 | 2 | 1mo ago |