| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| TruthfulQA | | Accuracy | 70.8 | 103 | 1mo ago |
| TruthfulQA | Aligner | Reliability Score | 16.9 | 33 | 9d ago |
| TruthfulQA (test) | PromptCD | MC1 | 54.95 | 30 | 1mo ago |
| TruthfulQA medical (test) | BioMistral 7B TIES | Health Score | 83.6 | 22 | 1mo ago |
| TruthfulQA | Council Mode | TruthfulQA Score | 82.6 | 12 | 12d ago |
| TruthfulQA | IPO | Normalized Accuracy | 58.76 | 10 | 1mo ago |
| TruthfulQA | Llama3.1-8B-Instruct | Average Score (@8) | 68.69 | 8 | 4d ago |