| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| DVDs | LIME | Average F1 | 96.6 | 16 | 4d ago |
| Books | LIME | Avg F1 | 96.7 | 16 | 4d ago |
| Trust-Memevo Tool-use Domain | TAME | No-Memory | 81.8 | 14 | 4d ago |
| Trust-Memevo Math Domain | Reasoningbank+Guard | No-Memory Score | 36.7 | 14 | 4d ago |
| Trust-Memevo Science Domain | Reasoningbank | No-Memory | 81.3 | 14 | 4d ago |
| AraTrust | LLaMA3-Tamed-70B | Accuracy | 63.41 | 8 | 4d ago |
| LLM Trustworthiness Benchmark | Mi:dm 2.0 Base-inst | Bias Score | 80.77 | 5 | 4d ago |
| Trustworthiness Average (human evaluation) | Sparse Activation Control | Control Win Rate | 0.88 | 2 | 4d ago |
| Adv Fact (human evaluation) | Sparse Activation Control | Control Wins | 68 | 1 | 4d ago |
| Privacy (human evaluation) | Sparse Activation Control | Control Wins | 100 | 1 | 4d ago |
| Robust (human evaluation) | Sparse Activation Control | Control Wins | 100 | 1 | 4d ago |
| Exag safety (human evaluation) | Sparse Activation Control | Control Wins | 68 | 1 | 4d ago |