| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| TONEBANK | K-Steer | Avg Activation Change | 46 | 27 | 1mo ago |
| 20 Steering Concepts | Gemma 3 27B Steering (Media Centric) | Granularity | 1.1785 | 20 | 15d ago |
| ASR Evaluation Set | PID (ours) | ASR Accuracy | 94.85 | 20 | 2mo ago |
| AxBench Gemma-2-2B layer 20 | A-PSRMSE | Steering Score | 0.871 | 18 | 28d ago |
| DEBATEMIX | K-Steer | Avg. Activation Change | 56 | 18 | 1mo ago |
| AxBench Gemma-2-9B layer 20 | A-PSRMSE | Steering Score | 1.12 | 17 | 28d ago |
| Matched-bucket Diagnostic Buckets comparator export protocol | Edit-target ActAdd | Edit Reference Rate | 86.8 | 12 | 13d ago |
| Nitro-1-PixArt held-out concepts | | Input Fidelity | 44 | 7 | 22d ago |
| DMD2 (held-out concepts) | | Input Fidelity | 46.7 | 7 | 22d ago |
| Intentionality | Llama-4 | Steering Effect (Delta s) | 0.79 | 5 | 22d ago |
| Audience Awareness | Qwen3-0.6B | Steering Effect (Delta s) | 0 | 5 | 22d ago |
| Computational Effort | Qwen3-235B | Steering Effect (Delta s) | 1.13 | 5 | 22d ago |
| Risk Assessment | Qwen3-235B | Steering Effect (Delta s) | 0.49 | 5 | 22d ago |
| Self-Assessed Capability | Qwen3-235B | Steering Effect (Delta s) | 44 | 5 | 22d ago |
| Evaluation Awareness | Qwen3-14B | Steering Effect (Delta s) | 0.5 | 5 | 22d ago |