| Dataset Name | SOTA Method | Metric | Value | Count | Updated |
|---|---|---|---|---|---|
| Principle-based evaluation dataset | | G Score | 8.68 | 12 | 1mo ago |
| Toxicity | | Steering Success | 64 | 11 | 22d ago |
| QA | | Steering Success | 62.5 | 11 | 22d ago |
| Jailbreak | | Steering Success | 82.5 | 11 | 22d ago |
| Emotion | SpotLight | Steering Success | 92.7 | 11 | 22d ago |
| AI Persona | InstABoost | Steering Success | 92.5 | 11 | 22d ago |
| Human study | Task Tokens | Human-likeness Win Rate | 93 | 6 | 20d ago |