| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| Excessive HH Harmless 1.0 (Evaluation) | IR3 Method B (Adversarial) | Reference Error Rate: 8.2 | 10 | 4d ago | |
| Synthetic Goodhart 1.0 (Evaluation) | IR3 Method B (Adversarial) | R_g: 4.38 | 10 | 4d ago | |
| Length Bias OA Length 1.0 (Evaluation) | IR3 Method C (Constrained) | Dominance: 15 | 9 | 4d ago | |