| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Stealth Sycophancy Detection | Three-source Sycophancy Benchmark (test) | Spearman Correlation: 0.9567 | 17 |
| Sycophancy Detection | Sycophancy benchmark (full evaluation set) | AUROC: 0.732 | 12 |
| Vision Language Model Evaluation | Sycophancy Benchmark | Mean Score: 88.8 | 6 |