| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| LLM Steering | Persona | Average Trait Score: 81.75 | 18 |
| Watch duration prediction | Persona B | SMAPE: 0.86 | 4 |
| Watch duration prediction | Persona A | SMAPE: 0.617 | 4 |
| Stylized Dialogue | 148-query style-1 persona (test) | Context Score: 4.257 | 3 |
| Dialogue Quality Evaluation | Persona High Info | BF1 (qt, at): 0.62 | 1 |
| Dialogue Quality Evaluation | Persona Med. Info | BF1 (qt, at): 61 | 1 |
| Dialogue Quality Evaluation | Persona Low Info | BF1 (qt, at): 61 | 1 |
| Dialogue Quality Evaluation | Persona Sing. Inst. | BF1 (qt, at): 0.58 | 1 |