| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Preference Prediction | PRISM (test) | Accuracy 66.62 | 51 |
| Personalized Reward Modeling | PRISM Personalized | Accuracy 68.06 | 44 |
| Preference Alignment | PRISM | Win-Rate (DPO) 74.5 | 20 |
| text-to-image generation | PRISM | Alignment Score 87.1 | 14 |
| LLM as a Judge | PRISM (test) | Accuracy 58.9 | 14 |
| Emotion and Micro-expression Analysis | PRISM | Macro-expression Accuracy 80.2 | 13 |
| Personalized Reward Modeling | PRISM Overall | User-level Accuracy 65.3 | 11 |
| Personalized Reward Modeling | PRISM Unseen | User-level Accuracy 0.652 | 11 |
| Personalized Reward Modeling | PRISM Seen | User-level Accuracy 65.3 | 11 |
| Preference Alignment Evaluation | PRISM (test) | BT Score (Mean) 0.331 | 10 |
| Model Selection Evaluation | PRISM | Actual Score (per type) 93.2 | 5 |
| Preference Alignment | PRISM 1.0 (test) | Borda Average 2.393 | 5 |
| Preference Alignment | PRISM normalized-step (test) | Borda Avg 2.328 | 5 |
| Preference Alignment | PRISM 1.0 (full) | Borda Avg Score 2.459 | 5 |
| Consensus Ranking | PRISM Llama-3.2-1B | Exact Match 94 | 1 |