| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| CIFAR-100 (test) | FedHB-NIW | Accuracy | 82.71 | 80 | 3mo ago |
| Multi-Bench (MB) | MIPO | Win Rate | 94.84 | 45 | 2mo ago |
| PRISM | MIPO | Personalization Win Rate | 81.62 | 45 | 2mo ago |
| Community Alignment (CA) | MIPO | Personalization Win-Rate | 93.67 | 45 | 2mo ago |
| Personal | Amulet | Creative Score (ArmoRM) | 0.998 | 33 | 1d ago |
| LaMP-2 | PRISP | Acc | 67.9 | 22 | 3mo ago |
| LaMP-3 | PersonaAgent | MAE | 0.241 | 21 | 5d ago |
| Ultra Chat | Amulet | Creative ArmoRM Score | 57 | 18 | 1d ago |
| Truthful QA | Amulet | Creative Score (ArmoRM) | 56 | 18 | 1d ago |
| HelpSteer | T-POP | Creative ArmoRM Score | 0.51 | 18 | 1d ago |
| GOQA | RPM | Accuracy | 85.2 | 14 | 3mo ago |
| LaMP-5 | RPM | ROUGE-1 | 49.2 | 14 | 3mo ago |
| Personalization Prompts SANA | DiT-BlockSkip | DINO Score | 0.7388 | 11 | 2mo ago |
| Personalization Prompts FLUX | LISA | DINO Score | 0.7387 | 11 | 2mo ago |
| Synthetic personalized interaction datasets (eval) | PPOpt | Personalization Score | 7.2 | 10 | 3mo ago |
| LaMP-4 | OPPU | ROUGE-1 | 21.2 | 8 | 3mo ago |
| LaMP-1 | Per-Pcs | Accuracy | 65.6 | 8 | 3mo ago |
| Real-world (test) | PPOpt | Score | 7.35 | 6 | 3mo ago |
| Qwen2.5-14B Wealth-Seeking | Sparse with Contrastive Pruning | Wealth-Seeking Score | 67.5 | 6 | 3mo ago |
| Qwen2.5-14B Power-Seeking | Prompt | Power-Seeking | 0.445 | 6 | 3mo ago |
| DreamBench-Abs Single-Concept 1.0 | Emu2 | CP | 0.73 | 5 | 3mo ago |
| DreamBench-Abs Multi-Concept 1.0 | Mod-Adapter | CP | 0.7 | 5 | 3mo ago |
| ELIX hard (550 users) | FSPO | Winrate | 91.8 | 4 | 1mo ago |
| ELIX easy 550 users | FSPO | Winrate | 97.8 | 4 | 1mo ago |
| Online Shopping 1.0 (Phase 4) | PAHF (pre+post) | Success Rate | 0.703 | 4 | 3mo ago |