Personalization

Benchmarks

Dataset Name	SOTA Method	Metric	Value		Updated
CIFAR-100 (test)	FedHB-NIW	Accuracy	82.71	80	4mo ago
Multi-Bench (MB)	MIPO	Win Rate	94.84	45	4mo ago
PRISM	MIPO	Personalization Win Rate	81.62	45	4mo ago
Community Alignment (CA)	MIPO	Personalization Win-Rate	93.67	45	4mo ago
Personal	Amulet	Creative Score (ArmoRM)	0.998	33	1mo ago
LaMP-2	PRISP	Acc	67.9	22	4mo ago
LaMP-3	PersonaAgent	MAE	0.241	21	1mo ago
Personalization Dataset	DB + DeGu	CLIP-I Score	0.844	19	23d ago
Ultra Chat	Amulet	Creative ArmoRM Score	57	18	1mo ago
Truthful QA	Amulet	Creative Score (ArmoRM)	56	18	1mo ago
HelpSteer	T-POP	Creative ArmoRM Score	0.51	18	1mo ago
GOQA	RPM	Accuracy	85.2	14	4mo ago
LaMP-5	RPM	ROUGE-1	49.2	14	4mo ago
Personalization Prompts SANA	DiT-BlockSkip	DINO Score	0.7388	11	4mo ago
Personalization Prompts FLUX	LISA	DINO Score	0.7387	11	4mo ago
Synthetic personalized interaction datasets (eval)	PPOpt	Personalization Score	7.2	10	4mo ago
LaMP-4	OPPU	ROUGE-1	21.2	8	4mo ago
LaMP-1	Per-Pcs	Accuracy	65.6	8	4mo ago
Real-world (test)	PPOpt	Score	7.35	6	4mo ago
Qwen2.5-14B Wealth-Seeking	Sparse with Contrastive Pruning	Wealth-Seeking Score	67.5	6	4mo ago
Qwen2.5-14B Power-Seeking	Prompt	Power-Seeking	0.445	6	4mo ago
DreamBench-Abs Single-Concept 1.0	Emu2	CP	0.73	5	4mo ago
DreamBench-Abs Multi-Concept 1.0	Mod-Adapter	CP	0.7	5	4mo ago
PrefEval (test)	EPnG	Accuracy	38.33	4	22d ago
ELIX hard (550 users)	FSPO	Winrate	91.8	4	3mo ago

Showing 25 of 30 rows