Preference Learning

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Count	Updated
Toy dataset 0% label noise (test)	SimPO	Accuracy	99.6	76	4mo ago
Toy dataset 50% label noise (test)	SSPO	Accuracy	75.7	24	4mo ago
Driving Simulated	Infogain	Alignment	0.948	15	4mo ago
Toy dataset Noise 30% (test)	SSPO	Accuracy	0.739	12	4mo ago
Toy dataset Noise 10% (test)	SSPO	Accuracy	93.1	12	4mo ago
Simulated Matchmaking Environment row-norm sampling, p_flip=0.2, T=6,400	TK	Like Rate	80	6	3mo ago
Tennis (test)	C-GPM	Test AUC	0.58	4	4mo ago
Pokémon (test)	GPM	Test AUC	86	4	4mo ago
Chameleon (test)	GPM	Test AUC	92	4	4mo ago
Synthetic (test)	GPM	Test AUC	98	4	4mo ago
Robot Voice Design Simulated	Infogain	Alignment	0.852	3	4mo ago
Robot Face Design Simulated	Infogain	Alignment	96	3	4mo ago
Lunar Lander Simulated	Infogain	Alignment	93.3	3	4mo ago
Anthropic HH-RLHF+VI Preference (test)	MC-STL	Overall Accuracy	64	3	4mo ago
ML-100K (test)		AUC	69.5	2	4mo ago
UCI (test)		AUC	56.5	2	4mo ago
3 Grades (test)		AUC	53.22	2	4mo ago
LSAT (test)	Spectral algorithm	AUC	0.707	2	4mo ago
Website (test)	C-GPM	Test AUC	0.66	2	4mo ago
