
Post-argument rating regression on Anthropic dataset September 2024

0.67 MSE

MS-PS-MLP

Updated 5mo ago

Evaluation Results

Method	Links	MSE	RMSE	(unlabeled)	(unlabeled)	(unlabeled)
MS-PS-MLP 2026.01		0.67	0.8185	0.6339	0.811	46.2
Baseline-2-o3 2026.01		0.9882	0.9941	0.731	0.721	37.9
Baseline-1-o3 2026.01		1.0068	1.0034	0.7394	0.716	37.4
Baseline-2-Gemma 2026.01		3.5854	1.8935	1.5347	-0.011	20.1
Baseline-1-Gemma 2026.01		4.3553	2.0869	1.6988	-0.229	18.4
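
The first numeric column matches the headline MSE for MS-PS-MLP (0.67), and in every row the second numeric column agrees with the square root of the first, which suggests it is RMSE. The page does not label that column, so this reading is an assumption. A minimal Python sketch of the check follows; the helper names `rmse_from_mse` and `check_rows` are hypothetical, and the values are copied from the table above.

```python
import math

# (method, first numeric column, second numeric column), copied from the table.
# Reading these as (MSE, RMSE) is an assumption: only MSE is named on the page.
rows = [
    ("MS-PS-MLP", 0.67, 0.8185),
    ("Baseline-2-o3", 0.9882, 0.9941),
    ("Baseline-1-o3", 1.0068, 1.0034),
    ("Baseline-2-Gemma", 3.5854, 1.8935),
    ("Baseline-1-Gemma", 4.3553, 2.0869),
]


def rmse_from_mse(mse: float) -> float:
    """RMSE is the square root of MSE."""
    return math.sqrt(mse)


def check_rows(table, tol: float = 5e-4) -> dict:
    """Map each method to whether sqrt(first column) matches the second column.

    The tolerance allows for the four-decimal rounding used in the table.
    """
    return {
        name: abs(rmse_from_mse(mse) - second) <= tol
        for name, mse, second in table
    }


if __name__ == "__main__":
    for name, ok in check_rows(rows).items():
        print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

The remaining three columns cannot be checked this way from the table alone, so they are left uninterpreted here.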