Human-Model Agreement Evaluation on Rebuttal-RM dataset 1.0 (test)

Attitude (Pearson r): 0.839
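For readers unfamiliar with the headline metric, the following is a minimal sketch of how a Pearson r agreement score between human ratings and model-assigned scores can be computed. The function name `pearson_r` and the sample score lists are hypothetical illustrations, not taken from the Rebuttal-RM evaluation code, and the benchmark's actual scoring pipeline may differ.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical example: human ratings vs. model scores for five responses.
human_scores = [1, 2, 3, 4, 5]
model_scores = [2, 4, 6, 8, 10]
agreement = pearson_r(human_scores, model_scores)
```

A value near 1 means the model's scores rise and fall with the human ratings; a value near 0 means little linear agreement.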

Rebuttal-RM

[Chart: Attitude (Pearson r) scores; date shown: Jan 22, 2026]
Updated 4d ago

Evaluation Results

Method    Links
2026.01   0.839  0.828  91  0.753  0.677  79  0.821  0.801  82  0.839  0.835  81  0.812
2026.01   0.743  0.712  80  0.739  0.671  75  0.779  0.763  74  0.804  0.756  68  0.745
2026.01   0.718  0.672  62  0.609  0.568  71  0.622  0.577  69  0.718  0.745  72  0.664
2026.01   0.699  0.733  71  0.687  0.578  74  0.697  0.652  77  0.771  0.719  75  0.692
2026.01   0.646  0.633  79  0.708  0.615  76  0.71   0.664  72  0.742  0.701  62  0.705
2026.01   0.62   0.509  75  0.605  0.593  54  0.627  0.607  52  0.711  0.705  61  0.616
2026.01   0.569  0.635  72  0.704  0.67   68  0.706  0.686  67  0.753  0.738  63  0.68
2026.01   0.42   0.475  46  0.467  0.436  73  0.369  0.361  70  0.561  0.519  57  0.506
2026.01   0.297  0.347  54  0.158  0.047  38  0.272  0.245  56  0.424  0.457  46  0.349