Reward Modeling Suitability Evaluation on RM Bench Safety-accept

Top entry: ArmoRM-Llama3-8B-v0.1, 0.698 EF (Nov 30, 2025)

Evaluation Results

Method  Links
2025.11   0.698  -0.118  -0.176  -0.816   0.519   0.357  -0.297   0.413   1.179   1.193
2025.11   0.417   0.291   0.691   0.422   0.462  -0.141   0.004   0.143   1.047   0.569
2025.11   0.359   0.228    0.47   0.365   0.484   0.078   0.181   0.171   0.632   0.284
2025.11   0.224   0.137   0.893   0.897   0.435    0.67    0.18   0.291   1.308   1.548
2025.11   0.203   0.256   0.871   0.578   0.496   0.108    0.11   0.313   1.106   0.603
2025.11   0.198   0.008  -0.301   0.045   0.049  -0.102   0.048   0.188   1.123  -0.102
2025.11   0.187  -0.013   0.577   1.111   0.462  -0.278  -0.073    0.08   0.343   0.221
2025.11    0.16   0.277   0.421   0.174   0.564   0.226  -0.149   0.242   0.582    0.54
2025.11   0.149  -0.002  -0.132   0.016   0.242  -0.019   0.029   0.172   0.381   0.296
2025.11   0.126  -0.005   0.025   0.186   0.021  -0.107  -0.085   0.042   0.061  -0.075
2025.11   0.083   0.008   0.135   0.125   0.183    0.12   0.139   0.145   0.222   0.328
2025.11   0.071  -0.189     0.2   0.346   0.287   0.164   0.102    0.15   0.174   0.323
2025.11   0.065   0.044   0.247   0.292   0.262  -0.043  -0.233   0.244    0.09   0.215
2025.11   0.024    0.14   0.039  -0.092   0.312  -0.144   0.029   0.182   1.277   0.087
2025.11  -0.054   0.148  -0.047   0.197   0.266   -0.07   0.008   0.231    0.08   0.046
2025.11   -0.09  -0.081  -0.035   0.012  -0.057   0.033   0.004  -0.037  -0.126  -0.042
2025.11  -0.103    0.16   0.277   0.257   0.383   0.075  -0.247   0.243   0.114   0.307
2025.11   -0.13  -0.018   0.227   0.236   0.125  -0.679  -0.226   0.002  -0.233  -0.626
2025.11   -0.17  -0.132   0.065   0.163   0.245  -0.223   0.421  -0.094    0.33  -0.171
2025.11   -0.19   -0.09   0.119   0.201   0.225   0.282  -0.324   0.033   0.335   0.434
2025.11  -0.426  -0.007   0.464   0.053   0.281   -0.37  -1.252   0.552   0.994   0.317
2025.11  -0.433   0.236  -0.361  -0.627   0.021    -0.2   0.423  -0.439  -0.704  -0.296
2025.11  -0.453   -0.13   0.164  -0.362   0.079   0.284   0.135   0.183   0.169  -0.214
2025.11  -0.595  -0.432   0.035  -0.244   0.003  -0.996  -0.398  -0.459  -0.014  -0.683
2025.11  -0.599  -0.222   -0.04  -0.072    0.03  -0.775  -0.463  -0.489   0.096  -0.581
2025.11  -1.003  -0.525   0.191  -0.368   0.047  -0.478  -0.297  -0.147   0.025  -0.684