
Reward Modeling on PKU-SafeRLHF (test)

0.074 MAE

SelectiveRM

[Chart: values over time, Mar 19, 2026 to May 7, 2026; y-axis ticks from 0.0645 to 0.256875]
Updated 26d ago

Evaluation Results

(Rows are sorted by MAE, lowest first. Method names, links, and the headers of the last three columns did not survive extraction; "-" marks a value that is not reported.)

Date      MAE      Col. 2   Col. 3   Col. 4
2026.05   0.074    0.057    0.772    -
2026.05   0.078    0.067    0.731    -
2026.03   0.0871   -        0.7104   0.2681
2026.05   0.1      0.073    0.707    -
2026.03   0.1053   -        0.7872   0.2294
2026.05   0.113    0.083    0.668    -
2026.03   0.119    0.057    0.77     -
2026.03   0.1251   -        0.628    0.3039
2026.03   0.1279   -        0.6471   0.296
2026.03   0.1282   -        0.6742   0.2844
2026.03   0.129    0.055    0.779    -
2026.03   0.129    -        0.6043   0.3134
2026.03   0.1422   -        0.7346   0.2567
2026.03   0.1423   -        0.6389   0.2994
2026.03   0.1465   -        0.754    0.2472
2026.05   0.155    0.065    0.738    -
2026.03   0.171    0.07     0.72     -
2026.03   0.1771   -        0.6948   0.2753
2026.03   0.1871   -        0.5011   0.352
2026.03   0.1899   -        0.7093   0.2687
2026.05   0.19     0.062    0.752    -
2026.03   0.1959   -        0.6876   0.2785
2026.05   0.199    0.064    0.741    -
2026.05   0.205    0.067    0.731    -
2026.05   0.216    0.067    0.729    -
2026.05   0.223    0.072    0.712    -
2026.05   0.236    0.074    0.703    -
2026.05   0.24     0.074    0.703    -
2026.03   0.2423   -        0.5914   0.2718
2026.05   0.243    0.081    0.673    -
2026.05   0.245    0.082    0.671    -
2026.05   0.253    0.079    0.683    -
2026.05   0.255    0.079    0.682    -
2026.05   0.273    0.089    0.642    -
2026.03   0.2887   -        0.5535   0.333
2026.03   0.3115   -        0.5228   0.3442
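
The headline metric on this page is MAE (mean absolute error), where lower is better. The page does not state how the benchmark pairs predictions with targets, so the following is only a minimal sketch of the standard definition. The function name `mean_absolute_error` and the sample values are hypothetical and are not benchmark data.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Average absolute difference between paired predicted and reference values."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if len(predicted) == 0:
        raise ValueError("need at least one pair of values")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical scores for illustration only (assumed, not taken from the leaderboard).
example_predictions = [0.9, 0.2, 0.6, 0.4]
example_targets = [1.0, 0.0, 1.0, 0.0]

if __name__ == "__main__":
    print(mean_absolute_error(example_predictions, example_targets))
```

Under this definition, an MAE of 0.074 would mean predictions differ from their targets by 0.074 on average, in whatever units the targets use.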