
Reward Modeling on UltraFeedback (test)

0.145 MAE

SelectiveRM

[Chart: x-axis Mar 19, 2026 to May 7, 2026; y-axis ticks 0.136544, 0.193622, 0.2507, 0.307778]
Updated 26d ago

Evaluation Results

Date      MAE
2026.05   0.145    0.107   0.461    -
2026.03   0.1679   -       0.4752   0.3316
2026.03   0.1723   -       0.4309   0.3453
2026.05   0.181    0.115   0.418    -
2026.05   0.181    0.114   0.424    -
2026.03   0.1961   -       0.5207   0.3169
2026.03   0.2021   -       0.4668   0.3342
2026.03   0.2021   -       0.4403   0.3424
2026.03   0.208    0.111   0.47     -
2026.03   0.216    -       0.2305   0.4015
2026.03   0.225    0.106   0.495    -
2026.03   0.2263   -       0.4559   0.3376
2026.03   0.228    0.106   0.496    -
2026.03   0.23     -       0.4389   0.3429
2026.05   0.239    0.109   0.451    -
2026.05   0.241    0.112   0.436    -
2026.03   0.2452   -       0.4141   0.3504
2026.03   0.2488   -       0.4652   0.3347
2026.05   0.267    0.117   0.409    -
2026.05   0.267    0.108   0.453    -
2026.03   0.2713   -       0.1476   0.4103
2026.03   0.272    0.113   0.461    -
2026.05   0.274    0.109   0.449    -
2026.05   0.276    0.108   0.456    -
2026.05   0.283    0.112   0.435    -
2026.05   0.284    0.11    0.446    -
2026.05   0.29     0.111   0.441    -
2026.05   0.293    0.116   0.416    -
2026.05   0.295    0.113   0.428    -
2026.05   0.295    0.113   0.431    -
2026.03   0.297    0.125   0.405    -
2026.05   0.301    0.114   0.422    -
2026.03   0.3049   -       0.3646   0.3648
2026.05   0.315    0.12    0.395    -
2026.03   0.3212   -       0.3839   0.3593
2026.03   0.3492   -       0.1459   0.423
2026.03   0.3534   -       0.3201   0.3774
2026.03   0.3564   -       0.2525   0.3957