Reward Modeling on RewardBench (test)

0.933 RWBench

J1-Llama-70B

Jan 28, 2026
Updated 1d ago

Evaluation Results

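Method | Links
2026.01 | 0.933 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.927 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.926 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.914 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.912 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.91 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.905 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.894 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.89 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.882 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.867 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.86 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.857 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.854 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.852 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.822 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.82 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.814 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.812 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.81 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.807 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.773 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.766 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.01 | 0.752 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
2026.04 | - | 89.94 | 89.34 | 92.79 | 90.68 | 92.2 | 89.88 | 94.33 | 85.57 | 94 | 100 | 88.96 | - | - | - | -
2026.04 | - | 88.14 | 90.99 | 89.47 | 92.59 | 92.77 | 89.49 | 93.62 | 87.63 | 92 | 100 | 89.13 | - | - | - | -
2026.04 | - | 90.62 | 88.35 | 93.68 | 87.5 | 91.04 | 90.66 | 93.62 | 85.57 | 94 | 100 | 88.71 | - | - | - | -
2026.04 | - | 90.17 | 89.42 | 93.69 | 89.61 | 91.62 | 90.27 | 94.33 | 85.57 | 94 | 100 | 89.13 | - | - | - | -
2026.04 | - | 95.14 | 77.93 | 93.24 | 75.63 | 79.48 | 82.49 | 90.78 | 71.13 | 94 | 66.67 | 86.68 | - | - | - | -
2026.04 | - | 93.45 | 85.12 | 91.58 | 86.57 | 86.71 | 85.99 | 91.49 | 77.32 | 92 | 66.67 | 89.67 | - | - | - | -
2026.04 | - | 93.45 | 85.95 | 91.58 | 87.04 | 87.57 | 86.38 | 91.49 | 78.35 | 92 | 66.67 | 90.08 | - | - | - | -
2026.04 | - | 95.25 | 78.02 | 93.69 | 74.91 | 79.48 | 83.27 | 91.49 | 72.16 | 94 | 66.67 | 86.8 | - | - | - | -
2026.04 | - | 96.72 | 95.7 | 97.75 | 93.91 | 97.4 | 95.72 | 97.87 | 90.72 | 98 | 66.67 | 96.6 | - | - | - | -
2026.05 | - | - | - | - | - | - | - | - | - | - | - | - | 0.0252 | 0.0219 | 0.33 | 0.13
2026.05 | - | - | - | - | - | - | - | - | - | - | - | - | 0.0249 | 0.0225 | 0.25 | 0.12
2026.05 | - | - | - | - | - | - | - | - | - | - | - | - | 0.0251 | 0.022 | 0.32 | 0.17
2026.05 | - | - | - | - | - | - | - | - | - | - | - | - | 0.0243 | 0.0217 | 0.26 | 0.15
2026.05 | - | - | - | - | - | - | - | - | - | - | - | - | 0.024 | 0.0227 | 0.14 | -0.01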