Reward Modeling on RM-Bench (test)

Overall Score: 96

Skywork-Reward-V2-Llama-3.1-8B-40M

[Chart: Overall Score over time, Jul 2, 2025 to Feb 4, 2026]
Updated 1mo ago

Evaluation Results

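| Method | Date | Overall Score | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Skywork-Reward-V2-Llama-3.1-8B-40M | | 96 | - | - | - | - | - | - | - | - |
| | | 92.8 | - | - | - | - | - | - | - | - |
| | 2026.02 | 87.1 | 74.9 | 95.5 | 84.4 | 93.6 | 84.6 | - | - | - |
| | 2026.02 | 86.7 | 75.7 | 95.7 | 82.2 | 93.2 | 82.9 | - | - | - |
| | 2026.02 | 86.2 | 80.4 | 92 | 77 | 95.5 | 83.8 | - | - | - |
| | 2026.02 | 85.7 | 76 | 89.8 | 80.6 | 96.2 | 83.4 | - | - | - |
| | 2026.02 | 84.9 | 68.7 | 95.9 | 81.3 | 93.6 | 80.3 | - | - | - |
| | 2026.02 | 83.9 | 74.2 | 91.8 | 74.1 | 95.4 | 81.4 | - | - | - |
| | 2025.07 | 83.9 | - | - | - | - | - | - | - | - |
| | 2026.02 | 83.6 | 72.3 | 92.6 | 77.8 | 91.7 | 76.8 | - | - | - |
| | 2026.01 | 83.2 | - | - | - | - | - | - | - | - |
| | 2026.02 | 83.1 | 76.7 | 90.3 | 75.2 | 90.2 | 72.9 | - | - | - |
| | 2026.01 | 82.8 | - | - | - | - | - | - | - | - |
| | 2026.02 | 82.7 | 73.7 | 91.4 | 75 | 90.6 | 80 | - | - | - |
| | 2026.01 | 82.7 | - | - | - | - | - | - | - | - |
| | 2025.07 | 82.7 | - | - | - | - | - | - | - | - |
| | | 82.6 | - | - | - | - | - | - | - | - |
| | 2025.07 | 82.1 | - | - | - | - | - | - | - | - |
| | 2026.01 | 81.9 | - | - | - | - | - | - | - | - |
| | | 81.6 | - | - | - | - | - | - | - | - |
| | 2026.02 | 81.5 | 71.8 | 90.5 | 69.5 | 94.1 | 79.8 | - | - | - |
| | | 81.1 | - | - | - | - | - | - | - | - |
| | 2025.07 | 80 | - | - | - | - | - | - | - | - |
| | 2026.01 | 79.8 | - | - | - | - | - | - | - | - |
| | 2026.01 | 79.8 | - | - | - | - | - | - | - | - |
| | 2026.01 | 79.1 | - | - | - | - | - | - | - | - |
| | 2025.07 | 79.1 | - | - | - | - | - | - | - | - |
| | | 78.7 | - | - | - | - | - | - | - | - |
| | 2026.01 | 76.7 | - | - | - | - | - | - | - | - |
| | 2026.01 | 76.3 | - | - | - | - | - | - | - | - |
| | | 76.3 | - | - | - | - | - | - | - | - |
| | 2025.07 | 75.4 | - | - | - | - | - | - | - | - |
| | 2025.07 | 74.5 | - | - | - | - | - | - | - | - |
| | | 74.4 | - | - | - | - | - | - | - | - |
| | 2026.01 | 73.4 | - | - | - | - | - | - | - | - |
| | 2025.07 | 73.4 | - | - | - | - | - | - | - | - |
| | 2026.01 | 73.2 | - | - | - | - | - | - | - | - |
| | 2026.02 | 73.1 | 66.6 | 81.4 | 65.2 | 79.4 | 74.4 | - | - | - |
| | 2025.07 | 73.1 | - | - | - | - | - | - | - | - |
| | 2026.02 | 72.8 | 76.7 | 74.3 | 51 | 89.2 | 75.8 | - | - | - |
| | 2026.01 | 72.8 | - | - | - | - | - | - | - | - |
| | 2026.02 | 72.5 | 67.2 | 67.5 | 63.6 | 91.7 | 66.2 | - | - | - |
| | 2026.01 | 72.5 | - | - | - | - | - | - | - | - |
| | 2026.01 | 72.3 | - | - | - | - | - | - | - | - |
| | 2025.07 | 72.2 | - | - | - | - | - | - | - | - |
| | 2025.07 | 72.1 | - | - | - | - | - | - | - | - |
| | 2026.01 | 71.9 | - | - | - | - | - | - | - | - |
| | 2026.01 | 71.5 | - | - | - | - | - | - | - | - |
| | 2025.07 | 71.3 | - | - | - | - | - | - | - | - |
| | 2025.07 | 71 | - | - | - | - | - | - | - | - |
| | 2026.01 | 70.8 | - | - | - | - | - | - | - | - |
| | 2026.01 | 70.8 | - | - | - | - | - | - | - | - |
| | 2026.02 | 70.5 | 71.8 | 59.2 | 56.6 | 94.3 | 67.8 | - | - | - |
| | 2026.01 | 70.2 | - | - | - | - | - | - | - | - |
| | 2026.02 | 70.1 | 69.5 | 60.6 | 54.5 | 95.7 | 66.3 | - | - | - |
| | 2026.01 | 69.5 | - | - | - | - | - | - | - | - |
| | 2025.07 | 69.2 | - | - | - | - | - | - | - | - |
| | 2026.01 | 68 | - | - | - | - | - | - | - | - |
| | 2025.07 | 67.6 | - | - | - | - | - | - | - | - |
| | 2025.07 | 64.7 | - | - | - | - | - | - | - | - |
| | 2026.02 | 61 | 62.5 | 62.6 | 54.4 | 64.4 | 62.9 | - | - | - |
| | 2026.01 | 58.7 | - | - | - | - | - | - | - | - |
| | 2026.01 | 54.4 | - | - | - | - | - | - | - | - |
| | 2026.02 | - | 66.2 | 56.3 | 52.1 | 88.5 | 65.7 | 36.3 | 70.1 | 90.8 |
| | 2026.02 | - | 63.6 | 55.2 | 49.7 | 79.8 | 62.1 | 38.9 | 64 | 83.2 |
| | 2026.02 | - | 57.1 | 55 | 50.5 | 72.7 | 58.8 | 22.5 | 63.8 | 90.2 |
| | 2026.02 | - | 63.8 | 55 | 50.8 | 90.7 | 65.1 | 48.7 | 66.8 | 79.8 |
| | 2026.02 | - | 57.4 | 54.4 | 52.3 | 82.3 | 61.6 | 39.7 | 63.1 | 81.9 |
| | 2026.02 | - | 68.8 | 56.6 | 51.9 | 86 | 65.8 | 42.1 | 68.1 | 87.4 |
| | 2026.02 | - | 63.4 | 53.9 | 49.6 | 76 | 60.7 | 31.7 | 64.3 | 86.2 |
| | 2026.02 | - | 61.3 | 54.8 | 52 | 83.7 | 63 | 43.9 | 65.4 | 79.5 |
| | 2026.02 | - | 52.5 | 50.1 | 49.1 | 62.3 | 53.5 | 26.5 | 53.7 | 80.3 |
| | 2026.02 | - | 65.4 | 53.8 | 54 | 85.8 | 64.7 | 40.6 | 67.3 | 86.4 |
| | 2026.02 | - | 58.4 | 53.9 | 48.3 | 71.6 | 58 | 28 | 61.7 | 84.5 |
| | 2026.02 | - | 61.9 | 51.9 | 49.3 | 77.5 | 60.1 | 31.5 | 63 | 85.9 |
| | 2026.02 | - | 61.5 | 51.9 | 51.1 | 85.6 | 62.5 | 39.2 | 64.2 | 84.1 |
| | 2026.02 | - | 62.1 | 53.4 | 50.1 | 76.2 | 60.5 | 32.7 | 63.7 | 85 |
| | 2026.02 | - | 63.5 | 53.1 | 50.4 | 84 | 62.7 | 34.9 | 65.8 | 87.5 |
| | 2026.02 | - | 66.1 | 53.3 | 51.9 | 83.5 | 63.7 | 39.9 | 66.7 | 84.6 |
| | 2026.02 | - | 60.4 | 53 | 50.9 | 65 | 57.3 | 23 | 61.7 | 87.3 |
| | 2026.02 | - | 62.8 | 53.4 | 49.8 | 79.6 | 61.4 | 32.9 | 63.8 | 87.5 |
| | 2026.02 | - | 71.7 | 58.7 | 53.2 | 76.5 | 65 | 39 | 68.1 | 88 |
| | 2026.02 | - | 74.2 | 59.2 | 53.4 | 82.3 | 67.3 | 47.6 | 69.7 | 84.4 |
| | 2026.02 | - | 70.8 | 57.8 | 53.6 | 86.2 | 67.1 | 42.1 | 69.8 | 89.4 |
| | 2026.02 | - | 69.5 | 58.6 | 54.1 | 88.1 | 67.6 | 41.3 | 71.3 | 80.2 |
| | 2026.02 | - | 68.4 | 58.7 | 56.9 | 81.5 | 66.6 | 53.9 | 68.3 | 77.6 |
| | 2026.02 | - | 66.3 | 57.3 | 54 | 81.5 | 64.8 | 52.6 | 66.3 | 75.5 |
| | 2026.02 | - | 70 | 55.5 | 50.6 | 71.8 | 62 | 41.4 | 63.6 | 80.9 |
| | 2026.02 | - | 69.2 | 55.1 | 50.1 | 59.3 | 58.4 | 36.1 | 59.7 | 79.4 |
| | 2026.02 | - | 64.3 | 55.6 | 49.4 | 63.8 | 58.3 | 47.4 | 58.9 | 68.6 |
| | 2026.02 | - | 63.8 | 53.5 | 50.4 | 72.5 | 60.1 | 29.4 | 63.1 | 87.8 |
| | 2026.02 | - | 62.2 | 58 | 52.2 | 70.6 | 60.8 | 39.2 | 63.3 | 79.8 |
| | 2026.02 | - | 62.1 | 56 | 52.3 | 55.5 | 56.5 | 25.5 | 59.9 | 84.1 |
| | 2026.02 | - | 66.8 | 55.9 | 53.2 | 67.8 | 60.9 | 38.6 | 62.9 | 81.3 |
| | 2026.02 | - | 69.4 | 55.1 | 49.2 | 69.3 | 60.7 | 35.5 | 63.1 | 83.7 |
| | 2026.02 | - | 65.3 | 56.4 | 52.5 | 60.5 | 58.7 | 31.8 | 61.8 | 82.5 |
| | 2026.02 | - | 68.4 | 54.8 | 50.2 | 72.7 | 61.5 | 35.7 | 64 | 84.7 |
| | 2026.02 | - | 64.3 | 55.4 | 33.6 | 51.9 | 56.3 | 30.3 | 56.9 | 81.7 |
| | 2026.02 | - | 62.8 | 55.2 | 52.1 | 51.3 | 55.3 | 29.7 | 57 | 79.4 |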