Code on HumanEval (Accuracy)

96.34 HumanEval Accuracy

Qwen3.5-9B + AR-SFT

[Chart: HumanEval accuracy over time, May 26, 2025 to Jun 1, 2026]
Updated 20h ago

Evaluation Results

Date     Accuracy  Method  Links
2026.06  96.34             -
2026.06  95.12             -
2026.04  95.1              -
2026.04  94.5              -
2026.04  93.9              -
2026.04  93.9              -
2026.01  93.4              -
2026.06  93.29             -
2026.06  92.68             -
2026.01  92.1              -
2026.04  92.1              -
2026.06  92.07             -
2026.01  91.5              -
2026.01  91.5              -
2026.05  88.41             -
2026.05  88.41             -
2026.03  88.4              -
2026.04  87.8              -
2026.06  87.8              -
2026.05  86.59             -
2026.01  85.4              -
2026.01  85.4              -
2026.05  85.37             -
2026.03  85.3              -
2026.04  85.3              -
2026.04  84.15             -
2026.03  84.1              -
2026.03  83.5              -
2026.01  81.5              -
2026.04  81.1              -
2026.04  81.1              -
2026.05  81.1              -
2026.01  80.5              -
2026.05  80.49             -
2026.04  79.88             -
2026.04  79.8              -
2026.04  79.4              -
2026.01  79.2              -
2026.01  79.19             -
2026.01  78.7              -
2026.04  78.05             -
2026.01  77.85             -
2026.05  77.44             -
2026.05  77.44             -
2026.01  77.18             -
2026.04  76.3              -
2026.01  76.2              -
2026.01  75.4              -
2026.01  75                -
2026.01  73                -
2026.01  72.8              -
2026.01  72.6              -
2026.01  72.3              -
2026.01  72.2              -
2026.01  72.1              -
2026.01  70.9              -
2026.01  70.8              -
2026.01  70.7              -
2026.04  70.1              -
2026.01  69.9              -
2026.01  68.9              -
2026.03  68.9              -
2026.03  68.3              -
2026.03  67.1              -
2026.03  67.1              -
2026.06  67.07             -
2026.01  66.8              -
2026.01  65.2              -
2026.06  64.02             -
2026.05  62.2              -
2026.05  60.37             -
2026.01  59.06             -
2026.01  57.72             -
2026.01  54.9              -
2026.02  54.26             2,012
2026.02  53.83             2,282
2026.01  53                -
2025.05  53                -
2025.05  51.83             -
2025.05  50.61             -
2025.05  50                -
2026.03  49.4              -
2026.03  48.2              -
2026.03  48.2              -
2026.06  48.17             -
2025.05  46.95             -
2026.02  46.55             1,802
2026.04  46.34             -
2025.05  45.73             -
2026.01  45.64             -
2026.02  43.42             2,042
2025.05  41.46             -
2025.05  41.46             -
2026.05  38.11             -
2026.01  37.8              -
2026.05  37.65             -
2026.05  37.5              -
2026.05  37.2              -
2026.05  37.2              -
2026.03  36.6              -
Showing 100 of 118 rows