Coding on HumanEval (Score)
Figure: HumanEval score over time, Mar 6, 2025 to May 23, 2026. Top score: 96.3 (JT-Safe-V2-35B).
Updated 8d ago
Evaluation Results
| Method | Details | Date | Score |
| --- | --- | --- | --- |
| JT-Safe-V2-35B | Parameters=35B | 2026.05 | 96.3 |
| SOTA with Equivalent Parameters | Model comparison=Equiv... | 2026.05 | 94.5 |
| TinyR1-32B-Preview | | 2025.03 | 86.4 |
| Math Expert | | 2025.03 | 85.5 |
| Science Expert | | 2025.03 | 85.5 |
| Code Expert | | 2025.03 | 85.5 |
| Base Model | | 2025.03 | 83.5 |
| SkillWeave | #Params=1.42B, Backbon... | 2026.05 | 42.5 |
| FuseChat3.0 | #Params=1.48B, Backbon... | 2026.05 | 41.4 |
| Twin-merging | #Params=1.48B, Backbon... | 2026.05 | 39.6 |
| PEFT | #Params=1.42B, Backbon... | 2026.05 | 39.1 |
| Llama3.2-1B-Instruct | #Params=1.15B, Backbon... | 2026.05 | 38.9 |
| self-rewarding | #Params=1.15B, Backbon... | 2026.05 | 38 |