Coding on CRUXEval
[Chart: Pass@1 over time, Dec 31, 2025 to May 23, 2026. Top score: 88.5 (JT-Safe-V2-35B)]
Updated 8d ago
Evaluation Results
Method                            Details                     Date      Pass@1
JT-Safe-V2-35B                    Parameters=35B              2026.05   88.5
SOTA with Equivalent Parameters   Model comparison=Equiv...   2026.05   75.7
Youtu-LLM                         Size=2B, Type=Base, Pr...   2025.12   55.9
Qwen3                             Size=4B, Type=Base, Pr...   2025.12   54.8
Llama3.1                          Size=8B, Type=Base, Pr...   2025.12   42.3
SmolLM3                           Size=3B, Type=Base, Pr...   2025.12   42.1
Qwen3                             Size=1.7B, Type=Base,...    2025.12   40.6
Gemma3                            Size=4B, Type=Base, Pr...   2025.12   39.7