Code Reasoning on CruxEval Output
[Chart: Score by date. Highlighted point: DataFlow-Code-10K, Score 51, Dec 18, 2025]
Updated 4d ago
Evaluation Results
| Method | Base Model | Date | Score |
|---|---|---|---|
| DataFlow-Code-10K | Qwen2.5-14B... | 2025.12 | 51 |
| DataFlow-Code-1K | Qwen2.5-14B... | 2025.12 | 50.9 |
| DataFlow-Code-5K | Qwen2.5-14B... | 2025.12 | 50.6 |
| Self-OSS | Qwen2.5-14B... | 2025.12 | 50.1 |
| Code Alpaca-1K | Qwen2.5-14B... | 2025.12 | 49.6 |
| Qwen2.5-14B-Instruct | Qwen2.5-14B... | 2025.12 | 48.5 |
| Code Alpaca-1K | Qwen2.5-7B-... | 2025.12 | 46.4 |
| Self-OSS | Qwen2.5-7B-... | 2025.12 | 45.9 |
| DataFlow-Code-10K | Qwen2.5-7B-... | 2025.12 | 45.4 |
| DataFlow-Code-1K | Qwen2.5-7B-... | 2025.12 | 45.1 |
| DataFlow-Code-5K | Qwen2.5-7B-... | 2025.12 | 45 |
| Qwen2.5-7B-Instruct | Qwen2.5-7B-... | 2025.12 | 43.9 |