Code Reasoning on CRUX
[Chart: Accuracy over time. Top result: RMoA, Accuracy 87.37 (May 30, 2025).]
Evaluation Results
Method                Model                Date     Accuracy
RMoA                  GPT-4o               2025.05  87.37
SMoA                  GPT-4o               2025.05  86.93
MoA                   GPT-4o               2025.05  86.66
GPT-4o                GPT-4o               2025.05  75.8
RMoA                  Qwen2.5-7B-Instruct  2025.05  61
SMoA                  Qwen2.5-7B-Instruct  2025.05  59.93
Qwen2.5-7B-Instruct   Qwen2.5-7B-Instruct  2025.05  57.31
MoA                   Qwen2.5-7B-Instruct  2025.05  56.81
MoA                   Gemma2-9B-Instruct   2025.05  51.5
SMoA                  Gemma2-9B-Instruct   2025.05  51.25
RMoA                  Gemma2-9B-Instruct   2025.05  50.5
Gemma2-9B-Instruct    Gemma2-9B-Instruct   2025.05  47.5
MoA                   Llama3.1-8B-Inst...  2025.05  46.12
SMoA                  Llama3.1-8B-Inst...  2025.05  44.81
RMoA                  Llama3.1-8B-Inst...  2025.05  42.65
Llama3.1-8B-Instruct  Llama3.1-8B-Inst...  2025.05  40.62