Arithmetic Reasoning on GSM Reversed
[Chart: Accuracy over time. Best result: GPT-4o, 90.3 accuracy (Dec 2, 2025).]
Evaluation Results
Model          Prompting      Date      Accuracy
GPT-4o         CoT            2025.12   90.3
GPT-4o         decl. PL-1S    2025.12   87.8
GPT-4o         PY-ZS          2025.12   87.6
GPT-4o         decl. PY-1S    2025.12   87.5
GPT-4o         PL-ZS          2025.12   76.2
CodeLlama13B   decl. PL-1S    2025.12   34.1
CodeLlama13B   decl. PY-1S    2025.12   29.1