Arithmetic Reasoning on SVAMP (Accuracy & Latency)
[Chart: Accuracy and Latency (s) over time. Highlighted point: CPO, Accuracy 69.3, Jun 13, 2024.]
Updated 4d ago
Evaluation Results
Method  Backbone    Date     Accuracy  Latency (s)
CPO     Mistral-7B  2024.06  69.3      44.9
ToT     Mistral-7B  2024.06  66        4,623.7
CoT     Mistral-7B  2024.06  65.3      41.4
TS-SFT  Mistral-7B  2024.06  59        41.3
CPO     LLAMA2-13B  2024.06  50        48.1
CPO     LLAMA2-7B   2024.06  46        32.1
ToT     LLAMA2-13B  2024.06  45.7      2,115.3
TS-SFT  LLAMA2-13B  2024.06  44.6      46.4
TS-SFT  LLAMA2-7B   2024.06  43.1      30.2
ToT     LLAMA2-7B   2024.06  42.7      1,861.1
CoT     LLAMA2-13B  2024.06  40.3      46.2
CoT     LLAMA2-7B   2024.06  37.7      33.3