Algebraic Reasoning on AQuA (Performance %)
[Chart: Performance (%); highlighted point: ATOM, 66.36, May 25, 2026]
Updated 7d ago
Evaluation Results
Method         Backbone Model   Date      Performance (%)
ATOM           Meta-Ll...       2026.05   66.36
LLM-Debate     Meta-Ll...       2026.05   65.89
Random         Meta-Ll...       2026.05   65.42
AgentPrune     Meta-Ll...       2026.05   65.42
Star           Meta-Ll...       2026.05   64.95
G-Designer     Meta-Ll...       2026.05   64.49
ARG-Designer   Meta-Ll...       2026.05   63.55
Chain          Meta-Ll...       2026.05   62.15
Complete       Meta-Ll...       2026.05   62.15
AgentDropout   Meta-Ll...       2026.05   61.68
Vanilla        Meta-Ll...       2026.05   57.01
CoT            Meta-Ll...       2026.05   54.21