Theorem Question Answering on TheoremQA (test)
Top result: DS-R1-Distill-Qwen-14B + Meta-Reasoner, 87.4 accuracy (Feb 27, 2025)
[Chart: Accuracy over time]
Updated 26d ago
Evaluation Results
| Method | Configuration | Date | Accuracy |
|---|---|---|---|
| DS-R1-Distill-Qwen-14B + Meta-Reasoner | Backbone=DS-R1-Distill... | 2025.02 | 87.4 |
| Gemini-Exp-1206 + Meta-Reasoner | Backbone=Gemini-Exp-12... | 2025.02 | 86.32 |
| GPT-4o-mini + Meta-Reasoner | Backbone=GPT-4o-mini,... | 2025.02 | 84.13 |
| Qwen3-8B + Meta-Reasoner | Backbone=Qwen3-8B, Rea... | 2025.02 | 82.93 |
| GPT-4 Turbo + MACM | Backbone=GPT-4 Turbo,... | 2025.02 | 79.41 |
| GPT-4o-mini + Reflexion | Backbone=GPT-4o-mini,... | 2025.02 | 74.32 |
| Gemini-Exp-1206 + CoT | Backbone=Gemini-Exp-12... | 2025.02 | 43.12 |
| GPT-4o-mini + CoT | Backbone=GPT-4o-mini,... | 2025.02 | 39.46 |