STEM Question Answering on TheoremQA
[Chart: Correctness over time. Top entry: MOCHA, 76.2 (May 19, 2026). Updated 14d ago.]
Evaluation Results
Method       Details                      Date      Correctness
MOCHA        Evaluation Backbone=Cl...    2026.05   76.2
ProTeGi      Evaluation Backbone=Cl...    2026.05   69
TextGrad     Evaluation Backbone=Cl...    2026.05   67.2
GEPA         Evaluation Backbone=Cl...    2026.05   65.6
Seed Skill   Evaluation Backbone=Cl...    2026.05   53.4