STEM Theorem Question Answering on TheoremQA
[Chart: Acceptance Length by method. Highlighted point: TTS, 4.4, May 10, 2026. Metrics shown: Acceptance Length, Delta (%).]
Updated 22d ago
Evaluation Results
| Method  | Configuration             | Date    | Acceptance Length | Delta (%) |
|---------|---------------------------|---------|-------------------|-----------|
| TTS     | Target Model=Qwen/Qwen... | 2026.05 | 4.4               | 23.4      |
| TTS     | Target Model=Qwen/Qwen... | 2026.05 | 4.3               | 22.8      |
| TTS     | Target Model=Qwen/Qwen... | 2026.05 | 4.2               | 25.1      |
| DFlash  | Target Model=Qwen/Qwen... | 2026.05 | 3.6               | -         |
| DFlash  | Target Model=Qwen/Qwen... | 2026.05 | 3.5               | -         |
| DFlash  | Target Model=Qwen/Qwen... | 2026.05 | 3.3               | -         |
| TTS     | Model=Llama3.1-8B         | 2026.05 | 1.9               | 56.9      |
| TTS     | Model=Qwen/Qwen3-8B       | 2026.05 | 1.9               | 16.8      |
| TTS     | Model=Qwen/Qwen3-32B      | 2026.05 | 1.8               | 21.9      |
| EAGLE-3 | Model=Qwen/Qwen3-8B       | 2026.05 | 1.6               | -         |
| EAGLE-3 | Model=Qwen/Qwen3-32B      | 2026.05 | 1.5               | -         |
| EAGLE-3 | Model=Llama3.1-8B         | 2026.05 | 1.2               | -         |