Scientific Reasoning on TheoremQA (test)
Best result: GPT-4-Turbo-0409, 48.4 Accuracy
[Chart: Accuracy over time (y-axis: Accuracy; x-axis date shown: May 6, 2024)]
Updated 4d ago
Evaluation Results
Method              Details                    Date     Accuracy
GPT-4-Turbo-0409    Few-shot CoT=5-shot        2024.05  48.4
Qwen-1.5-110B       Parameter Size=110B, F...  2024.05  34.9
MAmmoTH2-8x7B-Plus  Parameter Size=8x7B, T...  2024.05  34.1
MAmmoTH2-8B-Plus    Parameter Size=8B, Tra...  2024.05  32.5
MAmmoTH2-8x7B       Parameter Size=8x7B, T...  2024.05  32.2
MAmmoTH2-8B         Parameter Size=8B, Tra...  2024.05  32.2
MAmmoTH2-34B        Parameter Size=34B, Tr...  2024.05  30.4
MAmmoTH2-7B-Plus    Parameter Size=7B, Tra...  2024.05  29.2
MAmmoTH2-7B         Parameter Size=7B, Tra...  2024.05  29