Coding on TheoremQA
[Chart: Accuracy over time. Leading entry: InfiGFusion, 55.38 accuracy (May 20, 2025). Updated 9d ago.]
Evaluation Results
Method            Details                    Date     Accuracy
InfiGFusion       Model Size=14B, GPU Ho...  2025.05  55.38
InfiFusion        Model Size=14B, GPU Ho...  2025.05  54.62
Pivot-SFT         Model Size=14B, GPU Ho...  2025.05  54.5
FuseLLM           Model Size=14B, GPU Ho...  2025.05  53.52
FuseChat          Model Size=14B, GPU Ho...  2025.05  51.88
Phi-4             Model Size=14B, GPU Ho...  2025.05  51.12
Mistral-Small     Model Size=24B, GPU Ho...  2025.05  48.5
Qwen2.5-Instruct  Model Size=14B, GPU Ho...  2025.05  47.25
MiniLogit         Model Size=14B, GPU Ho...  2025.05  46.36
Qwen2.5-Coder     Model Size=14B, GPU Ho...  2025.05  38.88