Mathematical Reasoning on FOLIO to GSM8K
[Chart: Accuracy. Top result: DIN-ICL, 95.1 (Apr 7, 2026)]
Updated 10d ago
Evaluation Results
| Method | Method details | Links | Accuracy |
|---|---|---|---|
| DIN-ICL | Model Series=Gemma-3,... | 2026.04 | 95.1 |
| DIN-ICL | Model Series=Qwen-3, P... | 2026.04 | 95 |
| SET-BSR | Model Series=Qwen-3, P... | 2026.04 | 94.6 |
| DIN-ICL | Model Series=Qwen-3, P... | 2026.04 | 94.6 |
| DIN-ICL | Model Series=Gemma-3,... | 2026.04 | 93.7 |
| DIN-ICL | Model Series=Qwen-2.5,... | 2026.04 | 93.4 |
| SET-BSR | Model Series=Qwen-3, P... | 2026.04 | 93.1 |
| SET-BSR | Model Series=Gemma-3,... | 2026.04 | 93.1 |
| SET-BSR | Model Series=Qwen-2.5,... | 2026.04 | 92.9 |
| SET-BSR | Model Series=Gemma-3,... | 2026.04 | 92.6 |
| DIN-ICL | Model Series=Qwen-2.5,... | 2026.04 | 92.1 |
| SET-BSR | Model Series=Qwen-2.5,... | 2026.04 | 91.6 |
| DIN-ICL | Model Series=Qwen-3, P... | 2026.04 | 91 |
| SET-BSR | Model Series=Qwen-3, P... | 2026.04 | 89.9 |
| DIN-ICL | Model Series=Qwen-2.5,... | 2026.04 | 89.7 |
| SET-BSR | Model Series=Qwen-2.5,... | 2026.04 | 89.6 |
| DIN-ICL | Model Series=LLaMA-3.1... | 2026.04 | 81.5 |
| SET-BSR | Model Series=LLaMA-3.1... | 2026.04 | 81.4 |