Reasoning on Tau2
[Chart: Accuracy over time. Top result: Qwen-3.5 27B, 93.6 accuracy. Apr 21, 2026.]
Evaluation Results
Method                       Details                    Date     Accuracy
Qwen-3.5 27B                 Speedup @32k=0.55x, Sp...  2026.04  93.6
Apriel-1.6                   Speedup @32k=1.0x, Spe...  2026.04  62.6
all-attention                Speedup @32k=1.0x, Spe...  2026.04  56.7
Idealized|All–18             Speedup @32k=1.99x, Sp...  2026.04  52.6
Reg|Lklhd–18                 Speedup @32k=4.76x, Sp...  2026.04  46.2
Idealized|Lklhd–6            Speedup @32k=6.2x, Spe...  2026.04  40.4
Idealized|All–6              Speedup @32k=6.13x, Sp...  2026.04  34.2
Nemotron-3-Nano 30B          Speedup @32k=4.09x, Sp...  2026.04  31.3
Reg|Lklhd–26                 Speedup @32k=2.85x, Sp...  2026.04  30.7
Reg|Lklhd–13                 Speedup @32k=6.9x, Spe...  2026.04  28.6
Reg|Lklhd–10                 Speedup @32k=10.69x, S...  2026.04  23.4
S1: Distil. Idealized|All–6  Speedup @32k=6.13x, Sp...  2026.04  12.3
Falcon-H1R 7B                Speedup @32k=4.61x, Sp...  2026.04  11.1
OLMo-Hybrid-Think 7B         Speedup @32k=2.51x, Sp...  2026.04  9.7
Nemotron-Nano 12B v2         Speedup @32k=5.85x, Sp...  2026.04  9.1
Apriel-H1 15B                Speedup @32k=1.97x, Sp...  2026.04  4.7