Customer Service on TauBench-Telephony (TBTel)
Top result: 90.5 accuracy (LLM-guided spec search)
[Chart: Accuracy and Delta (%) by date; latest point May 16, 2026]
Updated 15d ago
Evaluation Results
Method                  Configuration              Date     Accuracy  Delta (%)
----------------------  -------------------------  -------  --------  ---------
LLM-guided spec search  Student Model=Qwen3.5-...  2026.05  90.5      15.2
Qwen3.5-9B              Student Model=Qwen3.5-...  2026.05  75.3      -
LLM-guided spec search  Student Model=Gemma4-E...  2026.05  75        13.7
Gemma4-E4B              Student Model=Gemma4-E...  2026.05  61.3      -
LLM-guided spec search  Student Model=Nemotron...  2026.05  11.4      11.4
LLM-guided spec search  Student Model=Qwen3.5-...  2026.05  9.4       9.4
Nemotron-Nano-4B        Student Model=Nemotron...  2026.05  0         -
Qwen3.5-4B              Student Model=Qwen3.5-...  2026.05  0         -
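
The Delta (%) figures appear to be absolute gains in accuracy points over the corresponding baseline student model, not relative percentages. The page does not state this; it is inferred from the numbers, for example 90.5 - 75.3 = 15.2. A minimal sketch of that arithmetic follows. The pairing of each LLM-guided spec search row with a baseline is an assumption, since the student model names are truncated on the page, and the names `results` and `delta_points` are hypothetical.

```python
# Accuracy values copied from the results listed above.
# ASSUMPTION: each "LLM-guided spec search" row is paired with the baseline
# student model whose accuracy makes the reported delta add up; the page
# truncates the student model names, so this pairing is inferred.
results = {
    "Qwen3.5-9B": {"baseline": 75.3, "spec_search": 90.5},
    "Gemma4-E4B": {"baseline": 61.3, "spec_search": 75.0},
    "Nemotron-Nano-4B": {"baseline": 0.0, "spec_search": 11.4},
    "Qwen3.5-4B": {"baseline": 0.0, "spec_search": 9.4},
}


def delta_points(baseline: float, improved: float) -> float:
    """Absolute gain in accuracy points, rounded to one decimal place."""
    return round(improved - baseline, 1)


deltas = {
    model: delta_points(row["baseline"], row["spec_search"])
    for model, row in results.items()
}

for model, gain in deltas.items():
    print(f"{model}: +{gain} points")
```

Under this reading, the two 4B baselines score 0, so their delta equals the accuracy reached with LLM-guided spec search.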