Word Puzzle Solving on Crosswords Word
[Chart: Accuracy over time; selectable metrics: Accuracy, Avg Input Tokens. Highlighted point: CoT, 13.5 accuracy, Mar 6, 2026.]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | Accuracy | Avg Input Tokens |
|---|---|---|---|---|
| CoT | Backbone=Qwen3-30B | 2026.03 | 13.5 | 182 |
| RouteGoT | Model Pool={Qwen3-4B,... | 2026.03 | 11 | 4,222 |
| GoT* | Backbone=Qwen3-30B | 2026.03 | 6.5 | 13,597 |
| IO | Backbone=Qwen3-30B | 2026.03 | 6 | 179 |
| EmbedLLM | Model Pool={Qwen3-4B,... | 2026.03 | 5 | 4,846 |
| AGoT | Backbone=Qwen3-30B | 2026.03 | 3.5 | 9,522 |
| RouteLLM | Model Pool={Qwen3-4B,... | 2026.03 | 2.5 | 5,110 |
| ToT | Backbone=Qwen3-30B | 2026.03 | 2.1 | 790 |
| RTR | Model Pool={Qwen3-4B,... | 2026.03 | 1.5 | 3,777 |
| Random | Model Pool={Qwen3-4B,... | 2026.03 | 1 | 8,767 |
| KNN | Model Pool={Qwen3-4B,... | 2026.03 | 0.5 | 5,044 |