Web-based Reasoning on WebShop
[Chart: Average Reasoning Length (tokens). Plotted entry: Teacher (LLaMA-13B), 34.8, May 20, 2025.]
Evaluation Results

| Method | Distillation Method | Date | Average Reasoning Length (tokens) |
|---|---|---|---|
| Teacher (LLaMA-13B) | No... | 2025.05 | 34.8 |
| Ours (LLaMA-7B) | St... | 2025.05 | 34.9 |
| Ours (OPT-6.7B) | St... | 2025.05 | 35.1 |
| Teacher (OPT-13B) | No... | 2025.05 | 35.9 |
| Ours (OPT-2.7B) | St... | 2025.05 | 36.2 |
| Token-level-OPT-6.7B | To... | 2025.05 | 37.2 |
| Token-level-7B | To... | 2025.05 | 37.5 |
| SeqKD (7B) | Se... | 2025.05 | 37.8 |
| Ours (OPT-1.3B) | St... | 2025.05 | 38 |
| SeqKD (6.7B) | Se... | 2025.05 | 38.2 |
| KD (7B) | KD... | 2025.05 | 38.9 |
| Token-level-2.7B | To... | 2025.05 | 39 |
| KD (6.7B) | KD... | 2025.05 | 39.1 |
| SeqKD (2.7B) | Se... | 2025.05 | 39.8 |
| KD (2.7B) | KD... | 2025.05 | 40.7 |
| Token-level-1.3B | To... | 2025.05 | 41.8 |
| SeqKD (1.3B) | Se... | 2025.05 | 42.9 |
| KD (1.3B) | KD... | 2025.05 | 44.1 |