Embodied AI Reasoning on ALFWorld
[Chart: CoT Match Rate over time. Highlighted point: OPT-13B, 100, May 20, 2025. Updated 1mo ago.]
Evaluation Results
| Method | Details | Date | CoT Match Rate |
|---|---|---|---|
| OPT-13B | Role=Teacher | 2025.05 | 100 |
| LLaMA-13B | Role=Teacher | 2025.05 | 100 |
| Structured Agent Distillation | Teacher Model=LLaMA-13... | 2025.05 | 77.2 |
| Structured Agent Distillation | Teacher Model=OPT-13B,... | 2025.05 | 76.4 |
| Token-level | Teacher Model=LLaMA-13... | 2025.05 | 73 |
| Token-level | Teacher Model=OPT-13B,... | 2025.05 | 72.2 |
| Structured Agent Distillation | Teacher Model=OPT-13B,... | 2025.05 | 71.6 |
| SeqKD | Teacher Model=LLaMA-13... | 2025.05 | 70.1 |
| SeqKD | Teacher Model=OPT-13B,... | 2025.05 | 68.7 |
| KD | Teacher Model=LLaMA-13... | 2025.05 | 68.3 |
| Token-level | Teacher Model=OPT-13B,... | 2025.05 | 67.3 |
| Structured Agent Distillation | Teacher Model=OPT-13B,... | 2025.05 | 67.2 |
| KD | Teacher Model=OPT-13B,... | 2025.05 | 66.4 |
| SeqKD | Teacher Model=OPT-13B,... | 2025.05 | 63.4 |
| Token-level | Teacher Model=OPT-13B,... | 2025.05 | 61.5 |
| KD | Teacher Model=OPT-13B,... | 2025.05 | 61.5 |
| SeqKD | Teacher Model=OPT-13B,... | 2025.05 | 58 |
| KD | Teacher Model=OPT-13B,... | 2025.05 | 56.2 |