Sequential Decision Making on HotPotQA

4.7 Average Steps per Episode

Teacher (LLaMA-13B)

Updated 4mo ago

Evaluation Results

Method	Date	Average Steps per Episode
Teacher (LLaMA-13B)	2025.05	4.7
Teacher (OPT-13B)	2025.05	4.8
Structured Agent Distillation	2025.05	4.8
Structured Agent Distillation	2025.05	4.9
Structured Agent Distillation	2025.05	5
Token-level	2025.05	5.2
Structured Agent Distillation	2025.05	5.3
Token-level	2025.05	5.3
SeqKD	2025.05	5.5
Token-level	2025.05	5.6
SeqKD	2025.05	5.6
KD	2025.05	5.7
KD	2025.05	5.7
SeqKD	2025.05	5.9
Token-level	2025.05	6
KD	2025.05	6.1
SeqKD	2025.05	6.2
KD	2025.05	6.5