Sequential Decision Making on HotPotQA

Average Steps per Episode: 4.7

Teacher (LLaMA-13B)

4.6285.1145.66.086
May 20, 2025
Updated 1mo ago

Evaluation Results

Date       Average Steps per Episode   Method   Links
2025.05    4.7
2025.05    4.8
2025.05    4.8
2025.05    4.9
2025.05    5
2025.05    5.2
2025.05    5.3
2025.05    5.3
2025.05    5.5
2025.05    5.6
2025.05    5.6
2025.05    5.7
2025.05    5.7
2025.05    5.9
2025.05    6
2025.05    6.1
2025.05    6.2
2025.05    6.5