Embodied Instruction Following on ALFWorld official (val)

Success Rate: 65.3

Llama 3.1 405B

Updated 4mo ago

Evaluation Results

Method	Links	Success Rate
Llama 3.1 405B 2025.12		65.3
Qwen 2.5 72B 2025.12		63.5
GPT-OSS 120B 2025.12		60.4
Llama 3.1 70B 2025.12		60.1
GenEnv 2025.12		54.5
GPT-OSS 20B 2025.12		53.6
Qwen 3 32B 2025.12		52.3
Qwen 3 14B 2025.12		37.8
ReSearch 2025.12		18.7
SearchR1 2025.12		16.1
Qwen 2.5 7B 2025.12		14.2
ToRL 2025.12		8