
Language-driven scene representation on ALFRED In-Distribution [ID]

84.28 F1 Score

Mistral-7B

Updated 2mo ago

Evaluation Results

Method	Links	F1 Score
Mistral-7B 2026.05		84.28	68.58	2.58
Falcon-7B 2026.05		84.2	64.93	2.51
LLaMA 3.1-8B 2026.05		82.16	64.46	3.88
Vicuna-7B 2026.05		81.3	54.47	4.09
Alpaca 2026.05		80.59	45.36	4
Flan-T5 2026.05		78.05	39.31	3.81
T5 2026.05		71.11	30.38	5.04