Language-driven scene representation on ALFRED Template Shift [TS]

84.9 F1 Score

Falcon-7B

Updated 2mo ago

Evaluation Results

Method	Links	F1 Score
Falcon-7B 2026.05		84.9	65.07	2.52
Mistral-7B 2026.05		83.97	62.93	2.75
LLaMA 3.1-8B 2026.05		81.87	59.65	4.08
Vicuna-7B 2026.05		81.37	53.39	4
Alpaca 2026.05		80.01	46.92	4.29
Flan-T5 2026.05		79.57	46.32	3.54
T5 2026.05		73.28	35.16	4.45