Language-driven scene representation on ALFRED Object Shift [OS]

F1 Score: 83.92

Falcon-7B

Updated 2mo ago

Evaluation Results

Method	Date	Links	F1 Score	(unlabeled)	(unlabeled)
Falcon-7B	2026.05		83.92	66.38	2.34
Mistral-7B	2026.05		82.72	63.46	2.75
LLaMA 3.1-8B	2026.05		80.44	58.86	4.27
Vicuna-7B	2026.05		80.03	51.67	4.40
Alpaca	2026.05		77.27	43.98	4.64
Flan-T5	2026.05		76.67	47.43	3.48
T5	2026.05		68.17	36.99	4.51