Embodied Task Completion on ALFRED EB
Top Avg Score: 32.2 (Gemma-3-12B-IT + µOffline Sup.)
[Chart: scores by date (Jan 28, 2026). Metrics: Avg Score, Base Score, Common Score, Complex Score, Spatial Score, Long Score]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Avg Score | Base Score | Common Score | Complex Score | Spatial Score | Long Score |
|---|---|---|---|---|---|---|---|---|
| Gemma-3-12B-IT + µOffline Sup. | Model Backbone=Gemma-3... | 2026.01 | 32.2 | 48 | 31 | 32 | 23 | 27 |
| Gemma-3-12B-IT + µSimple | Model Backbone=Gemma-3... | 2026.01 | 28 | 41 | 27 | 34 | 16 | 22 |
| Gemma-3-12B-IT + µOnline RL | Model Backbone=Gemma-3... | 2026.01 | 27.8 | 38 | 29 | 41 | 21 | 26 |
| Gemma-3-12B-IT | Model Backbone=Gemma-3... | 2026.01 | 25.6 | 32 | 26 | 38 | 20 | 12 |
| Qwen2.5-VL-7B-Ins + µOnline RL | Model Backbone=Qwen2.5... | 2026.01 | 14.2 | 10 | 13 | 21 | 3 | 24 |
| Qwen2.5-VL-7B-Ins + µOffline Sup. | Model Backbone=Qwen2.5... | 2026.01 | 12.2 | 9 | 10 | 14 | 2 | 26 |
| Qwen2.5-VL-7B-Ins + µSimple | Model Backbone=Qwen2.5... | 2026.01 | 9.6 | 15 | 10 | 12 | 2 | 14 |
| Qwen2.5-VL-7B-Ins | Model Backbone=Qwen2.5... | 2026.01 | 4.7 | 10 | 8 | 6 | 0 | 2 |
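To compare the variants with their plain backbones, the Avg Score figures listed above can be turned into per-backbone gains. The snippet below is a minimal sketch, not part of the benchmark tooling: the dictionary layout, the `baseline` key, and the function name `gains_over_baseline` are hypothetical choices made for illustration. The listed Avg Score is used as-is rather than recomputed, since it does not always equal the mean of the five sub-scores shown.

```python
# Avg Score values copied from the results above.
# "baseline" is a hypothetical label for the backbone without any µ variant.
AVG_SCORES = {
    "Gemma-3-12B-IT": {
        "baseline": 25.6,
        "µOffline Sup.": 32.2,
        "µSimple": 28.0,
        "µOnline RL": 27.8,
    },
    "Qwen2.5-VL-7B-Ins": {
        "baseline": 4.7,
        "µOnline RL": 14.2,
        "µOffline Sup.": 12.2,
        "µSimple": 9.6,
    },
}


def gains_over_baseline(scores):
    """Return {variant: Avg Score minus the backbone's Avg Score}, rounded to 0.1."""
    base = scores["baseline"]
    return {
        name: round(value - base, 1)
        for name, value in scores.items()
        if name != "baseline"
    }


if __name__ == "__main__":
    for backbone, scores in AVG_SCORES.items():
        gains = gains_over_baseline(scores)
        # Largest gain first within each backbone.
        for variant, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
            print(f"{backbone} + {variant}: {gain:+.1f}")
```

On these numbers, every variant improves on its backbone's Avg Score, and the absolute gains are larger for Qwen2.5-VL-7B-Ins, which starts from a much lower baseline.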