Robot Failure Analysis (MCQ) on RoboFAC Simulation
[Chart: FD Score. Highlighted entry: KITE+Qwen2.5-7B+QLoRA, FD Score 93, Apr 8, 2026. Metrics: FD Score, FI Score, FL Score.]
Updated 1mo ago
Evaluation Results
| Method | Finetuned | Date | FD Score | FI Score | FL Score |
|---|---|---|---|---|---|
| KITE+Qwen2.5-7B+QLoRA | true | 2026.04 | 93 | 69 | 92 |
| RoboFAC-7B | true | 2026.04 | 91 | 63 | 94 |
| KITE + Qwen2.5-VL-7B | false | 2026.04 | 88 | 44 | 55 |
| GPT-4o | false | 2026.04 | 64 | 21 | 71 |
| Qwen2.5-VL-7B | false | 2026.04 | 52 | 26 | 22 |
| Gemini-2.0 | false | 2026.04 | 48 | 27 | 75 |
| Qwen2.5-VL-3B | false | 2026.04 | 38 | 4 | 51 |