Robot Failure Detection on RLBench Fail
[Chart: Execution Accuracy and Planning Accuracy over time. Labeled point: Guardian-8B-Thinking, Execution Accuracy 83, Dec 1, 2025.]
Updated 4d ago
Evaluation Results
| Method | Model Category | Date | Execution Accuracy | Planning Accuracy |
|---|---|---|---|---|
| Guardian-8B-Thinking | Special... | 2025.12 | 83 | 87 |
| CLIP+MLP | Special... | 2025.12 | 65 | 53 |
| GPT4.1 | Large-s... | 2025.12 | 63 | 87 |
| Qwen3-VL-235B-A22B | Large-s... | 2025.12 | 59 | 83 |
| InternVL3-8B | Special... | 2025.12 | 59 | 70 |
| Sentinel | Special... | 2025.12 | 57 | - |
| Cosmos-Reason1-7B | Special... | 2025.12 | 54 | 60 |