Robot Failure Detection on RLBench Fail

83 Execution Accuracy

Guardian-8B-Thinking

Updated 4d ago

Evaluation Results

Method	Date	Execution Accuracy	(unlabeled)
Guardian-8B-Thinking	2025.12	83	87
CLIP+MLP	2025.12	65	53
GPT4.1	2025.12	63	87
Qwen3-VL-235B-A22B	2025.12	59	83
InternVL3-8B	2025.12	59	70
Sentinel	2025.12	57	-
Cosmos-Reason1-7B	2025.12	54	60