Our new X account is live! Follow @wizwand_team for updates
SOTA Failure Detection and Reasoning benchmarks and papers with code | Wizwand
Failure Detection and Reasoning
Benchmarks
| Dataset Name | SOTA Method | Metric | Results | Last Updated |
| --- | --- | --- | --- | --- |
| Sparrow | ARMOR | Detection Accuracy: 73.3 | 8 | 4d ago |
| RLBench | ARMOR | Detection Accuracy: 91.7 | 8 | 4d ago |
| ARMBench | Claude-3.7 | Detect Acc.: 65 | 6 | 4d ago |
| Maniskill | SFT-D | Detection Accuracy: 78.8 | 6 | 4d ago |
| ARMBench S→A | ARMOR | Detection Accuracy: 72.5 | 2 | 4d ago |
| Maniskill R→M | ARMOR | Detection Accuracy: 99 | 2 | 4d ago |