| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Object Instance Segmentation | ARMBench Mixed-Object Tote (test) | mAP50 86.37 | 44 |
| Object Identification | ARMBench (test) | Recall@1 98 | 10 |
| Reward Modeling | ARMBench-VL ours (test) | FG Score 67.6 | 7 |
| Failure Detection and Reasoning | ARMBench | Detect Acc. 65 | 6 |
| Post-stow bin state prediction | ARMBench Bin Sweep instance-mask space (test) | N-IoU 64.22 | 4 |
| Post-stow bin state prediction | ARMBench Direct Insert instance-mask space (test) | N-IoU 70.21 | 4 |
| Defect Detection | ARMBench | Multi-Pick Precision 84 | 3 |
| Failure Detection and Reasoning | ARMBench S→A | Detection Accuracy 72.5 | 2 |
| Object Instance Segmentation | ARMBench Same-Object Tote (test) | mAP50 15 | 2 |
| Object Instance Segmentation | ARMBench Zoomed-Out Tote (test) | mAP50 57 | 2 |