| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Path-level reasoning | UNOBench synthetic Hard (test) | SR-P | 56.8 | 10 |
| Path-level reasoning | UNOBench synthetic Medium (test) | SR-P (%) | 74.8 | 10 |
| Path-level reasoning | UNOBench synthetic Easy (test) | SR (Precision) | 82.8 | 10 |
| Path-level reasoning | UNOBench synthetic No obstructions (test) | SR (%) | 94.8 | 10 |
| Path-level reasoning | UNOBench real (Hard) | SR-P | 79.5 | 10 |
| Path-level reasoning | UNOBench real (Medium) | SR-P (%) | 76.6 | 10 |
| Path-level reasoning | UNOBench real (Easy) | SR (Precision) | 76.2 | 10 |
| Path-level reasoning | UNOBench real No obstructions | SR (%) | 72.5 | 10 |
| Object-level reasoning | UNOBench synthetic (Overall test) | OP | 79.7 | 8 |
| Object-level reasoning | UNOBench Hard synthetic (test) | OP | 67.6 | 8 |
| Object-level reasoning | UNOBench Medium synthetic (test) | OP | 77.8 | 8 |
| Object-level reasoning | UNOBench Easy synthetic (test) | OP | 81.3 | 8 |
| Object-level reasoning | UNOBench real set (Overall) | OP | 74.3 | 8 |
| Object-level reasoning | UNOBench real set (Hard) | OP | 56.4 | 8 |
| Object-level reasoning | UNOBench real set Medium | OP | 75 | 8 |
| Object-level reasoning | UNOBench real set (Easy) | OP | 0.757 | 8 |