
UNOBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Path-level reasoning | UNOBench synthetic Hard (test) | SR-P 56.8 | 10 |
| Path-level reasoning | UNOBench synthetic Medium (test) | SR-P (%) 74.8 | 10 |
| Path-level reasoning | UNOBench synthetic Easy (test) | SR (Precision) 82.8 | 10 |
| Path-level reasoning | UNOBench synthetic No obstructions (test) | SR (%) 94.8 | 10 |
| Path-level reasoning | UNOBench real (Hard) | SR-P 79.5 | 10 |
| Path-level reasoning | UNOBench real (Medium) | SR-P (%) 76.6 | 10 |
| Path-level reasoning | UNOBench real (Easy) | SR (Precision) 76.2 | 10 |
| Path-level reasoning | UNOBench real No obstructions | SR (%) 72.5 | 10 |
| Object-level reasoning | UNOBench synthetic (Overall test) | OP 79.7 | 8 |
| Object-level reasoning | UNOBench Hard synthetic (test) | OP 67.6 | 8 |
| Object-level reasoning | UNOBench Medium synthetic (test) | OP 77.8 | 8 |
| Object-level reasoning | UNOBench Easy synthetic (test) | OP 81.3 | 8 |
| Object-level reasoning | UNOBench real set (Overall) | OP 74.3 | 8 |
| Object-level reasoning | UNOBench real set (Hard) | OP 56.4 | 8 |
| Object-level reasoning | UNOBench real set Medium | OP 75 | 8 |
| Object-level reasoning | UNOBench real set (Easy) | OP 0.757 | 8 |