
Real-world benchmark

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video Matting | Real-world benchmark | MAD 0.19 | 8 |
| Failure Reasoning and Correction | Real-World Benchmark (test) | ROUGE-L 62.1 | 7 |
| Unseen Robot Adaptation | Real-world Benchmark | Motion Consistency 8.47 | 5 |
| Single Image Super-Resolution | Real-world benchmark (test) | NIQE 10.01 | 4 |
| Robot Manipulation | Real-World Benchmark 10 Tasks | Sequential Success Rate 75 | 3 |
| Failure Recovery | Real-World Benchmark | Recovery Rate 46 | 2 |