
S3-Eval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Streaming Spatial Reasoning | S3-Eval Sim | Overall Accuracy: 80.5 | 20 |
| Embodied Spatial Reasoning | S3-Eval (real part) | Overall Score: 82.1 | 20 |
| Active Spatial Understanding | S3-Eval (simulation) | Overall Score: 62.9 | 4 |
| Active Vision Spatial Understanding | S3-Eval real | Overall Score: 57.8 | 4 |