
WorldArena

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
World Modeling | WorldArena (test) | Image Quality: 67.36 | 15
Video Generation | WorldArena | Interaction Quality: 68.2 | 14
Embodied World Modeling | WorldArena Robotwin | Interaction Quality Score: 0.682 | 9
Human evaluation of robot rollout generation | WorldArena rollouts | Task Success: 1.81 | 8
World Modeling | WorldArena | EWMScore: 59.7 | 7
Video Perception | WorldArena | Img Score: 0.449 | 5
Embodied Robotics Navigation | WorldArena | 2D nDTW: 9.006 | 4