EgoPlan-bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Long-horizon procedural planning | EgoPlan-Bench All | Success Rate: 58.72 | 13 |
| Long-horizon procedural planning | EgoPlan-Bench Out-of-Domain | Success Rate: 54.37 | 9 |
| Long-horizon procedural planning | EgoPlan-Bench In-Domain | Success Rate: 62.46 | 9 |
| Egocentric Action Planning | EgoPlan-bench v2 (test) | Daily life Success Rate: 64.01 | 7 |