
Long-horizon procedural planning on EgoPlan-Bench

Success Rate: 58.72

PlanAgent + Mem.

Updated 4mo ago

Evaluation Results

Method	Date	Success Rate
PlanAgent + Mem.	2026.03	58.72
GPT-5.1	2026.03	54.78
PlanAgent + Mem.	2026.03	53.29
Video-LLaMA	2026.03	51.83
Video-LLaMA	2026.03	48.94
PlanAgent + Mem.	2026.03	43.63
GPT-4V	2026.03	37.98
PlanAgent (Ours)	2026.03	35.81
Gemini-Pro-Vision	2026.03	30.46
SEED-LLaMA	2026.03	29.93
Video-LLaMA	2026.03	28.58
Qwen-VL-Chat	2026.03	27.69
DeepSeek-VL-Chat	2026.03	27.57