
OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft

About

The choice of action spaces is a critical yet unresolved challenge in developing capable, end-to-end trainable agents. This paper first presents a large-scale, systematic comparison of prominent abstracted action spaces and tokenizers for Vision-Language-Action (VLA) or hierarchical agent models in the open-ended world of Minecraft. Our analysis reveals that no single action space is universally optimal; instead, the most effective abstraction is highly task-dependent, creating a dilemma for building generalist agents. To resolve this, we introduce Chain of Action (CoA), a novel framework that unifies high-level planning and low-level control within a single, monolithic VLA model. CoA treats an abstracted action not as a command for a separate policy, but as an intermediate reasoning step, akin to a chain of thought, that guides the generation of the final, executable action. Furthermore, we demonstrate that an All-in-One agent trained on a diverse mixture of action spaces using the CoA paradigm learns a more robust and generalizable policy. This unified agent achieves a new state of the art, improving the overall task success rate over strong, specialized baselines. To foster reproducible research, we release the OpenHA (Open Hierarchical Agents) suite, which includes our comprehensive benchmark of over 800 distinct tasks, curated datasets, source code, and all pretrained model checkpoints at https://github.com/CraftJarvis/OpenHA.
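The Chain of Action idea described above can be illustrated with a toy sketch: a single generator first emits an abstracted action, then conditions on it to emit the executable action. This is not the OpenHA implementation. The `Generate` type, the `<obs>`, `<task>`, `<abstract>`, and `<action>` markers, the action strings, and `toy_generate` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a VLA model: maps a list of context
# segments to the next generated segment. Not the OpenHA API.
Generate = Callable[[List[str]], str]


@dataclass
class Step:
    abstract_action: str  # intermediate step, analogous to a chain of thought
    env_action: str       # final, executable low-level action


def chain_of_action_step(generate: Generate, observation: str, instruction: str) -> Step:
    """Decode one step with the same model in two phases."""
    prefix = [f"<obs>{observation}", f"<task>{instruction}"]
    # Phase 1: the model proposes an abstracted action.
    abstract = generate(prefix + ["<abstract>"])
    # Phase 2: the same model, with the abstracted action in its context,
    # produces the executable action. No separate low-level policy is called.
    env_action = generate(prefix + ["<abstract>", abstract, "<action>"])
    return Step(abstract, env_action)


def toy_generate(context: List[str]) -> str:
    """Toy rule-based generator used only to make the sketch runnable."""
    if context[-1] == "<abstract>":
        return "approach(tree)"
    # The low-level action depends on the abstracted action in context.
    abstract = context[-2]
    return "forward" if abstract.startswith("approach") else "noop"


step = chain_of_action_step(toy_generate, "tree ahead", "chop a tree")
print(step)
```

The point of the sketch is the contrast with a two-model hierarchy: here the abstracted action is just more context for the same generator, rather than a command passed to a separate controller.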

Zihao Wang, Muyao Li, Kaichen He, Xiangyu Wang, Zhancun Mu, Anji Liu, Yitao Liang • 2025

Related benchmarks

Task            Dataset        Metric         Result    Rank
Combat Tasks    MCU Mini       Success Rate   40        6
Combat Tasks    MCU All set    Steps          316       6
Embodied Tasks  MCU Mini       Success Rate   37        6
Embodied Tasks  MCU All set    Steps          287       6
GUI Tasks       MCU Mini set   Success Rate   3.33e+3   5
GUI Tasks       MCU All set    Steps          314       5