RynnBrain: Open Embodied Foundation Models
About
Despite rapid progress in multimodal foundation models, the embodied intelligence community still lacks a unified, physically grounded foundation model that integrates perception, reasoning, and planning within real-world spatiotemporal dynamics. We introduce RynnBrain, an open-source spatiotemporal foundation model for embodied intelligence. RynnBrain strengthens four core capabilities in a unified framework: comprehensive egocentric understanding, diverse spatiotemporal localization, physically grounded reasoning, and physics-aware planning. The RynnBrain family comprises three foundation model scales (2B, 8B, and 30B-A3B MoE) and four post-trained variants: three tailored for downstream embodied tasks (RynnBrain-Nav, RynnBrain-Plan, and RynnBrain-VLA) and one for complex spatial reasoning tasks (RynnBrain-CoP). In extensive evaluations on 20 embodied benchmarks and 8 general vision understanding benchmarks, the RynnBrain foundation models outperform existing embodied foundation models by a significant margin. The post-trained model suite further substantiates two key strengths of the RynnBrain foundation model: (i) it enables physically grounded reasoning and planning, and (ii) it serves as a strong pretrained backbone that can be efficiently adapted to diverse embodied tasks.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Egocentric daily-task planning | EgoPlanBench2 | Overall Success Rate | 34.8 | 44 |
| Long-horizon reasoning for robotic manipulation | RoboVQA | B-1 Score | 74.3 | 28 |
| Embodied Planning | Causal-Plan-Bench in-domain | Overall Success Rate | 37.43 | 16 |
| Next-Step-Prediction Style Planning | EgoPlan-Bench 2 | Overall Performance Score | 44.31 | 16 |
| Next-Step-Prediction Style Planning | RoboVQA | Performance Score | 60.25 | 16 |
| Next-Step-Prediction Style Planning | Cosmos Reason | Performance | 57.84 | 16 |