Unifying Deep Predicate Invention with Pre-trained Foundation Models
About
Long-horizon robotic tasks are hard due to continuous state-action spaces and sparse feedback. Symbolic world models help by decomposing tasks into discrete predicates that capture object properties and relations. Existing methods learn predicates either top-down, by prompting foundation models without data grounding, or bottom-up, from demonstrations without high-level priors. We introduce UniPred, a bilevel learning framework that unifies both. UniPred uses large language models (LLMs) to propose predicate effect distributions that supervise neural predicate learning from low-level data, while learned feedback iteratively refines the LLM hypotheses. Leveraging strong visual foundation model features, UniPred learns robust predicate classifiers in cluttered scenes. We further propose a predicate evaluation method that supports symbolic models beyond STRIPS assumptions. Across five simulated domains and one real-robot domain, UniPred achieves 2-4 times higher success rates than top-down methods and 3-4 times faster learning than bottom-up approaches, advancing scalable and flexible symbolic world modeling for robotics.
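The bilevel loop described above can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: `propose_predicates` stands in for the LLM proposer, `fit_classifier` for neural predicate learning, and the names, scoring rule, and threshold are all assumptions made for the sketch.

```python
# Hypothetical sketch of a UniPred-style bilevel loop (names and scoring
# are illustrative assumptions, not the paper's actual API): an outer
# LLM-proposal step and an inner data-grounded learning step, coupled by
# feedback on which predicates failed to ground.

def propose_predicates(feedback):
    """Stand-in for an LLM call proposing predicate hypotheses.
    Here we simply drop any predicate the inner loop flagged."""
    candidates = ["Holding", "OnTable", "Clear", "Inside"]
    return [p for p in candidates if p not in feedback]

def fit_classifier(predicate, demos):
    """Stand-in for neural predicate learning: score how well the
    proposed predicate grounds in low-level demonstration data.
    Toy score: fraction of demos in which the predicate appears."""
    hits = sum(predicate in d for d in demos)
    return hits / len(demos)

def unipred_bilevel(demos, rounds=3, threshold=0.5):
    feedback = set()
    scores = {}
    for _ in range(rounds):
        predicates = propose_predicates(feedback)
        scores = {p: fit_classifier(p, demos) for p in predicates}
        # Ungrounded predicates are fed back to the proposer, which
        # refines its hypotheses in the next round.
        rejected = {p for p, s in scores.items() if s < threshold}
        if not rejected:
            return scores
        feedback |= rejected
    return {p: s for p, s in scores.items() if s >= threshold}

demos = [["Holding", "Clear"], ["Holding", "OnTable"], ["OnTable", "Clear"]]
print(unipred_bilevel(demos))
```

In this toy run, "Inside" never appears in the demonstrations, so it is rejected in the first round and the proposer omits it thereafter, mirroring how learned feedback prunes ungrounded LLM hypotheses.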
Related benchmarks
| Task | Dataset | Success Rate (%) | Rank |
|---|---|---|---|
| Task Planning | Satellites SE2 (train) | 99.6 | 9 |
| Task Planning | Satellites SE2 (test) | 95.2 | 9 |
| Task Planning | Blocks Vec3 distribution (train) | 100 | 9 |
| Task Planning | Blocks Vec3 (test) | 81.6 | 9 |
| Task Planning | Table Clean Sim SE2 distribution (train) | 96.4 | 9 |
| Task Planning | Table Clean Sim SE2 (test) | 93.4 | 9 |
| Task Planning | Tools PCD (train) | 100 | 8 |
| Task Planning | Tools PCD distribution (test) | 100 | 8 |
| Task Planning | Packing Image (train) | 1 | 8 |
| Task Planning | Packing Image (test) | 100 | 8 |