Unifying Deep Predicate Invention with Pre-trained Foundation Models
About
Long-horizon robotic tasks are hard due to continuous state-action spaces and sparse feedback. Symbolic world models help by decomposing tasks into discrete predicates that capture object properties and relations. Existing methods learn predicates either top-down, by prompting foundation models without data grounding, or bottom-up, from demonstrations without high-level priors. We introduce UniPred, a bilevel learning framework that unifies both. UniPred uses large language models (LLMs) to propose predicate effect distributions that supervise neural predicate learning from low-level data, while learned feedback iteratively refines the LLM hypotheses. Leveraging strong visual foundation model features, UniPred learns robust predicate classifiers in cluttered scenes. We further propose a predicate evaluation method that supports symbolic models beyond STRIPS assumptions. Across five simulated domains and one real-robot domain, UniPred achieves 2-4 times higher success rates than top-down methods and 3-4 times faster learning than bottom-up approaches, advancing scalable and flexible symbolic world modeling for robotics.
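The bilevel loop described above can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: `propose_predicates` stands in for the LLM proposer, `fit_classifier` for neural predicate learning, and the names, scoring rule, and threshold are all assumptions made for the sketch.

```python
# Hypothetical sketch of a UniPred-style bilevel loop (names and scoring
# are illustrative assumptions, not the paper's actual API): an outer
# LLM-proposal step and an inner data-grounded learning step, coupled by
# feedback on which predicates failed to ground.

def propose_predicates(feedback):
    """Stand-in for an LLM call proposing predicate hypotheses.
    Here we simply drop any predicate the inner loop flagged."""
    candidates = ["Holding", "OnTable", "Clear", "Inside"]
    return [p for p in candidates if p not in feedback]

def fit_classifier(predicate, demos):
    """Stand-in for neural predicate learning: score how well the
    proposed predicate grounds in low-level demonstration data.
    Toy score: fraction of demos in which the predicate appears."""
    hits = sum(predicate in d for d in demos)
    return hits / len(demos)

def unipred_bilevel(demos, rounds=3, threshold=0.5):
    feedback = set()
    scores = {}
    for _ in range(rounds):
        predicates = propose_predicates(feedback)
        scores = {p: fit_classifier(p, demos) for p in predicates}
        # Ungrounded predicates are fed back to the proposer, which
        # refines its hypotheses in the next round.
        rejected = {p for p, s in scores.items() if s < threshold}
        if not rejected:
            return scores
        feedback |= rejected
    return {p: s for p, s in scores.items() if s >= threshold}

demos = [["Holding", "Clear"], ["Holding", "OnTable"], ["OnTable", "Clear"]]
print(unipred_bilevel(demos))
```

In this toy run, "Inside" never appears in the demonstrations, so it is rejected in the first round and the proposer omits it thereafter, mirroring how learned feedback prunes ungrounded LLM hypotheses.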
Related benchmarks
| Task | Dataset | Success Rate (%) | Rank |
|---|---|---|---|
| Task Planning | Satellites SE2 (train) | 99.6 | 9 |
| Task Planning | Satellites SE2 (test) | 95.2 | 9 |
| Task Planning | Blocks Vec3 distribution (train) | 100 | 9 |
| Task Planning | Blocks Vec3 (test) | 81.6 | 9 |
| Task Planning | Table Clean Sim SE2 distribution (train) | 96.4 | 9 |
| Task Planning | Table Clean Sim SE2 (test) | 93.4 | 9 |
| Task Planning | Tools PCD (train) | 100 | 8 |
| Task Planning | Tools PCD distribution (test) | 100 | 8 |
| Task Planning | Packing Image (train) | 1 | 8 |
| Task Planning | Packing Image (test) | 100 | 8 |