Structure-Induced Information for Rerooting Levin Tree Search

About

Subgoal-based policy tree search, which uses a policy to guide search, is effective for complex single-agent deterministic problems, but it often relies on explicit subgoal generation, which can incur substantial overhead and hinder scalability. In this paper, we overcome these limitations by using a learned "rerooter" through the recently introduced $\sqrt{\text{LTS}}$ algorithm. A rerooter implicitly decomposes the problem into soft subtasks. Whereas previous work focused on formal guarantees for given or handcrafted rerooters, we propose three rerooter designs: (i) a clustering-based rerooter that exploits global state-space structure, (ii) a heuristic-based rerooter that leverages learned cost-to-go estimates, and (iii) a hybrid that combines both signals. Our framework avoids explicitly reconstructing and reasoning over generated subgoals, thereby enabling scalable allocation of search effort with significantly lower computational overhead. Empirically, our rerooting-based methods scale to complex environments where subgoal-based policy tree search fails, and they achieve state-of-the-art online training efficiency on the domains tested.

Jake Tuero, Michael Buro, Laurent Orseau, Levi H. S. Lelis • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Combinatorial Search | BoulderDash (train) | Expansions | 2.42e+7 | 7 |
| Combinatorial Search | CraftWorld (train) | Search Expansions | 1.56e+8 | 7 |
| Combinatorial Search | Sokoban (train) | Search Expansions | 9.52e+7 | 7 |
| Combinatorial Search | TSP GridWorld (train) | Search Expansions | 1.35e+7 | 7 |
| Search-based planning | BoulderDash hard problems (test) | Solved Rate | 100 | 7 |
| Search-based planning | CraftWorld hard problems (test) | Success Rate | 100 | 7 |
| Search-based planning | Sokoban Boxoban 1,000 problems (test) | Solved Count | 1.00e+3 | 7 |
| Search-based planning | TSP GridWorld modified (test) | Solved Rate | 100 | 7 |