Structure-Induced Information for Rerooting Levin Tree Search

About

Subgoal-based policy tree search, which uses a policy to guide search, is effective for complex single-agent deterministic problems but often relies on explicit subgoal generation, which can incur substantial overhead and hinder scalability. In this paper, we overcome these limitations by using a learned "rerooter" through the recently introduced $\sqrt{\text{LTS}}$ algorithm. A rerooter implicitly decomposes the problem into soft subtasks. While previous work focused on formal guarantees for given or handcrafted rerooters, in this work we propose three rerooter designs: (i) a clustering-based rerooter that exploits global state-space structure, (ii) a heuristic-based rerooter that leverages learned cost-to-go estimates, and (iii) a hybrid that combines both signals. Our framework avoids having to explicitly reconstruct and reason over generated subgoals, thereby enabling scalable allocation of search effort with significantly lower computational overhead. Empirically, our rerooting-based methods scale to complex environments where subgoal-based policy tree search fails, and achieve state-of-the-art online training efficiency on the domains tested.

Jake Tuero, Michael Buro, Laurent Orseau, Levi H. S. Lelis • 2026
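
For readers unfamiliar with policy tree search, the following is a minimal sketch of plain Levin Tree Search (not $\sqrt{\text{LTS}}$, and not any of the rerooter designs proposed in the paper). It assumes the commonly stated LTS ordering, in which nodes are expanded in increasing order of depth divided by the policy probability of the path, with the root counted as depth 1. The toy line-world domain, the fixed policy, and all function names are hypothetical and chosen only for illustration.

```python
import heapq
from itertools import count


def levin_tree_search(start, is_goal, successors, policy, max_expansions=10_000):
    """Best-first tree search ordered by depth / path probability.

    Illustrative only: no duplicate detection, no rerooting.
    Returns (path, expansions), with path None if the budget runs out.
    """
    tie = count()  # tie-breaker so the heap never compares states
    # Entry: (cost, tie, state, depth, path_probability, path)
    frontier = [(1.0, next(tie), start, 1, 1.0, [start])]
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, state, depth, prob, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, expansions
        expansions += 1
        action_probs = policy(state)
        for action, child in successors(state):
            child_prob = prob * action_probs[action]
            if child_prob <= 0.0:
                continue  # the policy rules this action out
            child_depth = depth + 1
            cost = child_depth / child_prob
            heapq.heappush(
                frontier,
                (cost, next(tie), child, child_depth, child_prob, path + [child]),
            )
    return None, expansions


# Hypothetical toy domain: integer positions on a line, two actions.
def toy_successors(state):
    return [("right", state + 1), ("left", state - 1)]


def toy_policy(state):
    # Fixed policy that prefers moving right.
    return {"right": 0.8, "left": 0.2}


if __name__ == "__main__":
    path, expansions = levin_tree_search(
        0, lambda s: s == 3, toy_successors, toy_policy
    )
    print(path, expansions)
```

Because the toy policy places most of its mass on the action that leads to the goal, the search reaches it after expanding only the nodes on the direct path. The abstract describes a rerooter as a way to redistribute search effort across implicit subtasks; that mechanism is not modeled here.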

Related benchmarks

Task	Dataset	Result
Combinatorial Search	BoulderDash (train)	Expansions 2.42e+7	7
Combinatorial Search	CraftWorld (train)	Search Expansions 1.56e+8	7
Combinatorial Search	Sokoban (train)	Search Expansions 9.52e+7	7
Combinatorial Search	TSP GridWorld (train)	Search Expansions 1.35e+7	7
Search-based planning	BoulderDash hard problems (test)	Solved Rate 100	7
Search-based planning	CraftWorld hard problems (test)	Success Rate 100	7
Search-based planning	Sokoban Boxoban 1,000 problems (test)	Solved Count 1.00e+3	7
Search-based planning	TSP GridWorld modified (test)	Solved Rate 100	7

