MIRROR: Multi-agent Intra- and Inter-Reflection for Optimized Reasoning in Tool Learning

About

Complex tasks involving tool integration pose significant challenges for Large Language Models (LLMs), leading to the emergence of multi-agent workflows as a promising solution. Reflection has emerged as an effective strategy for correcting erroneous trajectories in agentic workflows. However, existing approaches exploit this capability only in the post-action stage, where the agent observes the execution outcomes. We argue that, like humans, LLMs can also engage in reflection before action execution: the agent can anticipate undesirable outcomes from its own decisions, which not only provides a necessary complementary perspective for evaluating the decision but also prevents errors from propagating through the trajectory. In this paper, we propose MIRROR, a framework that consists of both intra-reflection, which critically assesses intended actions before execution, and inter-reflection, which further adjusts the trajectory based on observations. This design systematically leverages LLM reflection capabilities to eliminate and rectify erroneous actions across a more comprehensive scope. Evaluations on both the StableToolBench and TravelPlanner benchmarks demonstrate MIRROR's superior performance, achieving state-of-the-art results compared to existing approaches.

Zikang Guo, Benfeng Xu, Xiaorui Wang, Zhendong Mao • 2025
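
The two reflection stages described in the abstract can be pictured as a simple control loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names (`run_with_reflection`, `propose`, `intra_reflect`, `execute`, `inter_reflect`), the `Step` record, and the retry limits are all hypothetical, and the toy callbacks stand in for LLM and tool calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    action: str
    observation: str


def run_with_reflection(
    propose: Callable[[List[Step], Optional[str]], str],
    intra_reflect: Callable[[List[Step], str], Optional[str]],
    execute: Callable[[str], str],
    inter_reflect: Callable[[List[Step]], Optional[str]],
    max_steps: int = 5,
    max_revisions: int = 2,
) -> List[Step]:
    """Hypothetical loop: critique each action before running it
    (intra-reflection), then adjust based on the observation
    (inter-reflection)."""
    trajectory: List[Step] = []
    feedback: Optional[str] = None
    for _ in range(max_steps):
        action = propose(trajectory, feedback)
        # Intra-reflection: assess the intended action BEFORE execution.
        # A critique of None means the action is accepted as-is.
        for _ in range(max_revisions):
            critique = intra_reflect(trajectory, action)
            if critique is None:
                break
            action = propose(trajectory, critique)
        # If revisions run out, the last revised action is executed anyway.
        observation = execute(action)
        trajectory.append(Step(action, observation))
        # Inter-reflection: use the observation to decide whether to stop
        # (None) or to feed guidance into the next proposal.
        feedback = inter_reflect(trajectory)
        if feedback is None:
            break
    return trajectory


# Toy stand-ins for the LLM and the tool (assumptions for illustration).
def propose(trajectory, feedback):
    return "search(city=?)" if feedback is None else "search(city=Paris)"


def intra_reflect(trajectory, action):
    return "missing city argument" if "?" in action else None


def execute(action):
    return "ok" if "Paris" in action else "error"


def inter_reflect(trajectory):
    return None if trajectory[-1].observation == "ok" else "retry with a valid city"


trajectory = run_with_reflection(propose, intra_reflect, execute, inter_reflect)
```

In this toy run, the malformed first action is caught by the pre-execution critique and revised before any tool call is made, which is the error-prevention effect the abstract attributes to intra-reflection.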

Related benchmarks

Task	Dataset	Result	Rank
Travel Planning	TravelPlanner (val)	Delivery Rate: 100	42
Constraint Satisfaction Plan Generation	TravelPlanner	Delivery Rate: 100	11

