Self-Reflection in LLM Agents: Effects on Problem-Solving Performance

About

In this study, we investigated the effects of self-reflection in large language models (LLMs) on problem-solving performance. We instructed nine popular LLMs to answer a series of multiple-choice questions to provide a performance baseline. For each incorrectly answered question, we instructed eight types of self-reflecting LLM agents to reflect on their mistakes and provide themselves with guidance to improve problem-solving. Then, using this guidance, each self-reflecting agent attempted to re-answer the same questions. Our results indicate that LLM agents are able to significantly improve their problem-solving performance through self-reflection ($p < 0.001$). In addition, we compared the various types of self-reflection to determine their individual contribution to performance. All code and data are available on GitHub at https://github.com/matthewrenze/self-reflection
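The answer, reflect, re-answer procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation (see their repository for that). The `Model` type, the prompt wording, the question dictionary keys (`text`, `options`, `answer`), and the choice to carry baseline-correct answers over unchanged are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical: a "model" is any callable that maps a prompt to a text reply.
Model = Callable[[str], str]


def answer(model: Model, question: Dict[str, str], guidance: str = "") -> str:
    """Ask the model for a single-letter answer, optionally with guidance."""
    prompt = (
        f"{guidance}\n"
        f"Question: {question['text']}\n"
        f"Options: {question['options']}\n"
        "Answer with a single letter."
    )
    return model(prompt).strip()[:1].upper()


def reflect(model: Model, question: Dict[str, str], wrong_answer: str) -> str:
    """Ask the model to reflect on a wrong answer and write guidance for itself."""
    prompt = (
        "Reflect on the mistake below and write guidance for a second attempt.\n"
        f"Question: {question['text']}\n"
        f"Options: {question['options']}\n"
        f"Your incorrect answer: {wrong_answer}"
    )
    return model(prompt)


def run(model: Model, questions: List[Dict[str, str]]) -> Dict[str, float]:
    """Return accuracy before and after one round of self-reflection."""
    baseline_correct = 0
    retry_correct = 0
    for q in questions:
        first = answer(model, q)
        if first == q["answer"]:
            # Assumption: correct baseline answers are kept as-is.
            baseline_correct += 1
            retry_correct += 1
            continue
        # Only incorrectly answered questions trigger reflection and a retry.
        guidance = reflect(model, q, first)
        if answer(model, q, guidance) == q["answer"]:
            retry_correct += 1
    n = len(questions)
    return {"baseline": baseline_correct / n, "after_reflection": retry_correct / n}
```

Any callable that takes a prompt string and returns a string can stand in for `model`, so the loop can be exercised with a deterministic stub before wiring in a real LLM client.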

Matthew Renze, Erhan Guven • 2024

Related benchmarks

Task	Dataset	Result
Embodied decision-making	AlfWorld	Success Rate: 66.4	51
Context-aware time-series reasoning	CiK (Context-is-Key)	Average RCRPS: 0.1608	17
Decision Making	Webshop	Success Rate: 33.4	15
E-commerce environment navigation	Webshop	ECE (End-State): 0.31	7

