Recursive Think-Answer Process for LLMs and VLMs

About

Think-Answer reasoners such as DeepSeek-R1 have made notable progress by leveraging interpretable internal reasoning. However, despite the frequent presence of self-reflective cues like "Oops!", they remain vulnerable to output errors during single-pass inference. To address this limitation, we propose an efficient Recursive Think-Answer Process (R-TAP) that enables models to engage in iterative reasoning cycles and generate more accurate answers, going beyond conventional single-pass approaches. Central to this approach is a confidence generator that evaluates the certainty of model responses and guides subsequent improvements. By incorporating two complementary rewards, the Recursively Confidence Increase Reward and the Final Answer Confidence Reward, we show that R-TAP-enhanced models consistently outperform conventional single-pass methods for both large language models (LLMs) and vision-language models (VLMs). Moreover, by analyzing the frequency of "Oops"-like expressions in model responses, we find that R-TAP-applied models exhibit significantly fewer self-reflective patterns, resulting in more stable and faster inference-time reasoning. We hope R-TAP paves the way toward efficient and elaborate methods for refining the reasoning processes of future AI.
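
The abstract does not give formulas or an interface for R-TAP, so the following is only a minimal sketch of the idea it describes: answer, score the answer with a confidence generator, and repeat. The function names, the stopping threshold, the round cap, and both reward definitions are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List, Tuple


def recursive_think_answer(
    question: str,
    generate: Callable[[str, List[str]], str],   # hypothetical: model call given prior answers
    confidence: Callable[[str, str], float],     # hypothetical: confidence generator in [0, 1]
    threshold: float = 0.9,                      # assumed stopping threshold
    max_rounds: int = 4,                         # assumed cap on think-answer cycles
) -> Tuple[str, List[float]]:
    """Run think-answer cycles until confidence is high enough or rounds run out."""
    history: List[str] = []
    scores: List[float] = []
    for _ in range(max_rounds):
        answer = generate(question, history)
        score = confidence(question, answer)
        history.append(answer)
        scores.append(score)
        if score >= threshold:
            break
    return history[-1], scores


def confidence_increase_reward(scores: List[float]) -> float:
    """Assumed form: fraction of consecutive cycles in which confidence rose."""
    if len(scores) < 2:
        return 0.0
    rises = sum(1 for prev, cur in zip(scores, scores[1:]) if cur > prev)
    return rises / (len(scores) - 1)


def final_answer_confidence_reward(scores: List[float]) -> float:
    """Assumed form: confidence assigned to the last answer."""
    return scores[-1] if scores else 0.0
```

In this sketch, `generate` and `confidence` are stand-ins that a caller would back with a real model and a real confidence generator; the two reward functions only illustrate what "confidence increases across cycles" and "the final answer is confident" could mean numerically.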

Byung-Kwan Lee, Youngchae Chee, Yong Man Ro • 2026

Related benchmarks

Task                               Dataset           Metric    Result  Rank
Mathematical Multimodal Reasoning  MathVerse         Accuracy  61.8    221
Mathematical Multimodal Reasoning  MathVista         Accuracy  80.2    218
Mathematical Reasoning             Minerva Math      Accuracy  43.8    186
Multimodal Math Reasoning          MathVision        Accuracy  39.9    183
Multimodal Math Reasoning          WeMath            Accuracy  79.3    168
Mathematical Reasoning             AIME 2024 (test)  Accuracy  28.3    159
Mathematics                        MATH 500          Pass@1    97.3    95
Mathematical Reasoning             MATH500           Accuracy  83.5    82
Reading Comprehension              DROP              F1 Score  84.5    73
Mathematical Reasoning             OlympiadBench     Accuracy  0.538   72
Showing 10 of 31 rows
