MetaCrit: A Critical Thinking Framework for Self-Regulated LLM Reasoning

About

Large language models (LLMs) fail on over one-third of multi-hop questions with counterfactual premises and remain vulnerable to adversarial prompts that trigger biased or factually incorrect responses, which exposes a fundamental deficit in self-regulated reasoning. We propose MetaCrit, a multi-agent framework grounded in Nelson and Narens' metacognitive regulation theory. MetaCrit decomposes reasoning regulation into four agents: an object-level generation agent, a monitoring agent that assesses response validity, a control agent that critiques logical soundness, and a meta-level synthesizer that integrates all signals into a final response. Evaluation across eight benchmarks, four model backbones, and a college-level analytical writing study shows that MetaCrit significantly improves content truthfulness and logical soundness while eliminating toxic outputs. Its modular design allows individual agents to be integrated into existing frameworks as drop-in components without architectural modifications.
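The four-agent decomposition described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `LLM` callable type, the `MetaCritTrace` container, the `run_pipeline` function, and all prompt wording are hypothetical. The sketch also assumes that the monitoring and control agents each see only the question and the draft, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


@dataclass
class MetaCritTrace:
    """Intermediate and final outputs of one pass (hypothetical container)."""
    draft: str       # object-level generation
    monitoring: str  # validity assessment of the draft
    control: str     # critique of the draft's logical soundness
    final: str       # meta-level synthesis of all signals


def run_pipeline(question: str, llm: LLM) -> MetaCritTrace:
    # Object level: produce an initial answer.
    draft = llm(f"Answer the question.\nQuestion: {question}")

    # Monitoring agent: assess whether the draft is a valid response.
    # Assumption: it sees only the question and the draft.
    monitoring = llm(
        "Assess the validity of the response.\n"
        f"Question: {question}\nResponse: {draft}"
    )

    # Control agent: critique the logical soundness of the draft.
    # Assumption: it runs independently of the monitoring agent.
    control = llm(
        "Critique the logical soundness of the response.\n"
        f"Question: {question}\nResponse: {draft}"
    )

    # Meta-level synthesizer: integrate all signals into a final response.
    final = llm(
        "Write a final response using the draft and both reviews.\n"
        f"Question: {question}\nDraft: {draft}\n"
        f"Validity assessment: {monitoring}\nLogic critique: {control}"
    )
    return MetaCritTrace(draft, monitoring, control, final)
```

Because each stage is an ordinary call through the same `llm` interface, a single agent (for example, only the monitoring step) could be lifted into another pipeline, which is one way to read the "drop-in component" claim.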

Xinmeng Hou, Ziting Chang, Zhouquan Lu, Chen Wenli, Liang Wan, Wei Feng, Hai Hu, Qing Guo • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Truthfulness	TruthfulQA	Truthfulness Accuracy	97.55	86
Toxicity Evaluation	BoLD	Toxic Rate	0.00	26
Logical Coherence	CIAR	Accuracy	96	12
Safety Evaluation	HONEST	Score	0.00	12
Analytical and personal anecdote writing	User study n=45	Preference Rate (Critical Thinking)	41.7	3

