How to Steal Reasoning Without Reasoning Traces

About

Many large language models (LLMs) use reasoning to generate responses but do not reveal their full reasoning traces (a.k.a. chains of thought), instead outputting only final answers and brief reasoning summaries. To demonstrate that hiding reasoning traces does not prevent users from "stealing" a model's reasoning capabilities, we introduce trace inversion models that, given only the inputs, answers, and (optionally) reasoning summaries exposed by a target model, generate detailed, synthetic reasoning traces. We show that (1) traces synthesized by trace inversion have high overlap with the ground-truth reasoning traces (when available), and (2) fine-tuning student models on inverted traces substantially improves their reasoning and enables distillation from proprietary, black-box LLMs.
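The data flow described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `TargetOutput` schema, the prompt layout, and the helper names `build_inversion_prompt` and `build_student_example` are all hypothetical, and the inverted trace is a hand-written placeholder standing in for the output of a trace inversion model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetOutput:
    """What a black-box target model exposes for one query (hypothetical schema)."""
    prompt: str
    answer: str
    summary: Optional[str] = None  # reasoning summary, if one is exposed


def build_inversion_prompt(obs: TargetOutput) -> str:
    """Format the observable fields as input for a trace inversion model.

    The layout is an assumption made for illustration only.
    """
    parts = [f"Problem:\n{obs.prompt}", f"Final answer:\n{obs.answer}"]
    if obs.summary:
        # The summary is optional, so it is included only when present.
        parts.append(f"Reasoning summary:\n{obs.summary}")
    parts.append("Detailed reasoning trace:")
    return "\n\n".join(parts)


def build_student_example(obs: TargetOutput, inverted_trace: str) -> dict:
    """Pair the original input with a synthetic trace and the exposed answer."""
    return {
        "input": obs.prompt,
        "target": f"{inverted_trace}\n\nAnswer: {obs.answer}",
    }


obs = TargetOutput(
    prompt="What is 12 * 13?",
    answer="156",
    summary="Split 13 into 10 + 3.",
)
inversion_prompt = build_inversion_prompt(obs)

# Placeholder text standing in for the inversion model's generated trace.
placeholder_trace = "12 * 10 = 120 and 12 * 3 = 36, so 120 + 36 = 156."
student_example = build_student_example(obs, placeholder_trace)
```

In this sketch the student never sees the target model's hidden trace: its fine-tuning target is built only from the synthetic trace and the answer the target model exposed.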

Tingwei Zhang, John X. Morris, Vitaly Shmatikov • 2026

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Mathematical Reasoning	AIME 2024	Accuracy	11.1	525
Mathematical Reasoning	MATH 500	Accuracy	77.6	442
Mathematical Reasoning	MATH 500	Accuracy	72	183
Code Reasoning	LiveCodeBench	Accuracy	28.9	102
Math Reasoning	JEEBench	Accuracy	44.9	82
Mathematical Reasoning	JEE Math	JEE Math Reasoning Score (s)	23.9	13
Downstream Accuracy	MATH500	Accuracy	71.8	12
Downstream Accuracy	JEEBench	Accuracy	36.3	12
Downstream Accuracy	LiveCodeBench	Accuracy	30.9	12
Coding	LiveCodeBench	Accuracy	29.1	10
