Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

OverThink: Slowdown Attacks on Reasoning LLMs

About

Most flagship language models generate explicit reasoning chains, enabling inference-time scaling. However, producing these reasoning chains increases token usage (i.e., reasoning tokens), which in turn increases latency and costs. Our OverThink attack increases overhead for applications that rely on reasoning language models (RLMs) and external context by forcing them to spend substantially more reasoning tokens while still producing contextually correct answers. An adversary mounts an attack by injecting decoy reasoning problems into public content that is consumed by RLM at inference time. Because our decoys (e.g., Markov decision processes, Sudokus, etc.) are benign, they evade safety filters. We evaluate OverThink on both closed-source and open-source reasoning models across the FreshQA, SQuAD, and MuSR datasets. We also explore the attack in multi-modal settings by creating images that cause excessive reasoning. We show that the resulting slowdown transfers across models. Finally, we explore both LLM-based and systems-level defenses, and discuss the societal, financial, and energy implications of the OverThink attacks.

Abhinav Kumar, Jaechul Roh, Ali Naseh, Marzena Karpinska, Mohit Iyyer, Amir Houmansadr, Eugene Bagdasarian• 2025

Related benchmarks

TaskDatasetResultRank
Reasoning length evaluation20 attack prompts
Avg Length3.94e+3
48
Reasoning Token InductionMixed Prompts (SimpleQA, SimpleBench, AIME2024, etc.) (test)
Mean Completion Tokens7.93e+3
31
Throughput EfficiencyToolBench
Throughput (tokens/s)4.55e+3
18
Throughput EfficiencyBFCL
Throughput4.56e+3
18
Function CallingBFCL
Energy (Wh)7.6
18
Tool UseToolBench
Energy (Wh)11.9
18
Showing 6 of 6 rows

Other info

Follow for update