Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization

About

Long-thought reasoning models have recently achieved strong performance on complex reasoning tasks, but they often incur substantial inference overhead, making efficiency a critical concern. Our empirical analysis reveals that the benefit of using Long-CoT varies across problems: while some problems require elaborate reasoning, others show no improvement, or even degraded accuracy. This motivates adaptive reasoning strategies that tailor reasoning depth to the input. However, prior work primarily reduces redundancy within long reasoning paths, limiting exploration of more efficient strategies beyond the Long-CoT paradigm. To address this, we propose a novel two-stage framework for adaptive and efficient reasoning. First, we construct a hybrid reasoning model by merging long and short CoT models to enable diverse reasoning styles. Second, we apply bi-level preference training to guide the model to select suitable reasoning styles (group-level), and to prefer concise and correct reasoning within each style group (instance-level). Experiments demonstrate that our method (Ada-R1) significantly reduces inference costs compared to other baseline approaches, while maintaining performance. Notably, on five mathematical datasets, the average length of reasoning is reduced by more than 50%, highlighting the potential of adaptive strategies to optimize reasoning efficiency in large language models. Our code is coming soon at https://github.com/StarDewXXX/AdaR1
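
The bi-level idea described above can be illustrated with a small sketch of how preference pairs might be assembled from sampled responses. This is not the authors' implementation: the `Sample` class, the `"long"`/`"short"` labels, the `margin` parameter, the accuracy-based rule for picking a style, and the use of character length as a proxy for reasoning length are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    """One sampled response (hypothetical container, not from the paper)."""
    text: str      # reasoning trace plus final answer
    style: str     # "long" or "short" (assumed labels)
    correct: bool  # whether the final answer matched the reference


def accuracy(samples: List[Sample]) -> float:
    """Fraction of correct samples; 0.0 for an empty group."""
    return sum(s.correct for s in samples) / len(samples) if samples else 0.0


def preferred_style(long_s: List[Sample], short_s: List[Sample],
                    margin: float = 0.0) -> str:
    """Group-level choice (assumed rule): prefer long reasoning only when
    its accuracy beats short reasoning by more than `margin`."""
    return "long" if accuracy(long_s) - accuracy(short_s) > margin else "short"


def instance_level_pair(group: List[Sample]) -> Optional[Tuple[Sample, Sample]]:
    """Instance-level choice (assumed rule): within one style group, the
    chosen response is the shortest correct one and the rejected response
    is the longest one. Returns None if no usable pair exists."""
    correct = [s for s in group if s.correct]
    if not correct:
        return None
    chosen = min(correct, key=lambda s: len(s.text))
    rejected = max(group, key=lambda s: len(s.text))
    return None if chosen is rejected else (chosen, rejected)


# Toy data for one problem: long reasoning is more accurate here.
long_samples = [Sample("x" * 100, "long", True), Sample("x" * 300, "long", True)]
short_samples = [Sample("x" * 10, "short", True), Sample("x" * 20, "short", False)]

style = preferred_style(long_samples, short_samples)
pair = instance_level_pair(long_samples if style == "long" else short_samples)
```

In this toy case the long group is preferred because it is more accurate, and the instance-level pair favors the shorter of the two correct long responses. Pairs built this way could then be fed to any preference-optimization objective; which objective the paper uses is not stated in the abstract.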

Haotian Luo, Haiying He, Yibo Wang, Jinluan Yang, Rui Liu, Naiqiang Tan, Xiaochun Cao, Dacheng Tao, Li Shen • 2025

Related benchmarks

Task                      | Dataset           | Metric   | Result | Rank
--------------------------+-------------------+----------+--------+-----
Mathematical Reasoning    | MATH500 (test)    | Accuracy | 96.4   | 381
Mathematical Reasoning    | GSM8K             | Accuracy | 85.1   | 351
Scientific Reasoning      | GPQA              | Accuracy | 20     | 50
Mathematical Reasoning    | Olympiad Bench    | Accuracy | 40.9   | 23
Mathematical Reasoning    | Minerva Math      | Accuracy | 23.5   | 14
General Reasoning Summary | Aggregate (GSM8K, MATH500, Minerva Math, Olympiad Bench, AIME24, AIME25, GPQA) | Accuracy | 75.5 | 11
Mathematical Reasoning    | AIME25            | Accuracy | 68.9   | 11
Mathematical Reasoning    | GSM8K (test)      | Accuracy | 95.3   | 11
Mathematical Reasoning    | AIME24            | Accuracy | 16.7   | 11
Mathematical Reasoning    | AIME25            | Accuracy | 16.7   | 11
Showing 10 of 12 rows
