
Many-Shot CoT-ICL: Making In-Context Learning Truly Learn

About

While many-shot ICL achieves remarkable performance, prior studies of its scaling behavior have mainly focused on non-reasoning tasks. In this work, we study many-shot ICL on reasoning tasks, with a particular focus on many-shot chain-of-thought in-context learning (CoT-ICL). Analyzing across non-reasoning and reasoning tasks and across non-reasoning and reasoning-oriented LLMs, we identify several distinctive properties of many-shot CoT-ICL. We further interpret these findings by viewing many-shot CoT-ICL as in-context test-time learning rather than scaled pattern matching, and suggest two principles: (i) demonstrations should be easy for the target model to understand, and (ii) they should be ordered to support a smooth conceptual progression. Guided by these principles, we propose Curvilinear Demonstration Selection (CDS), a simple ordering method that yields up to a 5.42 percentage-point gain on a math task with 64 demonstrations. Overall, our results reframe the long context window from a retrieval buffer into a structured curriculum for in-context test-time learning.
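The abstract does not spell out how CDS orders demonstrations, so the sketch below is not CDS. It is only a minimal illustration of principle (ii), ordering demonstrations so that each one stays close to the previous one. It assumes every demonstration comes with a hypothetical `difficulty` score and a hypothetical `embedding` vector; the names `Demo`, `order_smoothly`, and `build_prompt` are invented for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Demo:
    text: str                      # question plus its chain-of-thought answer
    difficulty: float              # hypothetical score (assumption, lower = easier)
    embedding: Tuple[float, ...]   # hypothetical topic vector (assumption)


def order_smoothly(demos: List[Demo]) -> List[Demo]:
    """Greedy ordering: start at the easiest demo, then repeatedly append
    the remaining demo closest to the last one (ties go to the easier demo)."""
    if not demos:
        return []
    remaining = sorted(demos, key=lambda d: d.difficulty)
    ordered = [remaining.pop(0)]
    while remaining:
        last = ordered[-1]
        nxt = min(
            remaining,
            key=lambda d: (math.dist(last.embedding, d.embedding), d.difficulty),
        )
        remaining.remove(nxt)
        ordered.append(nxt)
    return ordered


def build_prompt(demos: List[Demo], query: str) -> str:
    """Concatenate the ordered demonstrations, then the test query."""
    parts = [d.text for d in order_smoothly(demos)]
    parts.append(query)
    return "\n\n".join(parts)
```

Calling `build_prompt(demos, query)` with a list of `Demo` objects returns a single many-shot prompt string. A greedy nearest-neighbor walk is just one simple way to approximate a smooth progression; how the scores and embeddings are obtained is left open here.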

Tsz Ting Chung, Lemao Liu, Mo Yu, Dit-Yan Yeung • 2026

Related benchmarks

Task                            Dataset         Result                          Rank
Geometry proof generation       Geometry        Accuracy: 81.21                 24
Logical reasoning               DetectiveQA     Accuracy (DetectiveQA): 88.31   24
Number theory problem solving   Number Theory   Accuracy: 92.59                 24
