
The $\mathbf{Y}$-Combinator for LLMs: Solving Long-Context Rot with $\lambda$-Calculus

About

LLMs are increasingly used as general-purpose reasoners, but long inputs remain bottlenecked by a fixed context window. Recursive Language Models (RLMs) address this by externalising the prompt and recursively solving subproblems. Yet existing RLMs depend on an open-ended read-eval-print loop (REPL) in which the model generates arbitrary control code, making execution difficult to verify, predict, and analyse. We introduce $\lambda$-RLM, a framework for long-context reasoning that replaces free-form recursive code generation with a typed functional runtime grounded in $\lambda$-calculus. It executes a compact library of pre-verified combinators and uses neural inference only on bounded leaf subproblems, turning recursive reasoning into a structured functional program with explicit control flow. We show that $\lambda$-RLM admits formal guarantees absent from standard RLMs, including termination, closed-form cost bounds, controlled accuracy scaling with recursion depth, and an optimal partition rule under a simple cost model. Empirically, across four long-context reasoning tasks and nine base models, $\lambda$-RLM outperforms standard RLM in 29 of 36 model-task comparisons, improves average accuracy by up to 21.9 points across model tiers, and reduces latency by up to 4.1x. These results show that typed symbolic control yields a more reliable and efficient foundation for long-context reasoning than open-ended recursive code generation. The complete implementation of $\lambda$-RLM is open-sourced at: https://github.com/lambda-calculus-LLM/lambda-RLM.
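To make the idea of fixed combinators with neural calls only at bounded leaves more concrete, here is a minimal sketch. It is not the authors' implementation or API: the names `split`, `solve`, `leaf`, `combine`, `budget`, and `k` are all hypothetical, the leaf is a deterministic stand-in for a model call, and the equal-size partition is an illustrative choice rather than the paper's optimal partition rule.

```python
from typing import Callable, List

# Hypothetical types: a leaf solver stands in for one bounded LLM call,
# and a combiner merges the answers of the subproblems.
Leaf = Callable[[str], str]
Combine = Callable[[List[str]], str]


def split(text: str, k: int) -> List[str]:
    """Partition text into at most k contiguous chunks of near-equal length."""
    size = -(-len(text) // k)  # ceiling division
    return [text[i:i + size] for i in range(0, len(text), size)]


def solve(text: str, leaf: Leaf, combine: Combine, budget: int, k: int = 2) -> str:
    """Fixed control flow: call the leaf if the input fits, else split, recurse, combine.

    Assumes budget >= 1 and k >= 2. Under those assumptions every chunk is
    strictly shorter than its parent, so the recursion terminates.
    """
    if budget < 1 or k < 2:
        raise ValueError("need budget >= 1 and k >= 2")
    if len(text) <= budget:
        return leaf(text)  # the only place a model would be invoked
    parts = split(text, k)
    return combine([solve(p, leaf, combine, budget, k) for p in parts])


if __name__ == "__main__":
    # Toy stand-ins: count the letter "a" per chunk, then sum the counts.
    count_a: Leaf = lambda s: str(s.count("a"))
    total: Combine = lambda xs: str(sum(int(x) for x in xs))
    print(solve("abca" * 100, count_a, total, budget=16))
```

In this sketch the control flow never depends on model output, which is the property that makes termination and call counts easy to reason about; a real system would replace `count_a` with an actual bounded model call.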

Amartya Roy, Rasul Tutunov, Xiaotong Ji, Matthieu Zimmer, Haitham Bou-Ammar • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Long-context Reasoning | OOLONG | Accuracy | 68.4 | 37 |
| Long-context reasoning (Pairs) | OOL-Pairs | Accuracy | 64.3 | 27 |
| Semantic Needle-In-A-Haystack | S-NIAH | Accuracy | 51.3 | 27 |
| Coding Question Answering | CodeQA | Accuracy | 55.7 | 27 |
| Code Question Answering | CodeQA | Latency (s) | 42.1 | 27 |
| Long-context retrieval | S-NIAH | Latency (s) | 28.1 | 27 |
| Long-context Reasoning | OOLONG | Latency (s) | 38.5 | 27 |
| Long-context Reasoning | OOL-Pairs | Latency (s) | 30.8 | 27 |
