A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces

About

Frontier language models have demonstrated strong reasoning and long-horizon tool-use capabilities. However, existing RAG systems fail to leverage these capabilities. They still rely on two paradigms: (1) designing an algorithm that retrieves passages in a single shot and concatenates them into the model's input, or (2) predefining a workflow and prompting the model to execute it step by step. Neither paradigm allows the model to participate in retrieval decisions, which prevents these systems from scaling efficiently as models improve. In this paper, we introduce A-RAG, an Agentic RAG framework that exposes hierarchical retrieval interfaces directly to the model. A-RAG provides three retrieval tools (keyword search, semantic search, and chunk read), enabling the agent to adaptively search and retrieve information across multiple granularities. Experiments on multiple open-domain QA benchmarks show that A-RAG consistently outperforms existing approaches while retrieving a comparable or smaller number of tokens, demonstrating that A-RAG effectively leverages model capabilities and dynamically adapts to different RAG tasks. We further systematically study how A-RAG scales with model size and test-time compute. To facilitate future research, the code and evaluation suite are available at https://github.com/Ayanami0730/arag.
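The three retrieval tools named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `RetrievalTools`, the method signatures, the snippet length, and the scoring are all assumptions made for illustration, and a bag-of-words cosine similarity stands in for a real embedding model. The point it shows is the hierarchy: the two search tools return only chunk ids with short snippets, and the agent calls `chunk_read` when it decides a chunk is worth reading in full.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class RetrievalTools:
    """Hypothetical three-tool interface; names and scoring are assumptions."""

    SNIPPET_CHARS = 40  # assumed snippet length for search results

    def __init__(self, texts: list[str]) -> None:
        self.texts = texts
        self.vectors = [Counter(tokenize(t)) for t in texts]

    def _rank(self, scores: list[float], top_k: int) -> list[tuple[int, str]]:
        # Keep chunks with a positive score, best first, ties broken by id.
        ranked = sorted(
            (i for i, s in enumerate(scores) if s > 0),
            key=lambda i: (-scores[i], i),
        )
        return [(i, self.texts[i][: self.SNIPPET_CHARS]) for i in ranked[:top_k]]

    def keyword_search(self, keywords: list[str], top_k: int = 3):
        """Rank chunks by exact hits on single-token keywords."""
        wanted = {k.lower() for k in keywords}
        scores = [float(sum(vec[k] for k in wanted)) for vec in self.vectors]
        return self._rank(scores, top_k)

    def semantic_search(self, query: str, top_k: int = 3):
        """Rank chunks by similarity to the query (bag-of-words stand-in)."""
        q = Counter(tokenize(query))
        scores = [cosine(q, vec) for vec in self.vectors]
        return self._rank(scores, top_k)

    def chunk_read(self, chunk_id: int) -> str:
        """Return the full text of one chunk."""
        return self.texts[chunk_id]


# Toy corpus, invented for this example.
tools = RetrievalTools(
    [
        "Marie Curie was born in Warsaw and later moved to Paris.",
        "Warsaw is the capital of Poland.",
        "Paris hosts the Louvre museum.",
    ]
)
keyword_hits = tools.keyword_search(["warsaw"])
semantic_hits = tools.semantic_search("capital of Poland")
full_text = tools.chunk_read(semantic_hits[0][0])
```

In an agentic setup, these three methods would be registered as tools and the model itself would choose which to call and when to stop, rather than following a fixed retrieve-then-read pipeline.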

Mingxuan Du, Benfeng Xu, Chiwei Zhu, Shaohan Wang, Pengyu Wang, Xiaorui Wang, Zhendong Mao • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Long-form Question Answering | GraphRAG-Bench Med | LLM Accuracy | 93.1 | 20 |
| Long-form Question Answering | Novel GraphRAG-Bench | LLM Accuracy | 85.3 | 20 |
| Question Answering | MuSiQue | LLM Accuracy | 74.1 | 20 |
| Question Answering | HotpotQA | LLM Accuracy | 94.5 | 20 |
| Question Answering | 2WikiMultihopQA | LLM Accuracy | 89.7 | 20 |
| Retrieval Efficiency | MuSiQue | Retrieved Tokens | 5.64e+4 | 8 |
| Retrieval Efficiency | HotpotQA | Retrieved Tokens | 2.74e+3 | 8 |
| Retrieval Efficiency | 2WikiMultihopQA | Retrieved Tokens | 4.54e+4 | 8 |
| Retrieval Efficiency | Med | Retrieved Tokens | 2.37e+4 | 8 |
| Retrieval Efficiency | Novel | Retrieved Tokens | 2.24e+4 | 8 |
