
Autofocus Retrieval: An Effective Pipeline for Multi-Hop Question Answering With Semi-Structured Knowledge

About

In many real-world settings, machine learning models and interactive systems have access to both structured knowledge, e.g., knowledge graphs or tables, and unstructured content, e.g., natural language documents. Yet, most rely on only one of the two. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data. In this work, we present Autofocus-Retriever (AF-Retriever), a modular framework for SKB-based, multi-hop question answering. It combines structural and textual retrieval through novel integration steps and optimizations, achieving the best zero- and one-shot results across all three STaRK QA benchmarks, which span diverse domains and evaluation metrics. AF-Retriever's average first-hit rate surpasses the second-best method by 32.1%. Its performance is driven by (1) leveraging exchangeable large language models (LLMs) to extract entity attributes and relational constraints, used both for parsing the query and for reranking the top-k answers, (2) vector similarity search for ranking both extracted entities and final answers, (3) a novel incremental scope expansion procedure that prepares the reranking by collecting a configurable number of suitable candidates that best satisfy the given constraints, and (4) a hybrid retrieval strategy that reduces error susceptibility. In summary, while continually adjusting its focus like an optical autofocus, AF-Retriever delivers a configurable number of answer candidates in four constraint-driven retrieval steps, which are then supplemented and ranked through four additional processing steps. An ablation study and a detailed error analysis, including a comparison of three different LLM reranking strategies, provide component-level insights. The source code is available at https://github.com/kramerlab/AF-Retriever.
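To make the four constraint-driven steps concrete, here is a minimal, self-contained sketch of an AF-Retriever-style loop on a toy SKB. All names, the toy data, and the hard-coded constraints (standing in for the LLM-based query parsing) are illustrative assumptions, not the authors' implementation; the final LLM reranking step is only indicated by a comment.

```python
# Hedged sketch of an AF-Retriever-style pipeline on a toy SKB.
# Steps mirrored here (all data and names are hypothetical):
#   1) parse the query into structured constraints (hard-coded below
#      in place of an LLM extraction call),
#   2) rank KB nodes by vector similarity to the query embedding,
#   3) incrementally expand the candidate scope (top-k) until enough
#      candidates satisfy all constraints,
#   4) hand the surviving candidates to a reranker.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy SKB: each node links an embedding (unstructured side)
# to structured attributes (structured side).
KB = {
    "paper_a": {"emb": [0.9, 0.1], "venue": "ICML", "year": 2024},
    "paper_b": {"emb": [0.8, 0.3], "venue": "NeurIPS", "year": 2024},
    "paper_c": {"emb": [0.2, 0.9], "venue": "ICML", "year": 2023},
    "paper_d": {"emb": [0.7, 0.4], "venue": "ICML", "year": 2024},
}

def retrieve(query_emb, constraints, min_candidates=2, step=2):
    """Incremental scope expansion: widen the top-k window until at
    least `min_candidates` nodes satisfy all structured constraints."""
    ranked = sorted(KB, key=lambda n: cosine(query_emb, KB[n]["emb"]),
                    reverse=True)
    k = step
    while k <= len(ranked):
        scope = ranked[:k]
        hits = [n for n in scope
                if all(KB[n].get(a) == v for a, v in constraints.items())]
        if len(hits) >= min_candidates:
            return hits  # ordered by similarity; an LLM reranker would go here
        k += step
    # Fall back to all constraint-satisfying nodes if the scope is exhausted.
    return [n for n in ranked
            if all(KB[n].get(a) == v for a, v in constraints.items())]

# Constraints stand in for LLM query parsing of, e.g.,
# "Which 2024 ICML papers discuss ...?"
answers = retrieve([1.0, 0.2], {"venue": "ICML", "year": 2024})
print(answers)  # candidates passed on to reranking
```

In this sketch the scope grows in fixed increments of `step`; the configurable `min_candidates` plays the role of the "configurable number of answer candidates" mentioned above.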

Derian Boer, Stephen Roth, Stefan Kramer • 2025

Related benchmarks

Task | Dataset | Result | Rank
Knowledge Graph Retrieval | STaRK-Amazon Synthetic 1.0 | Hits@1: 64 | 20
Knowledge Graph Retrieval | STaRK-MAG Synthetic 1.0 | Hits@1: 74.1 | 20
Knowledge Graph Retrieval | STaRK-Prime Synthetic 1.0 | Hits@1: 0.464 | 20
Retrieval | STaRK MAG Synthetic | Recall@20: 78.8 | 20
Retrieval | STaRK PRIME Synthetic | Recall@20: 65.5 | 20
Retrieval | STaRK AMAZON Synthetic | Recall@20: 56 | 20
