
Autofocus Retrieval: An Effective Pipeline for Multi-Hop Question Answering With Semi-Structured Knowledge

About

In many real-world settings, machine learning models and interactive systems have access to both structured knowledge, e.g., knowledge graphs or tables, and unstructured content, e.g., natural language documents. Yet, most rely on only one of the two. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data. In this work, we present Autofocus-Retriever (AF-Retriever), a modular framework for SKB-based, multi-hop question answering. It combines structural and textual retrieval through novel integration steps and optimizations, achieving the best zero- and one-shot results across all three STaRK QA benchmarks, which span diverse domains and evaluation metrics. AF-Retriever's average first-hit rate surpasses the second-best method by 32.1%. Its performance is driven by (1) leveraging exchangeable large language models (LLMs) to extract entity attributes and relational constraints, used both for parsing the query and for reranking the top-k answers, (2) vector similarity search for ranking both extracted entities and final answers, (3) a novel incremental scope expansion procedure that prepares the reranking by collecting a configurable number of suitable candidates that best satisfy the given constraints, and (4) a hybrid retrieval strategy that reduces error susceptibility. In summary, while continually adjusting its focus like an optical autofocus, AF-Retriever delivers a configurable number of answer candidates in four constraint-driven retrieval steps, which are then supplemented and ranked through four additional processing steps. An ablation study and a detailed error analysis, including a comparison of three different LLM reranking strategies, provide component-level insights. The source code is available at https://github.com/kramerlab/AF-Retriever.
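To make the four constraint-driven steps concrete, here is a minimal, self-contained sketch of an AF-Retriever-style loop on a toy SKB. All names, the toy data, and the hard-coded constraints (standing in for the LLM-based query parsing) are illustrative assumptions, not the authors' implementation; the final LLM reranking step is only indicated by a comment.

```python
# Hedged sketch of an AF-Retriever-style pipeline on a toy SKB.
# Steps mirrored here (all data and names are hypothetical):
#   1) parse the query into structured constraints (hard-coded below
#      in place of an LLM extraction call),
#   2) rank KB nodes by vector similarity to the query embedding,
#   3) incrementally expand the candidate scope (top-k) until enough
#      candidates satisfy all constraints,
#   4) hand the surviving candidates to a reranker.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy SKB: each node links an embedding (unstructured side)
# to structured attributes (structured side).
KB = {
    "paper_a": {"emb": [0.9, 0.1], "venue": "ICML", "year": 2024},
    "paper_b": {"emb": [0.8, 0.3], "venue": "NeurIPS", "year": 2024},
    "paper_c": {"emb": [0.2, 0.9], "venue": "ICML", "year": 2023},
    "paper_d": {"emb": [0.7, 0.4], "venue": "ICML", "year": 2024},
}

def retrieve(query_emb, constraints, min_candidates=2, step=2):
    """Incremental scope expansion: widen the top-k window until at
    least `min_candidates` nodes satisfy all structured constraints."""
    ranked = sorted(KB, key=lambda n: cosine(query_emb, KB[n]["emb"]),
                    reverse=True)
    k = step
    while k <= len(ranked):
        scope = ranked[:k]
        hits = [n for n in scope
                if all(KB[n].get(a) == v for a, v in constraints.items())]
        if len(hits) >= min_candidates:
            return hits  # ordered by similarity; an LLM reranker would go here
        k += step
    # Fall back to all constraint-satisfying nodes if the scope is exhausted.
    return [n for n in ranked
            if all(KB[n].get(a) == v for a, v in constraints.items())]

# Constraints stand in for LLM query parsing of, e.g.,
# "Which 2024 ICML papers discuss ...?"
answers = retrieve([1.0, 0.2], {"venue": "ICML", "year": 2024})
print(answers)  # candidates passed on to reranking
```

In this sketch the scope grows in fixed increments of `step`; the configurable `min_candidates` plays the role of the "configurable number of answer candidates" mentioned above.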

Derian Boer, Stephen Roth, Stefan Kramer • 2025

Related benchmarks

Task | Dataset | Result | Rank
Knowledge Graph Retrieval | STaRK-Amazon Synthetic 1.0 | Hits@1: 64 | 20
Knowledge Graph Retrieval | STaRK-MAG Synthetic 1.0 | Hits@1: 74.1 | 20
Knowledge Graph Retrieval | STaRK-Prime Synthetic 1.0 | Hits@1: 0.464 | 20
Retrieval | STaRK MAG Synthetic | Recall@20: 78.8 | 20
Retrieval | STaRK PRIME Synthetic | Recall@20: 65.5 | 20
Retrieval | STaRK AMAZON Synthetic | Recall@20: 56 | 20
