General Question Answering

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Count	Updated
TriviaQA	Agentic-R	Exact Match	69.02	54	3mo ago
NQ	SkillOrchestra+	Exact Match (EM)	54.8	52	3mo ago
PopQA	InForage-instruct	EM	45.2	51	3mo ago
General QA NQ, TriviaQA, PopQA	Qwen2.5-32B-Instruct + Proposed Method	NQ Accuracy	58.2	40	2mo ago
TriviaQA out-of-domain (val test)	Search-R2	EM	70.9	33	1mo ago
NQ (Natural Questions)	Search-R1	EM	46.1	32	3mo ago
NQ (Natural Questions) in-domain (val/test)	Search-R2	Exact Match	50.9	30	1mo ago
PopQA (test)	Workflow-R1-Search	EM	49.3	27	1mo ago
TriviaQA (test)	Workflow-R1-Search	EM	73.3	27	1mo ago
TriviaQA (test val)	DeepControl	EM	68.2	24	4mo ago
Natural Questions (NQ) (test val)	DeepControl	EM	55.8	24	4mo ago
PopQA	DAC	EM	47.9	20	1mo ago
TriviaQA	DAC	Exact Match (EM)	66.4	20	1mo ago
NQ (Natural Questions)	DAC	Exact Match (EM)	49.2	20	1mo ago
TriviaQA		Token-Level F1	76.4	20	3mo ago
PopQA	SkillOrchestra+	Accuracy	48.8	18	4mo ago
TriviaQA	SkillOrchestra+	Accuracy	80.2	18	4mo ago
TriviaQA	Search-R1	EM	64.4	18	4mo ago
HpQA	ReCal	F1 Score	53.4	17	1mo ago
PopQA	ReCal	F1 Score	50.5	17	1mo ago
TriviaQA	ReCal	F1 Score	78.4	17	1mo ago
NQ	ReCal	F1 Score	50.8	17	1mo ago
Natural Questions (NQ) v1.0 (test)	ReCal	Exact Match	40.4	17	1mo ago
TriviaQA	π-Play	Score	64.6	16	3mo ago
PopQA OOD	Search-R1	Exact Match Accuracy	45.7	15	1mo ago

Showing 25 of 43 rows
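
Most rows above report Exact Match (EM) or a token-level F1. The table does not say how each entry computes these scores, so the following is only a minimal sketch of the commonly used SQuAD-style convention: answers are lowercased, stripped of punctuation and English articles, and then compared either as whole strings (EM) or as bags of tokens (F1). The function names `normalize_answer`, `exact_match`, and `token_f1` are hypothetical and chosen for illustration; individual methods in the table may normalize or aggregate differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> float:
    """Return 1.0 if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(gold) for gold in gold_answers))


def token_f1(prediction: str, gold_answers: list[str]) -> float:
    """Return the best token-overlap F1 between the prediction and any gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    best = 0.0
    for gold in gold_answers:
        gold_tokens = normalize_answer(gold).split()
        # Multiset intersection counts each shared token at most as often
        # as it appears in both the prediction and the gold answer.
        overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
        if overlap == 0:
            continue
        precision = overlap / len(pred_tokens)
        recall = overlap / len(gold_tokens)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower!", ["Eiffel Tower"]))
    print(token_f1("Eiffel Tower in Paris", ["Eiffel Tower"]))
```

Under this convention a dataset-level score is the mean of the per-question values; the figures in the table appear to be such means expressed as percentages, though that is an assumption rather than something the table states.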