ConFiQA

Benchmarks

Task Name	Dataset Name	SOTA Result
Multiple Choice Question Answering	ConFiQA MC	F1 Score 91.2	42
Open-ended Question Answering	ConFiQA (test)	F1 Score 95.7	36
Multi-step Reasoning Question Answering	ConFiQA MR (test)	F1 Score 91.3	36
Open-book generation under knowledge conflict	ConFiQA 1,500 subset	Ps Score 81.07	32
Context-faithful Question Answering	ConFiQA	MR 13.21	24
Faithfulness	ConFiQA	In-Acc 81.4	21
Retrieval Following	ConFiQA MC 1.0 (test)	Pc 54.9	20
Retrieval Following	ConFiQA MR 1.0 (test)	Pc 61.2	20
Retrieval Following	ConFiQA QA 1.0 (test)	Pc 92.3	20
Question Answering	ConFiQA	Exact Match (EM) 91.5	18
Open-book generation under knowledge conflict	ConFiQA MR 1,500	Ps Score 59.8	16
Hallucination Mitigation	ConFiQA	Faith 99	12
Question Answering	ConFiQA (out-of-domain)	Hit 84.63	12
Question Answering	ConFiQA-MC (Held-out)	Accuracy 52.8	8
Question Answering	ConFiQA-QA (Held-out)	Accuracy 67	8
Context-faithful Reasoning	ConFiQA MC	Pc 38.8	8
Context-faithful Multi-hop Reasoning	ConFiQA MR	Pc 45.4	8
Question Answering	ConFiQA-QA counterfactual contexts	Accuracy 81.2	7
Question Answering	ConFiQA MR	F1 Score 89.6	6
Multiple Choice	ConFiQA MC	Ps Score 53.4	4
Machine Reading	ConFiQA MR	Ps Score 54.47	4
Question Answering	ConFiQA QA	Ps 74.73	4
