MoreHopQA

Benchmarks

Task Name	Dataset Name	SOTA Result
First-error step detection	MoreHopQA	AUROC 0.7885	27
Step-wise Confidence Attribution	MoreHopQA	AUROC 0.8084	27
Multi-hop Question Answering	MoreHopQA	Accuracy 86.4	25
Multi-hop Retrieval	Morehopqa (test)	Recall 80.5	16
Uncertainty Estimation	MoreHopQA Camel	AUROC 65.29	16
Multi-hop Question Answering	MorehopQA	AUROC 0.6457	16
Uncertainty Estimation	MoreHopQA AutoGen (test)	AUROC 63.92	16
Stepwise error detection	MoreHopQA (test)	AUROC 0.808	15
Open-ended Question Answering	MoreHopQA (test)	Accuracy 77	11
Multi-hop Question Answering	MoreHopQA (test)	Accuracy 53.4	9
Multi-hop QA Retrieval	MoreHopQA	NDCG 0.908	5
Question Answering	MoreHopQA	Inference Time (s) 7.27	4
