HQ2A

Benchmarks

Task Name	Dataset Name	SOTA Result	Trend
Long-form Question Answering	HQ2A	Comprehensiveness: 100		3
Sentence-level Error Detection	HQ2A 1.0 (test)	Exact Accuracy: 25.49		1
