
Zh.QA

Benchmarks

Task Name                      Dataset Name   SOTA Result                     Trend
Multi-Hop QA                   Zh.QA          F1 Score: 48.2                  8
Multi-Hop Question Answering   Zh.QA          Helmet Correctness Score: 1.1   8