
HowMany-QA

Benchmarks

Task Name            | Dataset Name           | SOTA Result     | Trend
Open-ended counting  | HowMany-QA 1.0 (test)  | Accuracy: 60.3  | 10
Open-ended counting  | HowMany-QA (test)      | Accuracy: 0.64  | 6