BeerQA

Benchmarks

Task Name                   Dataset Name  SOTA Result     Trend
Question Answering          BeerQA        Accuracy 44.27  14
Expected Calibration Error  BeerQA        ECE 23.28       10