
IQA-EVAL

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
--- | --- | --- | ---
Interactive Question Answering | IQA-EVAL MMLU-derived (TextDavinci) | Helpfulness: 4.6 | 8
Interactive Question Answering | IQA-EVAL MMLU-derived (TextBabbage) | Helpfulness: 3.87 | 4