Truthful QA

Benchmarks

Task Name               | Dataset Name | SOTA Result         | Trend
Truthful QA             | Truthful QA  | Accuracy: 68.4      | 83
Question Answering      | Truthful-QA  | Info Accuracy: 99.2 | 27
Hallucination Detection | Truthful-QA  | Accuracy: 74.17     | 17
Question Answering      | Truthful QA  | LIS: 3.1838         | 10