
SelfAware

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Factuality | SelfAware | Score: 0.372 | 10 |
| Self-awareness | SelfAware | Accuracy: 51.2 | 10 |
| Question Answering | SelfAware (out-of-domain) | nAUPC: 9.9 | 4 |
| Question Answering | SelfAware | Accuracy: 27 | 1 |