
Trustworthiness

Benchmarks

Task Name                   Dataset Name                                SOTA Result             Trend
Text Classification         Trustworthiness Plain-text OOD              Accuracy: 68.44         4
Trustworthiness Evaluation  Trustworthiness Average (human evaluation)  Control Win Rate: 0.88  2
Text Classification         Trustworthiness Plain-text OOD (test)       Accuracy: -             0