
BIOS

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Text Classification | BIOS | Task Accuracy: 84.6 | 32 |
| Factuality | BIOS | Factuality: 56 | 28 |
| Long-form generation factuality and uncertainty estimation | Bios (test) | FA: 71.4 | 14 |
| Factual Precision Evaluation | Bios | FACTSCORE: 83 | 10 |
| Classification | Bios (test) | Accuracy: 80.1 | 7 |
| Attribute-conditional generation | BIOS | Control Accuracy: 99.2 | 5 |