
FACTUAL

Benchmarks

Task Name                   Dataset Name                     SOTA Result         Trend
Factual Question Answering  Factual Category Average (test)  Accuracy: 31.38     18
Scene Graph Parsing         FACTUAL (test)                   Completeness: 0.92  5