VALSE

Benchmarks

Task Name                  | Dataset Name    | SOTA Result         | Trend
Compositional Reasoning    | VALSE           | Average Score: 89.2 | 44
Binary Classification      | VALSE zero-shot | Existence: 66.9     | 22
Visual-Semantic Reasoning  | VALSE           | min(pc, pf): 52.42  | 6