ViLP

Benchmarks

Task Name                    | Dataset Name    | SOTA Result    | Trend
Self-evaluation              | ViLP            | AUROC 77       | 36
Multimodal Understanding     | ViLP 80% (test) | Accuracy 58.06 | 10
Language Shortcut Robustness | ViLP            | Accuracy 59.3  | 6