
COSMOS

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Next-Step-Prediction Style Planning | Cosmos Reason | Performance: 66.82 | 16 |
| Planning | Cosmos | Score: 66.36 | 10 |
| Log-likelihood Estimation | COSMOS 2020 (test) | Mean: 0 | 9 |
| Embodied Reasoning | Cosmos R1 | Score: 81 | 9 |
| Multi-modal Forgery Detection | COSMOS | ACC: 53.78 | 5 |
| Cross-modal consistency verification | COSMOS | Accuracy: 56.85 | 5 |