
BEAR

Benchmarks

Task Name                      Dataset Name     SOTA Result                        Trend
Calibration                    BEAR (test)      Brier Score 0.083                  96
Multimodal Embodied Reasoning  BEAR 1.0 (test)  Task Process Reasoning (PRG) 87.5  17
Embodied Skill Evaluation      BEAR             Task Plan Success (PRG) 87.5       12