
Behavior

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Simulation Task Planning | BEHAVIOR-1K 15 tasks | BT Valid: 100 | 14 |
| Progress Estimation | Behavior | MRA: 87.08 | 12 |
| Long-Horizon Household Tasks | BEHAVIOR-1K | Fitting: 44.7 | 12 |
| Embodied AI Planning | BEHAVIOR-1K | Success Rate: 100 | 11 |
| Visual Planning | MINIBEHAVIOR | EM: 7,580 | 8 |
| Autonomous Driving | Behavior Shifted Environment (test) | Testing Reward: 1.02 | 8 |
| Robot Learning | BEHAVIOR 2025 (private) | Binary Success: 12.4 | 5 |
| Robot Learning | BEHAVIOR 2025 (public) | Binary Success: 14.4 | 5 |
| Household Planning | Behavior-1K | Success Rate: 84.4 | 5 |
| collecting_childrens_toys | BEHAVIOR-1K | Q-Score: 0.56 | 4 |
| Pick up Soda Can | BEHAVIOR | Navigational Success Rate: 84 | 3 |
| Pick up Radio | BEHAVIOR | Navigation Success Rate: 88 | 3 |
| ADS Testing | Behavior | Execution Time (s): 43.66 | 3 |
| Motion Planning | BEHAVIOR Franka MM (test) | Motion Completion Time (sec): 5.03 | 3 |
| Motion Planning | BEHAVIOR HSR 1488 samples (test) | Motion Completion Time (sec): 5.01 | 3 |
| loading_the_car | BEHAVIOR-1K | Q-Score: 0.3 | 2 |
| moving_boxes_to_storage | BEHAVIOR-1K | Q-Score: 0.8 | 2 |
| set_up_a_coffee_station_in_your_kitchen | BEHAVIOR-1K | Q-Score: 0.2 | 2 |
| carrying_in_groceries | BEHAVIOR-1K | Q-Score: 0 | 2 |
| storing_food | BEHAVIOR-1K | Q-Score: 0 | 2 |
| putting_shoes_on_rack | BEHAVIOR-1K | Q-Score: 0.49 | 2 |
| hanging_pictures | BEHAVIOR-1K | Q-Score: 0 | 2 |
| turning_on_radio | BEHAVIOR-1K | Q-Score: 0 | 2 |
| wash_dog_toys | BEHAVIOR-1K | Q-Score: 0 | 2 |
| make_pizza | BEHAVIOR-1K | Q-Score: 0 | 2 |
Showing 25 of 62 rows