
Task Planning on EB-Habitat Long

Success Rate (SR): 64

GPT-4o

[Chart: Success Rate (SR) by date; date label: Aug 2, 2025]
Updated 24d ago

Evaluation Results

Date      SR
2025.08   64    72.2
2025.08   62    74
2025.08   58    63.3
2025.08   52    61.2
2025.08   46    -
2025.08   40    -
2025.08   30    52.1
2025.08   28    38.2
2025.08   28    35
2025.08   26    36.2
2025.08   22    51
2025.08   20    28.2
2025.08   18    -
2025.08   15    33
2025.08   14    21.3
2025.08   14    24.3
2025.08   12    17.4