Robotic Planning (Seen Tasks)
[Figure: Classic Success Rate chart. Highlighted point: SCT-4B, 60 (Sep 25, 2025). Selectable metrics: Classic Success Rate, Hard Success Rate, Align Success Rate.]
Evaluation Results
| Method | Parameters | Date | Classic Success Rate | Hard Success Rate | Align Success Rate |
|---|---|---|---|---|---|
| SCT-4B | 4B | 2025.09 | 60 | 45 | 75 |
| Qwen3-8B | 8B | 2025.09 | 48 | 28 | 69 |
| Qwen3-4B | 4B | 2025.09 | 41 | 24 | 42 |
| GPT-4o | - | 2025.09 | 31 | 17 | 54 |
| Mistral-24B | 24B | 2025.09 | 21 | 11 | 71 |
| Gemma3-12B | 12B | 2025.09 | 9 | 8 | 14 |
| Ministral-8B | 8B | 2025.09 | 3 | 2 | 5 |
| Gemma3-4B | 4B | 2025.09 | 1 | 1 | 1 |