Robotic Planning on LEMMA Single-Agent Underspecified
[Chart: Success Rate (SR). Top entry: SG-CoT, 99, Mar 18, 2026. Metrics: Success Rate (SR), Completion Quality Ratio (CQR).]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Success Rate (SR) | Completion Quality Ratio (CQR) |
|---|---|---|---|---|
| SG-CoT | LLM=Gemini-2.5-Flash | 2026.03 | 99 | 98 |
| InnerMono | LLM=Gemini-2.5-Flash | 2026.03 | 95 | 93 |
| SG-CoT | LLM=Gemini-2.5-Flash,... | 2026.03 | 87 | 54 |
| SG-CoT | LLM=Gemini-2.5-Flash,... | 2026.03 | 82 | 75 |
| ProgPrompt | LLM=Gemini-2.5-Flash | 2026.03 | 79 | 77 |
| SG-CoT | LLM=Qwen3-VL-2B | 2026.03 | 77 | 67 |
| InnerMono | LLM=Qwen3-VL-2B | 2026.03 | 75 | 62 |
| SG-CoT | LLM=Qwen3-VL-2B, Scene... | 2026.03 | 75 | 52 |
| CLARA | LLM=Gemini-2.5-Flash | 2026.03 | 70 | 60 |
| SG-CoT | LLM=Qwen3-VL-2B, Itera... | 2026.03 | 69 | 62 |
| SG-CoT | LLM=Gemini-2.5-Flash,... | 2026.03 | 68 | 52 |
| CLARA | LLM=Qwen3-VL-2B | 2026.03 | 63 | 52 |
| ProgPrompt | LLM=Qwen3-VL-2B | 2026.03 | 58 | 47 |
| SG-CoT | LLM=Qwen3-VL-2B, Itera... | 2026.03 | 44 | 28 |