Planning Accuracy on Blocksworld (test)
Best result: PT-SFT, 97 Accuracy (May 22, 2026)
Evaluation Results
| Method | Base model | Date | Accuracy |
| PT-SFT | Mistral | 2026.05 | 97 |
| PT-SFT | Qwen2.5 | 2026.05 | 96 |
| PT-SFT | GPT-OSS | 2026.05 | 93 |
| HyperGuide | Qwen2.5 | 2026.05 | 87 |
| HyperGuide | GPT-OSS | 2026.05 | 83 |
| SoftCoT | Qwen2.5 | 2026.05 | 82 |
| OVM | Qwen2.5 | 2026.05 | 81.4 |
| SoftCoT | GPT-OSS | 2026.05 | 79 |
| OVM | GPT-OSS | 2026.05 | 77.4 |
| HyperGuide | Mistral | 2026.05 | 76 |
| SoftCoT | Mistral | 2026.05 | 71 |
| OVM | Mistral | 2026.05 | 62 |
| Self-Consistency | Qwen2.5 | 2026.05 | 60 |
| Tree of Thoughts | Qwen2.5 | 2026.05 | 58 |
| Few-shot | Mistral | 2026.05 | 57 |
| Tree of Thoughts | Mistral | 2026.05 | 55 |
| Self-Consistency | Mistral | 2026.05 | 53 |
| Few-shot | Qwen2.5 | 2026.05 | 41 |
| Tree of Thoughts | GPT-OSS | 2026.05 | 37 |
| Few-shot | GPT-OSS | 2026.05 | 9 |
| Self-Consistency | GPT-OSS | 2026.05 | 2 |