Planning on 10x10 Maze
[Chart: Validity Rate and Success Rate over time. Highlighted point: L-ICL, Validity Rate 57, Jan 30, 2026.]
Updated 1mo ago
Evaluation Results
| Method | Method details | Links | Validity Rate | Success Rate |
|---|---|---|---|---|
| L-ICL | Number of training exa... | 2026.01 | 57 | 27 |
| L-ICL | Number of training exa... | 2026.01 | 40 | 21 |
| L-ICL | Number of training exa... | 2026.01 | 20 | 16 |
| RAG-ICL | Context size (characte... | 2026.01 | 7 | 1 |
| RAG-ICL | Context size (characte... | 2026.01 | 7 | 4 |
| L-ICL | Number of training exa... | 2026.01 | 7 | 6 |
| ReAct | Prompting strategy=ReA... | 2026.01 | 6 | 5 |
| ReAct | Inference feedback=Ora... | 2026.01 | 6 | 5 |
| Zero-Shot | Input representation (... | 2026.01 | 3 | 0 |
| Self-Consistency | Reasoning samples (k)=... | 2026.01 | 3 | 3 |
| Self-Refine | Reasoning samples (k)=... | 2026.01 | 3 | 1 |
| ToT | Prompting strategy=ToT... | 2026.01 | 1 | 0 |
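The page reports Validity Rate and Success Rate without defining either metric. The sketch below shows one plausible reading, offered as an assumption rather than the benchmark's actual scoring code: a plan is valid if every move stays on the grid and avoids walls, and successful if it is valid and also ends on the goal cell. Under that reading, success can never exceed validity, which is consistent with every reported pair of values. The grid encoding (`#` for walls), the move letters, and the names `evaluate_plan` and `rates` are all hypothetical, and reporting rates as percentages is likewise assumed.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]

# Hypothetical move encoding: (row delta, column delta).
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def evaluate_plan(grid: List[str], start: Cell, goal: Cell, plan: str) -> Tuple[bool, bool]:
    """Return (valid, success) for one plan under the assumed definitions.

    valid:   every step is a known move that stays in bounds and off walls.
    success: the plan is valid and its final cell equals the goal.
    """
    r, c = start
    for step in plan:
        if step not in MOVES:
            return False, False
        dr, dc = MOVES[step]
        r, c = r + dr, c + dc
        in_bounds = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        if not in_bounds or grid[r][c] == "#":
            return False, False
    return True, (r, c) == goal


def rates(results: Sequence[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Aggregate per-plan (valid, success) flags into percentage rates."""
    if not results:
        raise ValueError("need at least one evaluated plan")
    n = len(results)
    validity = 100.0 * sum(1 for valid, _ in results if valid) / n
    success = 100.0 * sum(1 for _, ok in results if ok) / n
    return validity, success
```

For example, on a 3x3 toy grid `["..#", "...", "#.."]` with start `(0, 0)` and goal `(2, 2)`, the plan `"DRRD"` would count as both valid and successful, `"DR"` as valid only, and `"RR"` as invalid because its second move enters a wall.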