Planning on 8x8 two-room gridworld (test)

0.89 Validity (%)

L-ICL

Updated 1mo ago

Evaluation Results

Method	Links
L-ICL 2026.01		0.89	0.89	0.77
Self-Consistency 2026.01		0.59	0.45	0.43
Self-Refine 2026.01		0.51	0.44	0.38
ReAct 2026.01		0.48	0.41	0.37
PTP 2026.01		0.4	0.33	0.28
RAG-ICL 2026.01		0.21	0.09	0.09
RAG-ICL 2026.01		0.2	0.06	0.06
Zero-Shot 2026.01		0.16	0	0