Navigation on FrozenLake LLM
Best Success Rate: 30.5 (FreshPER, Apr 18, 2026)
Updated 1mo ago
Evaluation Results
Method        Type  Date     Success Rate
FreshPER      LLM   2026.04  30.5
On-Policy     LLM   2026.04  29.7
Standard PER  LLM   2026.04  28.1