Maze Solving on Searchformer maze (test)
[Chart: Plan Accuracy over time. Highlighted point: Correct Traces, Plan Accuracy 6, May 19, 2025. Selectable metrics: Plan Accuracy, Trace Validity.]
Updated 7d ago
Evaluation Results
Method         | Configuration             | Date    | Plan Accuracy | Trace Validity
Correct Traces | Backbone=Qwen3-8B-base... | 2025.05 | 6             | 6.6
Swapped Traces | Backbone=Qwen3-8B-base... | 2025.05 | 0.7           | 0
Base Model     | Backbone=Qwen3-8B-base... | 2025.05 | 0.2           | 0
Solution-only  | Backbone=Qwen3-8B-base... | 2025.05 | 0.2           | 0