Classification on Spaceship Titanic
Best result: Hierarchical MCTS, 62.55 Leaderboard Percentile (Nov 29, 2025)
Updated 4d ago
Evaluation Results
Method             Configuration            Date     Leaderboard Percentile
Hierarchical MCTS  Base LLM=GPT-4o          2025.11  62.55
MCTS-Shaped        Base Model=GPT-4.1-mini  2025.11  62.11
MCTS-Shaped        Base LLM=GPT-4o          2025.11  59.44
ReAct              Base LLM=GPT-4o          2025.11  39.54
MCTS-Outcome       Base LLM=GPT-4o          2025.11  22.88
ReAct              Base Model=GPT-4.1-mini  2025.11  0
LATS               Base Model=GPT-4.1-mini  2025.11  0
MCTS-Outcome       Base Model=GPT-4.1-mini  2025.11  0
Hierarchical MCTS  Base Model=GPT-4.1-mini  2025.11  0
LATS               Base LLM=GPT-4o          2025.11  0