Science Simulation on Sciworld
[Chart: Progress Rate over time (data point at Feb 25, 2025). Top result: Explicit RM, Progress Rate 82.6.]
Evaluation Results
Method          Setting                    Date     Progress Rate
Explicit RM     Inference Strategy=Bea...  2025.02  82.6
Explicit RM     Inference Strategy=Bes...  2025.02  76.1
ImplicitPRM     Inference Strategy=Bes...  2025.02  70.6
Agent-R         -                          2025.02  70.2
gpt-4o          -                          2025.02  66.6
Greedy Search   -                          2025.02  66.6
QLASS           -                          2025.02  66.4
StepAgent       -                          2025.02  64.1
ETO             -                          2025.02  62.5
LLM-as-a-judge  Inference Strategy=Bes...  2025.02  62.3
SPIN            -                          2025.02  60.3
NAT             -                          2025.02  55.6