Sensory-motor control on CartPole
[Chart: Reward (First Iteration, Worst Rep) over time. Highlighted point: gpt-oss:120b, 42.15, Jun 5, 2025.]
Updated 4d ago
Evaluation Results

| Method | Setting | Date | Reward (First Iteration, Worst Rep) | Reward (First Iteration, Best Rep) | Reward (First Iteration, Avg) | Reward (Best Iteration, Worst Rep) | Reward (Best Iteration, Best Rep) | Reward (Best Iteration, Avg) |
|---|---|---|---|---|---|---|---|---|
| gpt-oss:120b | Temperature=optimal | 2025.06 | 42.15 | 500 | 344.32 | 500 | 500 | 500 |
| qwen2.5:72b | Temperature=optimal | 2025.06 | 41.55 | 44.1 | 43.05 | 66.85 | 500 | 425.31 |
| llama3.3:70b | Temperature=optimal | 2025.06 | 9.2 | 172.65 | 74.45 | 352.35 | 500 | 484.46 |
| deepseek-r1:70b | Temperature=optimal | 2025.06 | 8.7 | 45.75 | 12.6 | 473.75 | 500 | 495.78 |
| mistral-large:123b | Temperature=optimal | 2025.06 | 8.6 | 8.85 | 8.78 | 52.9 | 500 | 429.5 |
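
The results listed above can be compared programmatically. The following is a minimal sketch, not part of the benchmark itself: the reward values are copied from the results on this page, while the names `RESULTS`, `rank_by`, and `avg_gain` are hypothetical helpers introduced only for illustration. It ranks the models on a chosen metric and computes how much the average reward rises between the first and the best iteration.

```python
# Reward values copied from the evaluation results on this page (2025.06 entries).
# Tuple order: first iteration (worst rep, best rep, avg),
#              best iteration  (worst rep, best rep, avg).
RESULTS = {
    "gpt-oss:120b":       (42.15, 500.0, 344.32, 500.0, 500.0, 500.0),
    "qwen2.5:72b":        (41.55, 44.1, 43.05, 66.85, 500.0, 425.31),
    "llama3.3:70b":       (9.2, 172.65, 74.45, 352.35, 500.0, 484.46),
    "deepseek-r1:70b":    (8.7, 45.75, 12.6, 473.75, 500.0, 495.78),
    "mistral-large:123b": (8.6, 8.85, 8.78, 52.9, 500.0, 429.5),
}

# Hypothetical column indices into the tuples above.
FIRST_AVG = 2
BEST_AVG = 5


def rank_by(column):
    """Return model names sorted from highest to lowest reward in `column`."""
    return sorted(RESULTS, key=lambda model: RESULTS[model][column], reverse=True)


def avg_gain(model):
    """Average-reward difference between the best and the first iteration."""
    row = RESULTS[model]
    return round(row[BEST_AVG] - row[FIRST_AVG], 2)


if __name__ == "__main__":
    print("Ranked by Reward (Best Iteration, Avg):")
    for name in rank_by(BEST_AVG):
        print(f"  {name}: {RESULTS[name][BEST_AVG]} (gain {avg_gain(name)})")
```

Ranking on the first-iteration average instead (`rank_by(FIRST_AVG)`) gives a different order, which is one way to see how much the ordering depends on the metric selected.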