Horizon Generalization on FOL Uniform reward
[Chart: Max LR. Highlighted point: Gemma-2-9b-it, 227.48, Nov 6, 2025]
Updated 2d ago
Evaluation Results
| Method | Setting | Date | Max LR | Avg LR | Beta |
|---|---|---|---|---|---|
| Gemma-2-9b-it | Horizon Shift=T=25 to... | 2025.11 | 227.48 | 28.47 | 81 |
| Trained Gemma-2-9b-it | Horizon Shift=T=25 to... | 2025.11 | 122.45 | 12.05 | 57 |
| GPT-4o mini | Horizon (T)=25, Action... | 2025.11 | 70.82 | 22.28 | 74 |
| FTRL | Horizon Shift=T=25 to... | 2025.11 | 55.04 | 38.9 | 75 |
| Trained GPT-4o mini | Horizon (T)=25, Action... | 2025.11 | 39.65 | 17.09 | 65 |
| FTRL | Horizon (T)=25, Action... | 2025.11 | 39.15 | 24.16 | 75 |