Horizon Generalization on FOL Gaussian reward

Max LR: 137.19

Gemma-2-9b-it

Updated 1mo ago

Evaluation Results

Method	Links	Max LR
Gemma-2-9b-it 2025.11		137.19	20.62	0.87
Trained Gemma-2-9b-it 2025.11		93.08	20.45	0.8
FTRL 2025.11		49.7	27.99	0.81