Horizon Generalization on FOL Uniform reward

Max LR: 227.48

Gemma-2-9b-it

Updated 1mo ago

Evaluation Results

Method	Links
Gemma-2-9b-it 2025.11		227.48	28.47	81
Trained Gemma-2-9b-it 2025.11		122.45	12.05	57
GPT-4o mini 2025.11		70.82	22.28	74
FTRL 2025.11		55.04	38.9	75
Trained GPT-4o mini 2025.11		39.65	17.09	65
FTRL 2025.11		39.15	24.16	75