FOL

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Horizon Generalization | FOL Uniform reward | Max LR: 227.48 | 6 |
| Full-Information Online Learning | FOL Sine-trend rewards Horizon Generalization [T=15 -> T=25] 1.0 | Max LR: 40.62 | 3 |
| Full-Information Online Learning | FOL Gaussian rewards, Horizon Generalization [T=15 -> T=25] 1.0 | Max LR: 57.36 | 3 |
| Horizon Generalization | FOL Sine-trend reward | Max LR Value: 186.81 | 3 |
| Horizon Generalization | FOL Gaussian reward | Max LR: 137.19 | 3 |