
K-armed bandits

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multi-armed Bandit Regret Minimization | K-armed bandits | Minimax Ratio: 1 | 3 |
| Regret minimization | K-armed bandits Exponential Family rewards | Finite-Time Regret (Minimax Ratio): 1 | 2 |
| Regret minimization | K-armed bandits [0, 1] rewards | Finite-Time Regret (Minimax Ratio): 1 | 2 |
| Regret minimization | K-armed bandits Gaussian rewards | Finite-Time Regret (Minimax Ratio): 1 | 1 |
| Regret minimization | K-armed bandits sub-Gaussian rewards | Metric: - | 0 |