Tree Ensembles for Contextual Bandits

About

We propose a new framework for contextual multi-armed bandits based on tree ensembles. Our framework adapts two widely used bandit methods, Upper Confidence Bound and Thompson Sampling, for both standard and combinatorial settings. As part of this framework, we propose a novel method of estimating the uncertainty in tree ensemble predictions. We further demonstrate the effectiveness of our framework via several experimental studies, employing XGBoost and random forests, two popular tree ensemble methods. Compared to state-of-the-art methods based on decision trees and neural networks, our methods exhibit superior performance in terms of both regret minimization and computational runtime, when applied to benchmark datasets and the real-world application of navigation over road networks.

Hannes Nilsson, Rikard Johansson, Niklas {\AA}kerblom, Morteza Haghir Chehreghani• 2024

Related benchmarks

Task	Dataset	Result
Contextual Bandit	MagicTelescope OpenML	Final Cumulative Regret1.59e+3	13
Contextual Bandit	OpenML Adult T=10,000	Cumulative Regret (Mean)1.56e+3	7
Contextual Bandit	OpenML Mushroom (T=8124)	Mean Cumulative Regret69.1	7
Contextual Bandit	OpenML Shuttle T=10,000	Cumulative Regret (Mean)170.4	7
Contextual Bandit	OpenML (Adult, Covertype, EEGEyeState, GasDrift, MagicTelescope, Mushroom, PageBlocks, Shuttle)	Adult Score1.61e+3	7
Contextual Bandits	Synthetic Benchmarks T=10,000, R=5	Friedman Score2.50e+3	7
Contextual Bandit	EEGEyeState OpenML	Final Cumulative Regret2.13e+3	6
Contextual Bandit	Adult (OpenML)	Final Cumulative Regret1.56e+3	6
Contextual Bandit	Mushroom (OpenML)	Final Regret69.1	6
Contextual Bandit	Shuttle OpenML	Final Cumulative Regret170.4	6

Showing 10 of 22 rows

Other info

Follow for update

@wizwand_team Discord