On Bayesian index policies for sequential resource allocation

Emilie Kaufmann

Preprint, Working Paper. Year: 2016


Emilie Kaufmann

Role: Author
PersonId: 10422
IdHAL: emilie-kaufmann
ORCID: 0000-0002-5496-824X
IdRef: 197040810

Sequential Learning

Centre National de la Recherche Scientifique

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Abstract

This paper studies index policies for minimizing (frequentist) regret in a stochastic multi-armed bandit model that are inspired by a Bayesian view of the problem. Our main contribution is to prove that the Bayes-UCB algorithm, which relies on quantiles of posterior distributions, is asymptotically optimal when the reward distributions belong to a one-dimensional exponential family, for a large class of prior distributions. We also show that the Bayesian literature gives new insight into what kind of exploration rates could be used in frequentist, UCB-type algorithms. Indeed, approximations of the Bayesian optimal solution or of the Finite-Horizon Gittins indices provide a justification for the kl-UCB+ and kl-UCB-H+ algorithms, whose asymptotic optimality is also established.
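To make the quantile-based index mentioned in the abstract concrete, here is a minimal sketch of a Bayes-UCB-style policy, assuming Bernoulli rewards and a uniform Beta(1, 1) prior on each arm's mean. The quantile level 1 - 1/(t (log t)^c), the default c = 0, the guard applied to log t at small t, and all function names are assumptions of this sketch, not details stated on this page; the paper itself should be consulted for the conditions under which asymptotic optimality is proved.

```python
import math
import random


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at x, for positive integers a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a),
    summed in log space to avoid overflow for large counts.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = a + b - 1
    log_x, log_1mx = math.log(x), math.log1p(-x)
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        + j * log_x + (n - j) * log_1mx
        for j in range(a, n + 1)
    ]
    m = max(log_terms)
    return min(1.0, math.exp(m) * sum(math.exp(v - m) for v in log_terms))


def beta_quantile(p, a, b, tol=1e-7):
    """Quantile of order p of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def bayes_ucb(means, horizon, c=0.0, seed=0):
    """Simulate a Bayes-UCB-style policy on Bernoulli arms; return pull counts.

    `means` holds the true success probabilities, used only to draw rewards.
    """
    rng = random.Random(seed)
    k = len(means)
    successes = [0] * k
    pulls = [0] * k
    for t in range(1, horizon + 1):
        # Assumed quantile level: 1 - 1 / (t * (log t)^c).
        # Guard (assumption of this sketch): log t is floored at 1 for small t.
        log_t = max(math.log(t), 1.0)
        level = 1.0 - 1.0 / (t * log_t ** c)
        # Posterior of arm a under a uniform prior: Beta(1 + S_a, 1 + N_a - S_a).
        indices = [
            beta_quantile(level, 1 + successes[a], 1 + pulls[a] - successes[a])
            for a in range(k)
        ]
        # Pull the arm with the largest posterior quantile (first arm on ties).
        arm = max(range(k), key=lambda a: indices[a])
        reward = 1 if rng.random() < means[arm] else 0
        successes[arm] += reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Hypothetical two-armed instance, chosen only for illustration.
    print(bayes_ucb([0.2, 0.8], horizon=500, seed=1))
```

The sketch relies only on the standard library: for integer Beta parameters the posterior CDF reduces to a binomial tail, so the quantile can be found by bisection. A library routine for the Beta quantile would be the more usual choice in practice.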

Keywords

Bayesian algorithms, multi-armed bandits

Domains

Machine Learning [stat.ML]

Main file

BayesianHaL.pdf (605.87 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-01251606

Submitted on: Monday, 12 September 2016, 10:14:03

Last modified on: Friday, 31 May 2024, 18:32:03

Long-term archiving on: Tuesday, 13 December 2016, 12:57:04

Dates and versions

hal-01251606, version 1 (06-01-2016)

hal-01251606, version 2 (12-09-2016)

hal-01251606, version 3 (06-11-2017)

Identifiers

HAL Id: hal-01251606, version 2
arXiv: 1601.01190

Cite

Emilie Kaufmann. On Bayesian index policies for sequential resource allocation. 2016. ⟨hal-01251606v2⟩


