Multi-armed Bandit, Dynamic Environments and Meta-Bandits





1 LRI - Laboratoire de Recherche en Informatique
2 TANC - Algorithmic number theory for cryptology (LIX - Laboratoire d'informatique de l'École polytechnique, Palaiseau; Inria Saclay - Ile de France; Polytechnique - X; CNRS - Centre National de la Recherche Scientifique: UMR7161)

Abstract: This paper presents the Adapt-EvE algorithm, which extends the UCBT online learning algorithm (Auer et al. 2002) to abruptly changing environments. Adapt-EvE features an adaptive change-point detection test based on Page-Hinkley statistics, and two alternative extra-exploration procedures, based respectively on smooth restart and Meta-Bandits.
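To make the change-point detection idea concrete, the following is a minimal sketch of the textbook Page-Hinkley test applied to a stream of rewards, set up to flag a drop in the mean reward. It is not the adaptive variant used in Adapt-EvE, whose details are not given in the abstract. The class name `PageHinkley`, the helper `first_alarm`, and the default values of the tolerance `delta` and threshold `lam` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PageHinkley:
    """Textbook Page-Hinkley test for a decrease in the mean of a stream.

    Illustrative sketch only: parameter values are assumptions, and this
    is not the adaptive test described in the paper.
    """
    delta: float = 0.005          # tolerance for fluctuations around the mean
    lam: float = 5.0              # alarm threshold on the PH statistic
    n: int = 0                    # number of observations seen so far
    mean: float = 0.0             # running mean of the observations
    m: float = 0.0                # cumulative deviation m_T
    m_max: float = float("-inf")  # running maximum M_T of m_t

    def update(self, x: float) -> bool:
        """Consume one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        # m_T accumulates (x_t - running mean + delta); it falls when
        # observations drop below the running mean.
        self.m += x - self.mean + self.delta
        self.m_max = max(self.m_max, self.m)
        # PH_T = M_T - m_T grows when the mean has decreased.
        return self.m_max - self.m > self.lam


def first_alarm(rewards: Iterable[float],
                delta: float = 0.005,
                lam: float = 5.0) -> Optional[int]:
    """Return the zero-based index of the first alarm, or None if none."""
    detector = PageHinkley(delta=delta, lam=lam)
    for t, r in enumerate(rewards):
        if detector.update(r):
            return t
    return None


if __name__ == "__main__":
    # Mean reward drops abruptly from 0.9 to 0.1 at step 200.
    stream = [0.9] * 200 + [0.1] * 50
    print("first alarm at step:", first_alarm(stream))
```

In a bandit setting, a detector of this kind would typically be fed the rewards of the arm currently believed to be best, and an alarm would trigger some form of renewed exploration. The threshold `lam` trades off false alarms against detection delay: a larger value means fewer false alarms but slower detection.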

Keywords: multi-armed bandit, statistical learning, UCB





Authors: Cédric Hartland, Sylvain Gelly, Nicolas Baskiotis, Olivier Teytaud, Michèle Sebag

Source: https://hal.archives-ouvertes.fr/






