Characterization and computation of restless bandit marginal productivity indicesReportar como inadecuado




Characterization and computation of restless bandit marginal productivity indices - Descarga este documento en PDF. Documentación en PDF para descargar gratis. Disponible también para leer online.

Editor: Universidad Carlos III de Madrid. Departamento de Estadística

Issued date: 2007-05

Serie-No.: UC3M Working papers. Statistics and Econometrics07-11

Keywords: Dynamic programming , Semi-Markov , Finite state , Stochastic scheduling , Restless bandits , Priority-index policy , Indexability , Whittle index , Marginal productivity index , Parametric simplex , Block algorithms , Computational complexity

Rights: Atribución-NoComercial-SinDerivadas 3.0 España

Abstract:The Whittle index P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298 yields a practical scheduling rule for the versatile yet intractablemulti-armed restless bandit problem, involving the optimal dynThe Whittle index P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298 yields a practical scheduling rule for the versatile yet intractablemulti-armed restless bandit problem, involving the optimal dynamic priority allocation to multiple stochastic projects, modeled as restless bandits, i.e., binary-action (active-passive) (semi-)Markov decision processes. A growing body of evidence shows that such a rule is nearly optimal in a wide variety of applications, which raises the need to efficiently compute the Whittle index and more general marginal productivity index (MPI) extensions in large-scale models. For such a purpose, this paper extends to restless bandits the parametric linear programming (LP) approachdeployed in J. Niño-Mora. A (2-3)n^3 fast-pivoting algorithm for the Gittins index and optimal stopping of a Markov chain, INFORMS J. Comp., in press, which yielded a fast Gittins-index algorithm. Yet the extension is not straightforward, as the MPI is only defined for the limited range of socalled indexable bandits,which motivates the quest for methods to establish indexability. This paper furnishes algorithmic and analytical tools to realize the potential of MPI policies in largescale applications, presenting the following contributions: (i) a complete algorithmic characterization of indexability, for which two block implementations are given; and(ii) more importantly, new analytical conditions for indexability — termed LP-indexability — that leverage knowledge on the structure ofoptimal policies in particular models, under which the MPI is computed faster by the adaptive-greedy algorithm previously introduced by the author under the more stringent PCL-indexabilityconditions, for which a new fast-pivoting block implementation is given. The paper further reports on a computational study, measuring the runtime performance of the algorithms, and assessing by asimulation study the high prevalence of indexability and PCL-indexability.+-





Autor: Niño-Mora, José

Fuente: http://e-archivo.uc3m.es


Introducción



Universidad Carlos III de Madrid Repositorio institucional e-Archivo http:--e-archivo.uc3m.es Departamento de Estadística DES - Working Papers.
Statistics and Econometrics.
WS 2007-05 Characterization and computation of restless bandit marginal productivity indices Niño-Mora, José http:--hdl.handle.net-10016-796 Descargado de e-Archivo, repositorio institucional de la Universidad Carlos III de Madrid Working Paper 07-43 Statistics and Econometrics Series 11 May 2007 Departamento de Estadística Universidad Carlos III de Madrid Calle Madrid, 126 28903 Getafe (Spain) Fax (34-91) 6249849 Characterization and Computation of Restless Bandit Marginal Productivity Indices∗ José Niño-Mora 1 Abstract The Whittle index [P.
Whittle (1988).
Restless bandits: Activity allocation in a changing world.
J. Appl.
Probab.
25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the optimal dynamic priority allocation to multiple stochastic projects, modeled as restless bandits, i.e., binary-action (active-passive) (semi-) Markov decision processes.
A growing body of evidence shows that such a rule is nearly optimal in a wide variety of applications, which raises the need to efficiently compute the Whittle index and more general marginal productivity index (MPI) extensions in large-scale models.
For such a purpose, this paper extends to restless bandits the parametric linear programming (LP) approach deployed in [J.
Niño-Mora.
A ( 2 - 3)n 3 fast-pivoting algorithm for the Gittins index and optimal stopping of a Markov chain, INFORMS J.
Comp., in press], which yielded a fast Gittins-index algorithm. Yet the extension is not straightforward, as the MPI is only defined for the limited range of socalled indexable bandits, which motivates the quest for methods to establish indexability.
This paper furnishes algorithmic and analytical tools to realize the potential of MPI policies in largescale applica...





Documentos relacionados