MDPs with Unawareness - Computer Science > Artificial Intelligence

Abstract: Markov decision processes (MDPs) are widely used for modeling decision-making problems in robotics, automated control, and economics. Traditional MDPs assume that the decision maker (DM) knows all states and actions. However, this may not be true in many situations of interest. We define a new framework, MDPs with unawareness (MDPUs), to deal with the possibility that a DM may not be aware of all possible actions. We provide a complete characterization of when a DM can learn to play near-optimally in an MDPU, and give an algorithm that learns to play near-optimally when it is possible to do so, as efficiently as possible. In particular, we characterize when a near-optimal solution can be found in polynomial time.

Authors: Joseph Y. Halpern, Nan Rong, Ashutosh Saxena

