Gaussian processes for POMDP-based dialogue manager optimization



Publication Date: 2013-09-16

Journal Title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Publisher: IEEE

Volume: 22

Issue: 1

Pages: 28-40

Language: English

Type: Article


Citation: Gašić, M., & Young, S. (2013). Gaussian processes for POMDP-based dialogue manager optimization. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22 (1), 28-40.

Description: This is the accepted manuscript version of an article first published in IEEE/ACM Transactions on Audio, Speech, and Language Processing. The final published version is available online from IEEE at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6601004. © 2013 IEEE.

Abstract: A partially observable Markov decision process (POMDP) has been proposed as a dialog model that enables automatic optimization of the dialog policy and provides robustness to speech understanding errors. Various approximations allow such a model to be used for building real-world dialog systems. However, they require a large number of dialogs to train the dialog policy and hence they typically rely on the availability of a user simulator. They also require significant designer effort to hand-craft the policy representation. We investigate the use of Gaussian processes (GPs) in policy modeling to overcome these problems. We show that GP policy optimization can be implemented for a real world POMDP dialog manager, and in particular: 1) we examine different formulations of a GP policy to minimize variability in the learning process; 2) we find that the use of GP increases the learning rate by an order of magnitude thereby allowing learning by direct interaction with human users; and 3) we demonstrate that designer effort can be substantially reduced by basing the policy directly on the full belief space thereby avoiding ad hoc feature space modeling. Overall, the GP approach represents an important step forward towards fully automatic dialog policy optimization in real world systems.

Identifiers:

DOI: http://dx.doi.org/10.1109/TASL.2013.2282190

This record's URL: http://www.repository.cam.ac.uk/handle/1810/245306





Authors: Gašić, Milica; Young, Steve

Source: https://www.repository.cam.ac.uk/handle/1810/245306


