Online Reinforcement Learning for Dynamic Multimedia Systems - Computer Science > Learning





Abstract: In our previous work, we proposed a systematic cross-layer framework for dynamic multimedia systems, which allows each layer to make autonomous and foresighted decisions that maximize the system's long-term performance, while meeting the application's real-time delay constraints. The proposed solution solved the cross-layer optimization offline, under the assumption that the multimedia system's probabilistic dynamics were known a priori. In practice, however, these dynamics are unknown a priori and therefore must be learned online. In this paper, we address this problem by allowing the multimedia system layers to learn, through repeated interactions with each other, to autonomously optimize the system's long-term performance at run-time. We propose two reinforcement learning algorithms for optimizing the system under different design constraints: the first algorithm solves the cross-layer optimization in a centralized manner, and the second solves it in a decentralized manner. We analyze both algorithms in terms of their required computation, memory, and inter-layer communication overheads. After noting that the proposed reinforcement learning algorithms learn too slowly, we introduce a complementary accelerated learning algorithm that exploits partial knowledge about the system's dynamics in order to dramatically improve the system's performance. In our experiments, we demonstrate that decentralized learning can perform as well as centralized learning, while enabling the layers to act autonomously. Additionally, we show that existing application-independent reinforcement learning algorithms, and existing myopic learning algorithms deployed in multimedia systems, perform significantly worse than our proposed application-aware and foresighted learning methods.



Authors: Nicholas Mastronarde, Mihaela van der Schaar

Source: https://arxiv.org/






