Reinforcement Learning and Optimal Control

ISBN-10: 1886529396
ISBN-13: 9781886529397
Category: Computers
Pages: 388
Language: English
Published: 2019-07-01
Publisher: Athena Scientific
Author: Dimitri Bertsekas

Description

This book considers large and challenging multistage decision problems, which can in principle be solved by dynamic programming (DP), but whose exact solution is computationally intractable. We discuss solution methods that rely on approximations to produce suboptimal policies with adequate performance. These methods are collectively known by several essentially equivalent names: reinforcement learning, approximate dynamic programming, and neuro-dynamic programming. They have been at the forefront of research for the last 25 years, and they underlie, among other things, the recent impressive successes of self-learning in games such as chess and Go.

Our subject has benefited greatly from the interplay of ideas from optimal control and from artificial intelligence, as it relates to reinforcement learning and simulation-based neural network methods. One aim of the book is to explore the common boundary between these two fields and to form a bridge that is accessible to workers with background in either field. Another aim is to organize coherently the broad mosaic of methods that have proved successful in practice while having a solid theoretical and/or logical foundation. This may help researchers and practitioners to find their way through the maze of competing ideas that constitute the current state of the art.

This book relates to several of our other books: Neuro-Dynamic Programming (Athena Scientific, 1996), Dynamic Programming and Optimal Control (4th edition, Athena Scientific, 2017), Abstract Dynamic Programming (2nd edition, Athena Scientific, 2018), and Nonlinear Programming (Athena Scientific, 2016). However, the mathematical style of this book is somewhat different. While we provide a rigorous, albeit short, mathematical account of the theory of finite and infinite horizon dynamic programming, and of some fundamental approximation methods, we rely more on intuitive explanations and less on proof-based insights.
Moreover, our mathematical requirements are quite modest: calculus, a minimal use of matrix-vector algebra, and elementary probability (mathematically complicated arguments involving laws of large numbers and stochastic convergence are bypassed in favor of intuitive explanations). The book illustrates the methodology with many examples and illustrations, and uses a gradual expository approach that proceeds along four directions:

(a) From exact DP to approximate DP: We first discuss exact DP algorithms, explain why they may be difficult to implement, and then use them as the basis for approximations.

(b) From finite horizon to infinite horizon problems: We first discuss finite horizon exact and approximate DP methodologies, which are intuitive and mathematically simple, and then progress to infinite horizon problems.

(c) From deterministic to stochastic models: We often discuss deterministic and stochastic problems separately, since deterministic problems are simpler and offer special advantages for some of our methods.

(d) From model-based to model-free implementations: We first discuss model-based implementations, and then identify schemes that can be appropriately modified to work with a simulator.

The book is related to, and supplemented by, the companion research monograph Rollout, Policy Iteration, and Distributed Reinforcement Learning (Athena Scientific, 2020), which focuses more closely on several topics related to rollout, approximate policy iteration, multiagent problems, discrete and Bayesian optimization, and distributed computation, which are either discussed in less detail or not covered at all in the present book. The author's website contains class notes, and a series of video lectures and slides from a 2021 course at ASU, which address a selection of topics from both books.
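To make direction (a) concrete, here is a minimal sketch (not from the book) of the exact finite-horizon DP backward recursion J_k(x) = min_u [ g(x,u) + J_{k+1}(f(x,u)) ] on a toy deterministic problem; the states, dynamics, costs, and horizon below are invented purely for illustration:

```python
N = 3                      # horizon
states = [0, 1, 2]         # state space
controls = [-1, 0, 1]      # control space

def f(x, u):
    """Deterministic dynamics: next state, clipped to the state space."""
    return max(0, min(2, x + u))

def g(x, u):
    """Stage cost: distance from state 0 plus control effort."""
    return abs(x) + abs(u)

# Terminal cost J_N(x) = 0
J = {x: 0.0 for x in states}

# Backward recursion: J_k(x) = min_u [ g(x,u) + J_{k+1}(f(x,u)) ]
policy = []
for k in reversed(range(N)):
    J_new, mu = {}, {}
    for x in states:
        best_u = min(controls, key=lambda u: g(x, u) + J[f(x, u)])
        mu[x] = best_u
        J_new[x] = g(x, best_u) + J[f(x, best_u)]
    J, policy = J_new, [mu] + policy

print(J[2])  # optimal 3-stage cost-to-go from state 2
```

The approximation methods the book develops replace the exact cost-to-go table J with a parametric approximation when the state space is too large to enumerate.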

Similar books

  • Reinforcement Learning for Optimal Feedback Control: A Lyapunov-Based Approach
    By Patrick Walters, Rushikesh Kamalapurkar, Joel Rosenfeld

  • Reinforcement Learning and Stochastic Optimization: A Unified Framework for Sequential Decisions
    By Warren B. Powell

    This book highlights twelve types of uncertainty that might enter any model and pulls together the diverse set of methods for making decisions, known as policies, into four fundamental classes that span every method suggested in the ...

  • Rollout, Policy Iteration, and Distributed Reinforcement Learning
    By Dimitri Bertsekas

  • Dynamic Programming and Optimal Control
    By Dimitri P. Bertsekas

  • Neuro-Dynamic Programming
    By Dimitri P. Bertsekas, John N. Tsitsiklis

    This is historically the first book that fully explained the neuro-dynamic programming/reinforcement learning methodology, a breakthrough in the practical application of neural networks and dynamic programming to complex problems of ...

  • Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles
    By Frank L. Lewis, Draguna Vrabie, Kyriakos G. Vamvoudakis

    The book reviews developments in the following fields: optimal adaptive control; online differential games; reinforcement learning principles; and dynamic feedback control systems.

  • Reinforcement Learning, second edition: An Introduction
    By Richard S. Sutton, Andrew G. Barto

  • Optimal Control
    By Frank L. Lewis, Draguna Vrabie, Vassilis L. Syrmos

    A new edition of the classic text on optimal control theory. As a superb introductory text and an indispensable reference, this new edition of Optimal Control will serve the needs of both the professional engineer and the advanced student in ...

  • Algorithms for Reinforcement Learning
    By Csaba Szepesvári

    This book focuses on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming.

  • Foundations of Deep Reinforcement Learning: Theory and Practice in Python
    By Laura Graesser, Wah Loon Keng
