#### Block Course, 04.10.2022 - 07.10.2022, 9:00-18:00, HS 1098, Kollegiengebäude I, Platz der Universität 3, 79098 Freiburg i.Br.

#### Lecture: Prof. Dr. Joschka Boedecker and Prof. Dr. Moritz Diehl

#### Exercises: Jasper Hoffmann and Florian Messerer

This block course is intended for master's and PhD students from engineering, computer science, mathematics, physics, and other mathematical sciences. The aim is for participants to understand the main concepts of model predictive control (MPC) and reinforcement learning (RL), as well as the similarities and differences between the two approaches. In hands-on exercises and project work, they learn to apply the methods to practical optimal control problems from science and engineering.

The course consists of lectures, exercises, and project work. The lectures are given by Prof. Dr. Joschka Boedecker and Prof. Dr. Moritz Diehl from the University of Freiburg.

Topics include

- Optimal Control Problem (OCP) formulations - constrained, infinite horizon, discrete time, stochastic, robust
- Markov Decision Processes (MDP)
- From continuous to discrete: discretization in space and time
- Dynamic Programming (DP) concepts and algorithms - value iteration and policy iteration
- Linear Quadratic Regulator (LQR) and Riccati equations
- Convexity considerations in DP for constrained linear systems
- Model Predictive Control (MPC) formulations and stability guarantees
- MPC algorithms - quadratic programming, direct multiple shooting, Gauss-Newton, real-time iterations
- Differential Dynamic Programming (DDP) for the solution of unconstrained MPC problems
- Reinforcement Learning (RL) formulations and approaches
- Model-free RL: Monte Carlo, temporal differences, model learning, direct policy search
- RL with function approximation
- Model-based RL and combinations of model-based and model-free methods
- Similarities and differences between MPC and RL
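
To give a flavour of the dynamic programming topics listed above, here is a minimal sketch of value iteration. It is not taken from the course material: the chain-shaped MDP, the reward of -1 per step, the discount factor, and all function and variable names are assumptions chosen purely for illustration.

```python
def value_iteration(n_states=4, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Value iteration on a hypothetical deterministic chain MDP.

    States are 0..n_states-1; the last state is a terminal goal.
    This toy problem is an illustrative assumption, not course material.
    """
    goal = n_states - 1          # terminal state, its value stays 0
    actions = (-1, +1)           # move left or move right

    def step(s, a):
        # Deterministic transition, clipped to the chain; every step has reward -1.
        return min(max(s + a, 0), goal), -1.0

    def q_value(s, a, V):
        # One-step lookahead: immediate reward plus discounted value of the successor.
        s_next, r = step(s, a)
        return r + gamma * V[s_next]

    V = [0.0] * n_states
    for _ in range(max_iter):
        delta = 0.0
        for s in range(goal):    # skip the terminal state
            best = max(q_value(s, a, V) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best          # in-place (Gauss-Seidel style) Bellman update
        if delta < tol:
            break

    # Greedy policy with respect to the converged value function.
    policy = [max(actions, key=lambda a: q_value(s, a, V)) for s in range(goal)]
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration()
    print("values:", V)
    print("greedy policy (non-terminal states):", policy)
```

With the default settings, the values should come out close to -2.71, -1.9, -1 and 0, and the greedy policy moves right in every non-terminal state. The in-place update is one of several possible variants; a synchronous sweep over a copy of `V` would converge to the same fixed point.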

Contact: hoffmaja@informatik.uni-freiburg.de, florian.messerer@imtek.uni-freiburg.de

### Calendar

- October 4 to 7
  - Block course with lectures, exercises and supervised project work
  - 9:00-18:00, HS 1098, KG I (downtown campus, Platz der Universität 3)

- October 7 @ 9 am
  - Micro exam

- October 8 @ 10 am
  - Joint hike to Ruine Schneeburg at Schönberg (we meet at the tram station Innsbrucker Straße).

- October 25, 2 pm - 4 pm
  - Project session. Present your results (voluntary)
  - Location: SR 02-012(16), Georges-Koehler-Allee 102 (Faculty of Engineering / Technische Fakultät)

- November 8, 23:59
  - Project report deadline

### Microexam

- Micro exam
- Micro exam solution
- Results: microexam_results.txt
- The maximum number of points was reduced to 19. Question number 7 was not considered due to a typo.
- The required number of points is 9. If you did not choose a pseudonym, you can write an email to the tutors to get your result.

### Slides & Blackboard pictures

- Technologies behind AlphaGo Zero: slides
- Three Horizon MPC: blackboard_1, blackboard_2, blackboard_3
- Real-time Iteration (RTI): blackboard
- Acados presentation: slides
- Monte Carlo & Temporal Difference: slides
- Alternative Views on Dynamic Programming: blackboard_1, blackboard_2
- Nonsmooth Numerical Optimal Control: slides, animations, NOSNOC github
- Function Approximation, DQN, DDPG: slides

### Project

To register your team, please fill out the following form.

If you are presenting your intermediate results in the spotlight session, you can either send your slides as a PDF to the teaching assistants or join the following Zoom session to share your screen.

https://uni-freiburg.zoom.us/j/66135504785?pwd=R01paml1QkNVZEJHMmgzZnNRa0xrZz09

Meeting ID: 661 3550 4785

Passcode: mpcrl2022

### Tutorials / Exercises

- CasADi
  - Installation
  - Simulation
  - Optimal Control

- PyTorch for RL
- acados Workshop

### Further Resources

- Sample Micro Exam: exam, exam solution
- Additional links for the project work:
  - StableBaselines3: A good RL library that works out of the box for gym interfaces.
  - minimalRL: A minimalist RL library that has each algorithm in one file.
  - Tips&Tricks: A guide on important design choices in doing reinforcement learning experiments.

- Books
  - J. B. Rawlings, D. Q. Mayne, M. Diehl. Model Predictive Control. Nob Hill Publishing, 2017 (free PDF here)
  - R. S. Sutton, A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018 (free PDF here)
  - D. P. Bertsekas. Reinforcement Learning and Optimal Control. Athena Scientific, 2019

- Papers mentioned in discussion
  - G. N. Iyengar. Robust Dynamic Programming. Mathematics of Operations Research 30(2), 2005 (link)
  - D. Q. Mayne et al. Constrained Model Predictive Control: Stability and Optimality. Automatica 36(6), 2000 (link)
  - H. Chen, F. Allgöwer. A Quasi-Infinite Horizon Nonlinear Model Predictive Control Scheme with Guaranteed Stability. Automatica 34(10), 1998 (link)
  - G. Frison et al. An Efficient Implementation of Partial Condensing for Nonlinear Model Predictive Control. IEEE 55th Conference on Decision and Control (CDC), 2016 (link)

- Mensa menu

### Formal requirements

*Relevant only for students of the University of Freiburg.*

In order to receive ECTS credits for this course, students need to pass *all* of the following:

- Studienleistung (SL, ungraded)
  - Micro exam

- Prüfungsleistung (PL, graded)
  - Project report

To register, please fill out this form on the first day of the course (Oct 4). *Every* student from the University of Freiburg needs to fill out this form, even if you already registered for something during the summer semester exam registration phase.

### Prerequisites

To be able to treat the contents at a sufficiently high level, we assume some basic knowledge of numerical optimal control and reinforcement learning. Before the start of the course, students should be familiar with the contents of the first 5 lectures of last year's MPCRL course:

- Lecture 1 - Introduction (Joschka Boedecker and Moritz Diehl), Video
- Lecture 2 - Dynamic Systems and Simulation (Moritz Diehl), Video
- Lecture 3 - Numerical Optimization (Moritz Diehl), Video
- Lecture 4 - Dynamic Programming and LQR (Moritz Diehl), Video
- Lecture 5 - MDPs, PI and VI (Joschka Boedecker), Video