Model Predictive Control and Reinforcement Learning

Block Course, 04.10.2022 - 07.10.2022, 9:00-18:00, HS 1098, Kollegiengebäude I, Platz der Universität 3, 79098 Freiburg i.Br.

Lecture: Prof. Dr. Joschka Boedecker and Prof. Dr. Moritz Diehl

Exercises: Jasper Hoffmann and Florian Messerer

This block course is intended for master's and PhD students from engineering, computer science, mathematics, physics, and other mathematical sciences. The aim is for participants to understand the main concepts of model predictive control (MPC) and reinforcement learning (RL), as well as the similarities and differences between the two approaches. In hands-on exercises and project work, they learn to apply the methods to practical optimal control problems from science and engineering.

The course consists of lectures, exercises, and project work. The lectures are given by Prof. Dr. Joschka Boedecker and Prof. Dr. Moritz Diehl from the University of Freiburg.

Topics include

Optimal Control Problem (OCP) formulations - constrained, infinite horizon, discrete time, stochastic, robust
Markov Decision Processes (MDP)
From continuous to discrete: discretization in space and time
Dynamic Programming (DP) concepts and algorithms - value iteration and policy iteration
Linear Quadratic Regulator (LQR) and Riccati equations
Convexity considerations in DP for constrained linear systems
Model predictive control (MPC) formulations and stability guarantees
MPC algorithms - quadratic programming, direct multiple shooting, Gauss-Newton, real-time iterations
Differential Dynamic Programming (DDP) for the solution of unconstrained MPC problems
Reinforcement Learning (RL) formulations and approaches
Model-free RL: Monte Carlo, temporal differences, model learning, direct policy search
RL with function approximation
Model-based RL and combinations of model-based and model-free methods
Similarities and differences between MPC and RL

Contact: hoffmaja@informatik.uni-freiburg.de, florian.messerer@imtek.uni-freiburg.de

Calendar

October 4 to 7
- Block course with lectures, exercises and supervised project work
- 9:00-18:00, HS 1098, KG I (downtown campus, Platz der Universität 3)
October 7 @ 9 am
- Micro exam
October 8 @ 10 am
- Joint hike to Ruine Schneeburg at Schönberg (we meet at the tram station Innsbrucker Straße).
October 25, 2pm - 4pm
- Project session. Present your results (voluntary)
- Location: SR 02-012(16), Georges-Koehler-Allee 102 (Faculty of Engineering / Technische Fakultät)
November 8, 23:59
- Project report deadline

	Tuesday 4.10.	Wednesday 5.10.	Thursday 6.10.	Friday 7.10.	Saturday 8.10.
09:00-09:30	Welcome + Q&A 1 Prerequisite lecture 1-3 Joschka Boedecker, Moritz Diehl	Lecture 2 Practical Aspects of Nonlinear MPC Moritz Diehl	Lecture 4 Alternative Views on Dynamic Programming / Nonsmooth Numerical Optimal Control Moritz Diehl / Armin Nurkanovic	Micro Exam
09:30-10:00
10:00-10:30					Social Event Hike onto the Schönberg: We meet at 10 am at Haltestelle Innsbrucker Straße. Weatherproof clothes are recommended!
10:30-11:00	Coffee Break	Coffee Break	Coffee Break	Coffee Break
11:00-11:30	Q&A 2 Prerequisite lecture 4-5 Joschka Boedecker, Moritz Diehl	Acados Workshop Jonathan Frey	Lecture 5 Function Approximation & Actor Critic Joschka Boedecker	Project Work
11:30-12:00
12:00-12:30
12:30-13:00	Lunch break	Lunch break	Lunch break	Lunch break
13:00-13:30
13:30-14:00
14:00-14:30	Lecture 1 Technologies behind AlphaGo Joschka Boedecker	Lecture 3 Monte Carlo & Temporal Difference Joschka Boedecker	MPC&RL - Differences, Similarities and Synergies	Project Work
14:30-15:00
15:00-15:30
15:30-16:00	Coffee Break	Coffee Break	Coffee Break	Coffee Break
16:00-16:30	Project Brainstorming	Project Work / Tutorials	Project Work / Tutorials	Spotlight Presentations (Voluntary) Preliminary project results
16:30-17:00	Project Pitches
17:00-17:30

Microexam

Micro exam
Micro exam solution
Results: microexam_results.txt
The maximum number of points was reduced to 19. Question 7 was not counted due to a typo.
9 points are required to pass. If you did not choose a pseudonym, you can send an email to the tutors to get your result.

Slides & Blackboard pictures

Technologies behind AlphaGo Zero: slides
Three Horizon MPC: blackboard_1, blackboard_2, blackboard_3
Real-time Iteration (RTI): blackboard
Acados presentation: slides
Monte Carlo & Temporal Difference: slides
Alternative Views on Dynamic Programming: blackboard_1, blackboard_2
Nonsmooth Numerical Optimal Control: slides, animations, NOSNOC github
Function Approximation, DQN, DDPG: slides

Project

Project Guidelines

To register your team please fill out the following form.

If you are presenting your intermediate results in the spotlight session, you can either send your slides as a PDF to the teaching assistants or join the following Zoom session to share your screen.

https://uni-freiburg.zoom.us/j/66135504785?pwd=R01paml1QkNVZEJHMmgzZnNRa0xrZz09
Meeting ID: 661 3550 4785
Passcode: mpcrl2022

Tutorials / Exercises

CasADi
- Installation
- Simulation
  - Tutorial
  - Tutorial Solution
- Optimal Control
  - Tutorial
  - Tutorial Solution
PyTorch for RL
acados Workshop

Further Resources

Sample Micro Exam: exam, exam solution
Additional links for the project work:
- StableBaselines3: A good RL library that works out of the box for gym interfaces.
- minimalRL: A minimalist RL library that has each algorithm in one file.
- Tips&Tricks: A guide on important design choices in doing reinforcement learning experiments.
Books
- J. B. Rawlings, D. Q. Mayne, M. Diehl. Model Predictive Control. Nob Hill Publishing, 2017 (free PDF here)
- R. S. Sutton, A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018 (free PDF here)
- D. P. Bertsekas. Reinforcement Learning and Optimal Control. Athena Scientific, 2019
Papers mentioned in discussion
- G. N. Iyengar. Robust Dynamic Programming. Mathematics of Operations Research 30(2), 2005 (link)
- D. Q. Mayne et al. Constrained Model Predictive Control: Stability and Optimality. Automatica 36(6), 2000 (link)
- H. Chen, F. Allgöwer. A Quasi-Infinite Horizon Nonlinear Model Predictive Control Scheme with Guaranteed Stability. Automatica 34(10), 1998 (link)
- G. Frison et al. An Efficient Implementation of Partial Condensing for Nonlinear Model Predictive Control. IEEE 55th Conference on Decision and Control (CDC), 2016 (link)
Mensa menu

Formal requirements

Relevant only for students of the University of Freiburg.

In order to receive ECTS for this course, students need to pass *all* of the following:

Studienleistung (SL, ungraded)
- Micro exam
Prüfungsleistung (PL, graded)
- Project report

To register, please fill out this form on the first day of the course (Oct 4). *Every* student from the University of Freiburg needs to fill out this form, even if you already registered for something during the summer semester exam registration phase.

Prerequisites

To be able to treat the contents at a sufficiently high level, we will assume some basic knowledge of Numerical Optimal Control and Reinforcement Learning. Before the start of the course, students should be familiar with the content of the first 5 lectures of last year's MPCRL course:

Systems Control and Optimization Laboratory

IMTEK, Faculty of Engineering, University of Freiburg