Predict, Control, Learn: Combining Model Predictive Control and Reinforcement Learning

Tutorial Session at the European Control Conference 2026

Thursday, July 09, 2026, 12:00 - 14:00

ECC 2026

Model Predictive Control (MPC) and Reinforcement Learning (RL) are two prominent frameworks for optimal decision-making under uncertainty. Both fields stem from similar fundamental principles and are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite their similarities, MPC and RL follow distinct paradigms that emerged from different communities and requirements. While interest in combining these fields has grown significantly, the resulting landscape of hybrid methods can be difficult to navigate. This tutorial session aims to provide control researchers and practitioners with a comprehensive overview of the theoretical foundations, algorithmic architectures, and open challenges essential for the synthesis of MPC and RL. It also covers state-of-the-art software aimed at the rapid development of novel methods that combine the two paradigms.

Fall School: MPCRL26.

Want to learn more about combining MPC and RL using acados and leap-c? Check out our fall school!

Program.

Markov Decision Processes – A Unifying Perspective on MPC and RL

This introductory talk establishes the foundational bridge between Model Predictive Control (MPC) and Reinforcement Learning (RL). Starting from a unified Markov Decision Process (MDP) formulation and consistent notation, we provide a high-level overview of both paradigms, examining RL through value functions and policies, and MPC through receding-horizon optimization. We highlight recent success stories and present a systematic comparison of the respective strengths of the two paradigms in handling constraints, model mismatch, stochasticity, and online computational requirements. This presentation serves as the technical primer for the advanced synthesis and software topics covered in the remainder of the tutorial.
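
The two views described above can be compared on a toy problem. The following is a minimal sketch, not material from the talk: the five-state chain, the stage cost, the discount factor, and all function names are assumptions chosen for illustration. It computes a policy by value iteration (the RL view) and by brute-force receding-horizon optimization with zero terminal cost (the MPC view).

```python
import itertools

# Hypothetical toy MDP: integer states 0..4, actions move left, stay, or right.
STATES = range(5)
ACTIONS = (-1, 0, 1)
GAMMA = 0.9  # assumed discount factor


def step(s, a):
    """Deterministic dynamics, clipped to the state range."""
    return min(max(s + a, 0), 4)


def cost(s, a):
    """Stage cost: squared distance to state 0 plus a small action penalty."""
    return s**2 + 0.1 * a**2


def value_iteration(n_iter=200):
    """RL view: compute the value function, then act greedily with respect to it."""
    V = [0.0] * 5
    for _ in range(n_iter):
        V = [min(cost(s, a) + GAMMA * V[step(s, a)] for a in ACTIONS) for s in STATES]
    policy = [
        min(ACTIONS, key=lambda a: cost(s, a) + GAMMA * V[step(s, a)]) for s in STATES
    ]
    return V, policy


def mpc_action(s0, horizon):
    """MPC view: optimize an open-loop action sequence, apply only its first element."""
    best_cost, best_first = float("inf"), None
    for seq in itertools.product(ACTIONS, repeat=horizon):
        s, total = s0, 0.0
        for k, a in enumerate(seq):
            total += GAMMA**k * cost(s, a)
            s = step(s, a)
        if total < best_cost:
            best_cost, best_first = total, seq[0]
    return best_first


V, vi_policy = value_iteration()
mpc_policy_long = [mpc_action(s, horizon=3) for s in STATES]
mpc_policy_short = [mpc_action(s, horizon=1) for s in STATES]
print("value iteration:", vi_policy)
print("MPC, horizon 3: ", mpc_policy_long)
print("MPC, horizon 1: ", mpc_policy_short)
```

In this toy setting the horizon-3 MPC policy coincides with the value-iteration policy, while the horizon-1 policy is myopic and never moves, because without a terminal cost it sees no benefit from reaching a better state.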

Speaker: Jasper Hoffmann (Department of Intelligent Machine Brain Interfaces, University of Freiburg)
Slides: pdf

Synthesis of Model Predictive Control and Reinforcement Learning

This talk provides a structured classification of hybrid Model Predictive Control (MPC) and Reinforcement Learning (RL) algorithms. We introduce the core inference architectures—ranging from hierarchical to integrated structures—that define how neural networks interface with optimization layers. Using the actor-critic framework, we categorize these methods by the functional role of MPC: as an expert for guidance, a learnable critic, or the deployed policy itself. Finally, we contrast aligned learning, which maintains physical interpretability, with closed-loop learning, which optimizes for end-to-end performance. This overview offers a clear roadmap for navigating the diverse landscape of modern MPC-RL synthesis. The talk is based on the recent survey [1].

Speaker: Rudolf Reiter (Robotics and Perception Group, University of Zurich)
Slides: pdf

Reinforcement Learning Over MPC: Why Does It Work?

This talk establishes the theoretical foundations connecting Model Predictive Control and Reinforcement Learning approaches through the unifying lens of Markov Decision Processes (MDPs). We demonstrate how MPC can be formally viewed as an approximate solution to the underlying MDP, and identify the key paradigm shifts that enable their effective combination: moving from fitting dynamics models to optimizing closed-loop performance, and adopting a “holistic” parametrization where the entire MPC scheme—including cost functions, constraints, and terminal conditions—becomes learnable.
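
One common way to make the holistic parametrization concrete is to write the MPC scheme as a parametric approximation of the MDP value function and policy. The following is a sketch under assumed notation: the symbols $\theta$, $\ell_\theta$, $T_\theta$, $f_\theta$, $h_\theta$, $L$, $\gamma$, and $N$ are illustrative and not taken from the talk.

```latex
\begin{align}
V_\theta(s) = \min_{x_{0:N},\, u_{0:N-1}} \quad
  & \gamma^N T_\theta(x_N) + \sum_{k=0}^{N-1} \gamma^k \ell_\theta(x_k, u_k) \\
\text{s.t.} \quad
  & x_0 = s, \quad x_{k+1} = f_\theta(x_k, u_k), \quad
    h_\theta(x_k, u_k) \le 0, \quad k = 0, \dots, N-1, \\
\pi_\theta(s) = {}
  & u_0^\star(s, \theta), \\
\min_\theta \; J(\pi_\theta) = {}
  & \mathbb{E}\!\left[ \sum_{k=0}^{\infty} \gamma^k L\big(s_k, \pi_\theta(s_k)\big) \right].
\end{align}
```

Here the stage cost, terminal cost, model, and constraints all depend on $\theta$, and $\theta$ is tuned for the closed-loop cost $J$ under the true stage cost $L$ rather than for the prediction accuracy of $f_\theta$.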

Speaker: Dirk Reinhardt (Department of Engineering Cybernetics, Norwegian University of Science and Technology)
Slides: pdf

Differentiable NMPC with acados

The efficient computation of parametric solution sensitivities is a key challenge in integrating learning-enhanced methods with nonlinear model predictive control (NMPC), since many learning algorithms rely on their availability. In this talk, we discuss the computation of solution sensitivities of general nonlinear programs using the implicit function theorem and smoothed optimality conditions, as used in interior-point methods. Furthermore, we provide a detailed analysis of sensitivity computation within a sequential quadratic programming method that employs an interior-point method for the quadratic subproblems. As a practical example, we present the efficient open-source implementation within the acados framework [3], which provides both forward and adjoint sensitivities for general optimal control problems and achieves speedups exceeding 3x over the state-of-the-art solvers mpc.pytorch and cvxpygen. The talk is based on the preprint [2].
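
The implicit-function-theorem idea can be illustrated on a small example. The following is a minimal NumPy sketch, not the acados implementation or its API: the two-variable problem, the parameter vector `theta`, and all function names are assumptions. The problem has only an equality constraint, so no interior-point smoothing is needed here. With the KKT residual F(w, theta) = 0 at the solution w = (z1, z2, lambda), the forward sensitivity is dw/dtheta = -(dF/dw)^{-1} dF/dtheta, and the adjoint variant computes the gradient of a scalar loss with a single transposed solve.

```python
import numpy as np

# Hypothetical problem:
#   min_z 0.5*(z1 - theta1)^2 + 0.5*(z2 - theta2)^2 + 0.25*z1^4
#   s.t.  z1 + z2 = 1


def kkt_residual(w, theta):
    """Stationarity of the Lagrangian and primal feasibility."""
    z1, z2, lam = w
    return np.array([
        (z1 - theta[0]) + z1**3 + lam,
        (z2 - theta[1]) + lam,
        z1 + z2 - 1.0,
    ])


def kkt_jac_w(w):
    """Jacobian of the KKT residual with respect to w = (z1, z2, lambda)."""
    z1 = w[0]
    return np.array([
        [1.0 + 3.0 * z1**2, 0.0, 1.0],
        [0.0, 1.0, 1.0],
        [1.0, 1.0, 0.0],
    ])


# Jacobian of the KKT residual with respect to theta (constant for this problem).
KKT_JAC_THETA = np.array([[-1.0, 0.0], [0.0, -1.0], [0.0, 0.0]])


def solve(theta, n_iter=30):
    """Full-step Newton iteration on the KKT system."""
    w = np.zeros(3)
    for _ in range(n_iter):
        w = w - np.linalg.solve(kkt_jac_w(w), kkt_residual(w, theta))
    return w


def forward_sensitivity(w):
    """Full solution sensitivity dw/dtheta, one linear solve per parameter."""
    return -np.linalg.solve(kkt_jac_w(w), KKT_JAC_THETA)


def adjoint_sensitivity(w, grad_w):
    """Gradient of a scalar loss with respect to theta via one transposed solve."""
    nu = np.linalg.solve(kkt_jac_w(w).T, grad_w)
    return -KKT_JAC_THETA.T @ nu


theta = np.array([0.5, -0.2])
w_star = solve(theta)
dw_dtheta = forward_sensitivity(w_star)

# Example loss: the first primal variable, so its gradient with respect to w is e_1.
grad_w = np.array([1.0, 0.0, 0.0])
dloss_dtheta = adjoint_sensitivity(w_star, grad_w)

# Central finite differences as an independent check.
eps = 1e-5
fd = np.column_stack([
    (solve(theta + eps * e) - solve(theta - eps * e)) / (2 * eps)
    for e in np.eye(2)
])
print("max deviation from finite differences:", np.abs(dw_dtheta - fd).max())
```

The forward variant returns the full Jacobian, while the adjoint variant is cheaper when only the gradient of a scalar loss is needed, which is the typical case in gradient-based learning.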

Speaker: Katrin Baumgärtner (Department of Microsystems Engineering, University of Freiburg)
Slides: pdf

MPC-RL with leap-c

While MPC-RL research is ongoing, software tools for MPC-RL methods remain scarce. In this talk, we present the software package leap-c (Learning Predictive Control) [4], which adds to the available tooling. It leverages the capabilities of acados to provide a fast, versatile, and differentiable MPC layer as a PyTorch module. Aside from this core functionality, the package provides examples and several instances of MPC-RL algorithms that combine Soft Actor-Critic (SAC) with an MPC controller in a hierarchical structure. We discuss these algorithms in more detail, highlighting the design choices that lead to them.

Speaker: Jasper Hoffmann (Department of Intelligent Machine Brain Interfaces, University of Freiburg)
Slides: pdf

Supplemental material.

[1] Rudolf Reiter, Jasper Hoffmann, Dirk Reinhardt, Florian Messerer, Katrin Baumgärtner, Shambhuraj Sawant, Joschka Bödecker, Moritz Diehl, Sebastien Gros: Synthesis of model predictive control and reinforcement learning: Survey and classification, Annual Reviews in Control, Volume 61, 2026

[2] Jonathan Frey, Katrin Baumgärtner, Gianluca Frison, Dirk Reinhardt, Jasper Hoffmann, Leonard Fichtner, Sebastien Gros, Moritz Diehl: Differentiable nonlinear model predictive control, arXiv preprint arXiv:2505.01353, 2025

[3] acados

[4] leap-c

Systems Control and Optimization Laboratory

IMTEK, Faculty of Engineering, University of Freiburg
