Passive Reinforcement Learning with Optimal Control for Safe Convergence in Cyber–physical Systems
Piccinelli, Nicola; Meli, Daniele; Muradore, Riccardo
2026-01-01
Abstract
Reinforcement Learning (RL) can compute optimal strategies for accomplishing difficult tasks in complex scenarios. However, most RL algorithms do not provide safety and performance guarantees during the deployment phase. This is a critical drawback when RL is applied to cyber–physical systems such as robotic manipulators, where the goal is to always converge safely to a desired goal or equilibrium state. Specifically, one fundamental safety requirement for robotic systems is closed-loop L2-stability, for which passivity is a sufficient condition. This paper proposes a novel switched RL control scheme for robotic systems with passivity and asymptotic stability guarantees. The scheme combines RL over constrained Markov decision processes, for passive training and inference, with Linear Quadratic Regulation (LQR) for asymptotic convergence to the desired equilibrium point. During RL training, the energy stored in the system is monitored via the virtual energy tank approach to train a cost critic function. During inference, the virtual energy tank modulates the command input to guarantee passivity. Finally, the reward of the RL agent is designed from the Lyapunov function associated with the LQR controller, so as to steer the system state towards the LQR basin of attraction, where a switching mechanism is triggered to guarantee asymptotic convergence. We compare our methodology with a model-based controller and with other RL and model-based architectures, applied to a paradigmatic under-actuated cart–pole system, an instance of a 2-DOF robotic manipulator, both in simulation and on a real setup. We also test the generality of our approach with an experiment on a 6-DOF manipulator in simulation. The experimental validation shows that our methodology performs better in training and inference, even in the presence of plant modelling errors, while guaranteeing passivity and safety under large disruptive disturbances.
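To make the energy-tank mechanism mentioned in the abstract concrete, the following is a minimal Python sketch of a virtual energy tank that modulates a policy's command input to preserve passivity. It illustrates the general technique only, not the authors' implementation; the constants TANK_INIT and T_MIN, the class name EnergyTank, and the scaling rule are assumptions for illustration.

```python
# Illustrative sketch (not the paper's code) of a virtual energy tank
# that scales an RL command so the controller never injects more energy
# than the tank's budget allows, preserving passivity.
import numpy as np

TANK_INIT = 5.0   # initial tank energy [J] (assumed value)
T_MIN = 0.1       # minimum admissible tank energy [J] (assumed value)

class EnergyTank:
    """Tracks an energy budget; energy leaves the tank whenever the
    commanded input injects power into the plant (u^T v > 0)."""

    def __init__(self, energy=TANK_INIT):
        self.energy = energy

    def modulate(self, u, v, dt):
        """Scale the command u so the energy drawn over the step dt never
        empties the tank below T_MIN; v is the plant flow (e.g. velocity)."""
        power = float(np.dot(u, v))
        if power <= 0.0:
            # Passive action: the plant returns energy, which refills the tank.
            self.energy -= power * dt
            return u
        budget = max(self.energy - T_MIN, 0.0)
        alpha = min(1.0, budget / (power * dt))
        self.energy -= alpha * power * dt
        return alpha * u
```

In use, each control step would pass the RL action and the measured joint velocities through the tank, e.g. `u_safe = tank.modulate(u_rl, qdot, dt)`, so that the command reaching the plant is attenuated exactly when the energy budget would otherwise be violated.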
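The Lyapunov-based reward and the switching rule can likewise be sketched. Below, the LQR gain and the quadratic Lyapunov function V(x) = xᵀPx are obtained from the continuous algebraic Riccati equation; the plant matrices A, B, the weights Q, R, and the level-set threshold V_SWITCH are placeholders, since the paper's actual models and tuning are not reproduced here.

```python
# Hedged sketch of the LQR branch and the switching mechanism.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # linearized plant (assumed)
B = np.array([[0.0], [1.0]])            # input matrix (assumed)
Q = np.eye(2)                           # state weight (assumed)
R = np.eye(1)                           # input weight (assumed)

P = solve_continuous_are(A, B, Q, R)    # Riccati solution
K = np.linalg.inv(R) @ B.T @ P          # LQR gain

V_SWITCH = 0.5  # level set approximating the basin of attraction (assumed)

def lyapunov(x):
    """Quadratic Lyapunov function V(x) = x^T P x associated with LQR."""
    return float(x @ P @ x)

def reward(x):
    """Reward shaping that steers the state toward the LQR basin of
    attraction: lower V(x) means higher reward."""
    return -lyapunov(x)

def controller(x, rl_policy):
    """Switch from the RL policy to LQR once V(x) enters the level set,
    where LQR guarantees asymptotic convergence to the equilibrium."""
    if lyapunov(x) < V_SWITCH:
        return -K @ x
    return rl_policy(x)
```

The design choice reflected here is that the same quadratic form P serves double duty: it shapes the RL reward during learning and defines the switching surface at inference, so the policy is explicitly trained to drive the state into the region where the LQR guarantee takes over.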
| File | Type | License | Access | Size | Format |
|---|---|---|---|---|---|
| 1-s2.0-S0921889025003902-main.pdf | Publisher's version | Creative Commons | Open access | 2.71 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.