Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation

Marchesini, Enrico; Corsi, Davide; Farinelli, Alessandro
2021-01-01

Abstract

We propose a novel benchmark environment for Safe Reinforcement Learning focused on aquatic navigation. Aquatic navigation is an extremely challenging task because the environment is non-stationary and the robotic platform is subject to uncertainties. It is therefore crucial to address the safety of the problem by analyzing the behavior of the trained network to avoid dangerous situations (e.g., collisions). To this end, we consider value-based and policy-gradient Deep Reinforcement Learning (DRL) algorithms, and we propose a crossover-based strategy that combines gradient-based and gradient-free DRL to improve sample efficiency. Moreover, we propose a verification strategy based on interval analysis that checks the behavior of the trained models against a set of desired properties. Our results show that crossover-based training outperforms prior DRL approaches, while our verification quantifies the number of configurations that violate the behaviors described by the properties. Crucially, the environment is intended to serve as a benchmark for future research in this application domain.
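
The abstract does not spell out how the interval-based verification works, so the following is only a minimal, generic sketch of interval bound propagation through a small ReLU network, not the authors' method. All function names, the toy weights, and the example property ("action 0 is never selected in this input region") are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
Layer = Tuple[Sequence[Sequence[float]], Sequence[float]]  # (weights, biases)


def affine_bounds(weights, biases, inputs: Sequence[Interval]) -> List[Interval]:
    """Bound y = W x + b when each x_i lies in its interval."""
    outputs = []
    for row, bias in zip(weights, biases):
        lo = hi = bias
        for w, (x_lo, x_hi) in zip(row, inputs):
            if w >= 0:
                lo += w * x_lo
                hi += w * x_hi
            else:
                # A negative weight swaps which endpoint gives the minimum.
                lo += w * x_hi
                hi += w * x_lo
        outputs.append((lo, hi))
    return outputs


def relu_bounds(intervals: Sequence[Interval]) -> List[Interval]:
    """ReLU is monotone, so it can be applied to both endpoints."""
    return [(max(0.0, lo), max(0.0, hi)) for lo, hi in intervals]


def network_bounds(layers: Sequence[Layer], inputs: Sequence[Interval]) -> List[Interval]:
    """Propagate an input box through the network (ReLU on hidden layers only)."""
    bounds = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        bounds = affine_bounds(weights, biases, bounds)
        if index < len(layers) - 1:
            bounds = relu_bounds(bounds)
    return bounds


def action_never_selected(output_bounds: Sequence[Interval], action: int) -> bool:
    """Sound but incomplete check: True means the property is proven."""
    action_hi = output_bounds[action][1]
    return any(
        lo > action_hi
        for other, (lo, _) in enumerate(output_bounds)
        if other != action
    )


# Hypothetical toy network: 2 inputs, 2 hidden units, 2 action values.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [-1.0, 1.0]], [0.0, 0.5]),
]
# Hypothetical input region, e.g. a range of normalized sensor readings.
region = [(0.0, 0.2), (0.8, 1.0)]

bounds = network_bounds(layers, region)
print(bounds)
print(action_never_selected(bounds, action=0))
```

Because interval bounds over-approximate the reachable outputs, a `True` result proves the property over the whole region, while a `False` result is inconclusive. In that case a verifier would typically split the region into smaller boxes and repeat the check, which is one way to count how many configurations violate a property.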
ISBN: 978-1-6654-1714-3
Keywords: Deep Reinforcement Learning; Safe Reinforcement Learning

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1060942