Multivariate sensor signals collected by aquatic drones involved in water monitoring: A complete dataset

Castellini, Alberto; Bloisi, Domenico; Blum, Jason; Masillo, Francesco; Farinelli, Alessandro

doi:10.1016/j.dib.2020.105436

Sensor data generated by intelligent systems, such as autonomous robots, smart buildings and other systems based on artificial intelligence, are valuable sources of knowledge in today’s data-driven society, since they contain information about the situations these systems face during their operation. These data are usually multivariate time series, since modern technologies enable the simultaneous acquisition of multiple signals over long periods of time. In this paper we present a dataset containing sensor traces from six data acquisition campaigns performed by autonomous aquatic drones involved in water monitoring. A total of 5.6 hours of navigation is available, with data coming from both lakes and rivers, and from different locations in Italy and Spain. The monitored variables concern both the internal state of the drone (e.g., battery voltage, GPS position and signals to propellers) and the state of the water (e.g., temperature, dissolved oxygen and electrical conductivity). Data were collected in the context of the EU-funded Horizon 2020 project INTCATCH (http://www.intcatch.eu), which aims to develop a new paradigm for monitoring the water quality of catchments. The aquatic drones used for data acquisition are Platypus Lutra boats. Both autonomous and manual driving modes are used in different parts of the navigation. The dataset is analyzed in the paper “Time series segmentation for state-model generation of autonomous aquatic drones: A systematic framework” [1] by means of recent time series clustering/segmentation techniques to extract data-driven models of the situations faced by the drones in the data acquisition campaigns. These data have strong potential for reuse in other kinds of data analysis and in the evaluation of machine learning methods on real-world datasets [2].
Moreover, we consider this dataset valuable for the variety of situations faced by the drone, from which machine learning techniques can learn behavioural patterns or detect anomalous activities. We also provide manual labels for some known states of the drones (drone inside/outside the water, upstream/downstream navigation, manual/autonomous drive, and drone turning), which serve as ground truth for validation purposes. Finally, the real-world nature of the dataset makes it more challenging for machine learning methods, because it contains noisy samples collected while the drone was exposed to weather and uncertain water flow conditions.
