Preserving data privacy and accuracy of human pose estimation software based on CNNs for remote gait analysis
Enrico Martini; Michele Boldo; Stefano Aldegheri; Nicola Valè; Mirko Filippetti; Nicola Smania; Matteo Bertucco; Alessandro Picelli; Nicola Bombieri
2022-01-01
Abstract
In recent years there have been significant improvements in the accuracy of real-time 3D skeletal data estimation software. These applications, based on convolutional neural networks (CNNs), can play a key role in a variety of clinical scenarios, from gait analysis to medical diagnosis. One of the main challenges is to apply such intelligent video analytics at a distance, which requires the system to guarantee data privacy in addition to accuracy. To satisfy privacy by default and by design, the software has to run on "edge" computing devices, where the sensitive information (i.e., the video stream) is processed close to the camera and only the processing results are stored or sent over the communication network. In this paper we address this challenge by evaluating the accuracy of state-of-the-art software for human pose estimation when run "at the edge". We show how the most accurate platforms for pose estimation, based on complex and deep neural networks, can become inaccurate when run on resource-constrained edge devices because the input video frames must be subsampled. In contrast, we show that, starting from less accurate, "lighter" CNNs and enhancing the pose estimation software with filters and interpolation primitives, the platform achieves better real-time performance and higher accuracy, with a deviation below the error tolerance of a marker-based motion capture system.