Exploring 3D Human Pose Estimation and Forecasting from the Robot’s Perspective: The HARPER Dataset
Avogaro, Andrea; Toaiari, Andrea; Cunico, Federico; Vinciarelli, Alessandro; Cristani, Marco
2024-01-01
Abstract
We introduce HARPER, a novel dataset for 3D body pose estimation and forecasting in dyadic interactions between users and Spot, the quadruped robot manufactured by Boston Dynamics. The key novelty of HARPER is its focus on the robot’s perspective, i.e., on the data captured by the robot’s sensors. This makes 3D body pose analysis challenging because the sensors sit close to the ground and capture humans only partially. The scenario underlying HARPER includes 15 actions, of which 10 involve physical contact between the robot and users. The corpus contains recordings not only from Spot’s built-in stereo cameras but also from a 6-camera OptiTrack system, with all recordings synchronized. This setup yields ground-truth skeletal representations with sub-millimeter precision. Additionally, the corpus includes reproducible benchmarks for 3D Human Pose Estimation, Human Pose Forecasting, and Collision Prediction, all based on publicly available baseline approaches. This enables future HARPER users to rigorously compare their results with those provided in this work.