Upper-Body Pose-Based Gaze Estimation for Privacy-Preserving 3D Gaze Target Detection

Andrea Toaiari; Vittorio Murino; Marco Cristani; Cigdem Beyan
2025-01-01

Abstract

Gaze Target Detection (GTD), i.e., determining where a person is looking within a scene from an external viewpoint, is a challenging task, particularly in 3D space. Existing approaches heavily rely on analyzing the person’s appearance, primarily focusing on their face to predict the gaze target. This paper presents a novel approach to tackle this problem by utilizing the person’s upper-body pose and available depth maps to extract a 3D gaze direction and employing a multi-stage or an end-to-end pipeline to predict the gazed target. When predicted accurately, the human body pose can provide valuable information about the head pose, which is a good approximation of the gaze direction, as well as the position of the arms and hands, which are linked to the activity the person is performing and the objects they are likely focusing on. Consequently, in addition to performing gaze estimation in 3D, we are also able to perform GTD simultaneously. We demonstrate state-of-the-art results on the most comprehensive publicly accessible 3D gaze target detection dataset without requiring images of the person’s face, thus promoting privacy preservation in various application contexts. The code is available at https://github.com/intelligolabs/privacy-gtd-3D.
2025
3D gaze target detection, gaze estimation, human pose estimation, upper-body pose, depth map, multimodal, privacy-preserving
Files in this record:
File: 978-3-031-91575-8_22.pdf (not available)
Type: Post-print document
License: Publisher's copyright
Size: 1.84 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1163287