AL-GTD: Deep Active Learning for Gaze Target Detection

Cigdem Beyan
2024-01-01

Abstract

Gaze target detection aims to determine the image location a person is looking at. While existing studies have made significant progress by regressing accurate gaze heatmaps, these achievements largely rely on extensive labeled datasets, whose annotation demands substantial human labor. In this paper, our goal is to reduce the reliance on the size of the labeled training data for gaze target detection. To this end, we propose AL-GTD, an approach that integrates supervised and self-supervised losses within a novel sample acquisition function to perform active learning (AL). It also uses pseudo-labeling to mitigate distribution shifts during training. AL-GTD achieves the best AUC results using only 40-50% of the training data, whereas state-of-the-art (SOTA) gaze target detectors require the entire training dataset to reach the same performance. Notably, AL-GTD quickly reaches satisfactory performance with only 10-20% of the training data, demonstrating that our acquisition function acquires the most informative samples. We provide a comprehensive experimental analysis by adapting several AL methods to the task. AL-GTD outperforms these AL competitors and also surpasses SOTA gaze target detectors when all are trained in a low-data regime. Code is available at https://github.com/francescotonini/al-gtd.
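For intuition, below is a minimal, hypothetical sketch of how an acquisition function of this kind might combine a supervised-loss proxy (heatmap uncertainty) with a self-supervised consistency term, plus confidence-filtered pseudo-labeling. It assumes PyTorch; ToyGazeNet, acquisition_score, select_batch, and pseudo_label are illustrative names and do not reproduce the actual AL-GTD implementation (see the linked repository for that).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyGazeNet(nn.Module):
        """Stand-in gaze model: maps a scene image and a head crop to a
        gaze heatmap. Purely illustrative, not the AL-GTD architecture."""
        def __init__(self):
            super().__init__()
            self.scene = nn.Conv2d(3, 8, 3, padding=1)
            self.head = nn.Conv2d(3, 8, 3, padding=1)
            self.out = nn.Conv2d(16, 1, 1)

        def forward(self, image, head_crop):
            h = F.interpolate(self.head(head_crop), size=image.shape[-2:])
            return self.out(torch.cat([self.scene(image), h], dim=1)).squeeze(1)

    def acquisition_score(model, image, head_crop):
        """Score an unlabeled sample by combining a supervised-loss proxy
        with a self-supervised consistency term (assumed formulation)."""
        model.eval()
        with torch.no_grad():
            heatmap = model(image, head_crop)  # (1, H, W) gaze heatmap
            # Supervised proxy: entropy of the predicted heatmap; a
            # diffuse heatmap suggests the model is uncertain.
            p = F.softmax(heatmap.flatten(), dim=0)
            entropy = -(p * (p + 1e-12).log()).sum()
            # Self-supervised term: prediction consistency under a
            # horizontal flip of the inputs.
            flipped = model(torch.flip(image, dims=[-1]),
                            torch.flip(head_crop, dims=[-1]))
            consistency = F.mse_loss(torch.flip(flipped, dims=[-1]), heatmap)
        # Higher score = more informative sample to annotate next.
        return entropy + consistency

    def select_batch(model, unlabeled_samples, k=32):
        """Pick the k highest-scoring unlabeled samples for annotation."""
        scored = [(acquisition_score(model, img, head).item(), i)
                  for i, (img, head) in enumerate(unlabeled_samples)]
        scored.sort(reverse=True)
        return [i for _, i in scored[:k]]

    def pseudo_label(model, image, head_crop, threshold=0.8):
        """Keep a prediction as a soft pseudo-label only when its peak is
        confident, to limit distribution shift between AL rounds."""
        with torch.no_grad():
            heatmap = model(image, head_crop)
        return heatmap if heatmap.sigmoid().max() >= threshold else None

    if __name__ == "__main__":
        model = ToyGazeNet()
        img, head = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 32, 32)
        print(acquisition_score(model, img, head).item())

The key design idea the sketch tries to capture is that the self-supervised term needs no labels, so informativeness can be estimated over the entire unlabeled pool before any annotation budget is spent.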
Keywords: Gaze target detection, active learning, social signals, human-human interaction, multimodal data
Files in this record:
IC32_AL GTD Deep Active Learning for Gaze Target Detection.pdf (post-print, open access, Creative Commons license, Adobe PDF, 4.16 MB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1132651