RESEARCH PRODUCTS CATALOGUE

Diffusion-Based Unsupervised Pre-training for Automated Recognition of Vitality Forms

Noemi Canovi;Federico Montagna;Radoslaw Niewiadomski;Alessandra Sciutti;Giuseppe Di Cesare;Cigdem Beyan

2024-01-01

Abstract

Social communication involves interpreting nonverbal behaviors and detecting and anticipating others’ actions and intentions. Actions convey not only the goal and motor intention but also the form, i.e., variations in action execution. These variations, termed vitality forms, communicate attitudes during interactions, such as being gentle, calm, vigorous, or rude. Automatic vitality form recognition may have several applications in social robotics, social skills training, and therapy, yet it remains a rarely studied topic. This paper introduces an unsupervised pre-training approach that takes 2D body keypoint trajectories as input and employs diffusion models to derive more effective features for representing these trajectories. The features learned by the diffusion model’s encoder are used to train a multilayer perceptron for vitality form recognition. Experimental analysis shows the superior performance of the proposed method not only across various videos but also for action classes not encountered during training.

	Year
	
				2024
			
	Keywords
	
				Vitality forms, nonverbal communication, unsupervised pre-training, diffusion models, autoencoders, gestures, actions, trajectory
			
	Appears in types:
	
				04.01 Contribution in conference proceedings

Files in this product:

File	Size	Format
IC30_Diffusion Based Unsupervised Pretraining.pdf (open access; type: post-print document; license: Creative Commons)	1.99 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1125909
