Graph Laplacian-Improved Convolutional Residual Autoencoder for Unsupervised Human Action and Emotion Recognition

Beyan, Cigdem; Del Bue, Alessio
2022-01-01

Abstract

This paper presents an unsupervised feature learning approach based on 3D skeleton data for human action recognition and discrete human emotion recognition. Analyzing time series of skeleton data is effective for these tasks and better preserves the individual's privacy. Such methods also offer a viable alternative for emotion recognition applications, where most works use frontal or profile facial images that disclose the subject's appearance. Current unsupervised methods can encode the wide variety of contexts and the nature of the data, but often at the expense of higher model complexity or longer computation time. To lessen these shortcomings, this paper proposes a convolutional residual autoencoder that models the skeletal geometry across the temporal dynamics of the data without relying on computationally expensive recurrent architectures. The approach also applies Graph Laplacian Regularization, which leverages the implicit connectivity of the skeleton joints and further improves the robustness of the feature embeddings, learned without action or emotion labels. The method was validated on large-scale datasets that vary in domain, input skeleton data (e.g., the number of joints, adjacency matrices), and sensor technology. The results show its effectiveness: it notably surpasses the state-of-the-art unsupervised methods and also achieves better recognition scores than several fully supervised approaches. Extensive experimental analysis under various evaluation protocols shows that the proposed method yields higher-quality feature representations, even when trained with less data. The results also highlight the method's transferability across domains and its faster inference time.
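The abstract does not state the formula of the regularizer, so the following is only a minimal sketch under assumptions: it uses the common combinatorial Laplacian L = D - A and the smoothness penalty tr(F^T L F), which equals the sum of squared feature differences across connected joints. The 5-joint chain skeleton, the function names, and the feature shapes are hypothetical and are not taken from the paper.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected graph (assumed form)."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def laplacian_penalty(features, laplacian):
    """Smoothness term tr(F^T L F); features has shape (joints, channels)."""
    return float(np.trace(features.T @ laplacian @ features))


# Hypothetical 5-joint chain skeleton: 0-1-2-3-4 (not a real sensor layout).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
num_joints = 5
A = np.zeros((num_joints, num_joints))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = graph_laplacian(A)

# Identical features on every joint: no variation across edges.
smooth = np.ones((num_joints, 3))
# Features that grow linearly along the chain: each edge differs by (1, 1, 1).
rough = np.arange(num_joints, dtype=float).reshape(-1, 1) * np.ones((1, 3))

smooth_penalty = laplacian_penalty(smooth, L)
rough_penalty = laplacian_penalty(rough, L)

# The same quantity written as an explicit sum over edges.
edge_sum = sum(float(np.sum((rough[i] - rough[j]) ** 2)) for i, j in edges)
```

In a training loop, a term of this kind would typically be added to the reconstruction loss with a weighting factor, so that embeddings of connected joints are encouraged to stay close; how the authors weight or place the term is not specified in the abstract.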
Action recognition
autoencoder
emotion recognition
full-body movement
graph Laplacian
skeletal data
unsupervised feature learning
Files in this item:
File: IJ18_Graph_Laplacian-Improved_Convolutional.pdf
Access: open access
Type: Publisher's version
License: Public domain
Size: 3.03 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1121833