Automatic Recognition of Commensal Activities in Co-located and Online settings
Cigdem Beyan
2024-01-01
Abstract
Technological advancement has profoundly changed how people share meals, fostering research interest in new forms of commensality such as tele-dining and eating with artificial companions. Consequently, there is a need for computational methods that recognize commensal activities, that is, actions related to food consumption and social signals displayed during mealtime. This paper introduces the first dataset that consists of synchronized video data of co-located dining dyads. The dataset is annotated with key social signals, such as speaking activity and smiling, and with food-related activities, such as chewing and food intake. Unlike previous studies that use remote settings, this work emphasizes the differences between online and co-located setups. A set of machine learning experiments is conducted on our dataset and on existing datasets, reaching a best F-score of 0.82. The cross-dataset analysis between co-located and online datasets also reveals a significant disparity between these two settings. While mixing co-located and online recordings may increase the model’s generalizability, we notice strong differences between the two settings, highlighting the importance of in-person data recordings for accurate recognition.

File | Size | Format |
---|---|---|
IC33_Automatic Recognition of Commensal Activities.pdf (open access; type: post-print document; license: Creative Commons) | 1.58 MB | Adobe PDF |