Surgical action recognition and temporal segmentation is a building block needed to provide some degree of autonomy to surgical robots. In this paper, we present a deep learning model that relies on videos and kinematic data to output, in real time, the current action in a surgical procedure. The proposed neural network architecture is composed of two sub-networks: a Spatial-Kinematic Network, which produces high-level features by processing images and kinematic data, and a Temporal Convolutional Network, which filters such features temporally over a sliding window to stabilize their changes over time. Since we are interested in applications to real-time supervisory control of robots, we focus on an efficient and causal implementation, i.e., the prediction at sample k depends only on previous observations. We tested our causal architecture on the publicly available JIGSAWS dataset, outperforming comparable state-of-the-art non-causal algorithms by up to 8.6% in the edit score.
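The causality constraint described above (the prediction at sample k depends only on observations up to k) is the defining property of a causal temporal convolution. The following is a minimal NumPy sketch of that property, not the authors' implementation: `causal_conv1d` is a hypothetical helper that left-pads the input so each output sample is a weighted sum of past samples only.

```python
import numpy as np

def causal_conv1d(x, w):
    """Causal 1-D convolution: output at time k uses only x[0..k].

    x: (T,) feature sequence; w: (K,) kernel taps over the past.
    Left-pads with zeros so the output has the same length as the input.
    """
    K = len(w)
    xp = np.concatenate([np.zeros(K - 1), x])  # pad the past only
    return np.array([xp[k:k + K] @ w[::-1] for k in range(len(x))])

# Demo: perturbing future samples leaves earlier predictions unchanged,
# as required for on-line (real-time) action segmentation.
rng = np.random.default_rng(0)
x = rng.standard_normal(16)
w = rng.standard_normal(3)
y1 = causal_conv1d(x, w)

x_future = x.copy()
x_future[10:] += 5.0             # change only samples k >= 10
y2 = causal_conv1d(x_future, w)

assert np.allclose(y1[:10], y2[:10])  # outputs before k = 10 are identical
```

A non-causal (centered) convolution would fail this check, since its output at sample k would also depend on samples after k, which are unavailable in a real-time setting.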
|Title:||A Multi-Modal Learning System for On-Line Surgical Action Segmentation|
|Publication date:||In press|
|Appears in the categories:||04.01 Contribution in conference proceedings|