Looking beyond appearances: Synthetic training data for deep CNNs in re-identification

Cristani, Marco
2018-01-01

Abstract

Re-identification is generally carried out by encoding the appearance of a subject in terms of outfit, which implicitly assumes scenarios where people do not change their attire. In this paper we overcome this restriction by proposing a framework based on a deep convolutional neural network, SOMAnet, that additionally models other discriminative aspects, namely structural attributes of the human figure (e.g. height, obesity, gender). Our method is unique in several respects. First, SOMAnet is based on the Inception architecture, departing from the usual siamese framework. This spares expensive data preparation (pairing images across cameras) and makes it possible to understand what the network has learned. Second, and most notably, the training data consists of a synthetic dataset of 100K instances, SOMAset, created with photorealistic human body generation software. SOMAset will be released under an open-source license to enable further developments in re-identification. Synthetic data represents a cost-effective way of acquiring semi-realistic imagery (full realism is usually not required in re-identification, since surveillance cameras capture low-resolution silhouettes), while at the same time providing complete control over the samples in terms of ground truth. It is therefore relatively easy to customize the data with respect to the surveillance scenario at hand, e.g. ethnicity. SOMAnet, trained on SOMAset and fine-tuned on recent re-identification benchmarks, matches subjects even with different apparel.
2018
Re-identification; Deep learning; Training set; Automated training dataset generation; Re-identification photorealistic dataset

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/976597
Citations
  • PMC: ND
  • Scopus: 103
  • ISI: 101