No Adversaries to Zero-Shot Learning: Distilling an Ensemble of Gaussian Feature Generators

Vittorio Murino;Alessio Del Bue
2023-01-01

Abstract

In zero-shot learning (ZSL), the task of recognizing unseen categories for which no training data is available, state-of-the-art methods generate visual features from auxiliary semantic information (e.g., attributes). In this work, we propose a simpler yet better-scoring alternative for the same task. We observe that, if the first- and second-order statistics of the classes to be recognized were known, sampling from Gaussian distributions would synthesize visual features that are almost identical to the real ones for classification purposes. We propose a novel mathematical framework to estimate first- and second-order statistics, even for unseen classes: our framework builds upon prior compatibility functions for ZSL and does not require additional training. Given these statistics, we use a pool of class-specific Gaussian distributions to solve the feature generation stage through sampling. We then use an ensemble mechanism to aggregate a pool of softmax classifiers, each trained in a one-seen-class-out fashion, to better balance performance over seen and unseen classes. Finally, neural distillation fuses the ensemble into a single architecture that performs inference in a single forward pass. Our method, termed Distilled Ensemble of Gaussian Generators, scores favorably with respect to state-of-the-art works.
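
The sampling step described in the abstract can be illustrated with a minimal sketch. It assumes that per-class means and covariances are already available. The paper's framework for estimating them is not reproduced here, so the toy statistics, the function name `sample_class_features`, and its parameters are hypothetical and only show how class-specific Gaussians could produce labeled synthetic features.

```python
import numpy as np


def sample_class_features(means, covs, n_per_class, seed=0):
    """Draw synthetic features from one Gaussian per class.

    means: array of shape (n_classes, dim), assumed first-order statistics.
    covs:  array of shape (n_classes, dim, dim), assumed second-order statistics.
    Returns (features, labels) with n_classes * n_per_class rows.
    """
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for c, (mu, sigma) in enumerate(zip(means, covs)):
        # Each class contributes samples from its own Gaussian N(mu, sigma).
        feats.append(rng.multivariate_normal(mu, sigma, size=n_per_class))
        labels.append(np.full(n_per_class, c))
    return np.concatenate(feats), np.concatenate(labels)


# Toy statistics for three classes in a 2-D feature space (illustrative only).
means = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
covs = np.stack([0.5 * np.eye(2) for _ in means])

X, y = sample_class_features(means, covs, n_per_class=100)
```

The resulting `X` and `y` could then be used to train any standard classifier, such as the softmax classifiers the abstract mentions. The ensemble and distillation stages are outside the scope of this sketch.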
Model ensemble
feature generation
inductive (generalized) zero-shot learning
neural distillation
object recognition
Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1122726