In state-of-the-art deep single-label classification models, the top-k (k = 2,3,4,....) accuracy is usually significantly higher than the top-1 accuracy. This is more evident in fine-grained datasets, where differences between classes are quite subtle. Exploiting the information provided in the top k predicted classes boosts the final prediction of a model. We propose Guided Zoom, a novel way in which explainabitity could be used to improve model performance. We do so by making sure the model has "the right reasons" fora prediction. The reason/evidence upon which a deep neural network makes a prediction is defined to be the grounding, in the pixel space, for a specific class conditional probability in the model output. Guided Zoom examines how reasonable the evidence used to make each of the top-k predictions is. Test time evidence is deemed reasonable if it is coherent with evidence used to make similar correct decisions at training time. This leads to better informed predictions. We explore a variety of grounding techniques and study their complementarity for computing evidence. We show that Guided Zoom results in an improvement of a model's classification accuracy and achieves state-of-the-art classification performance on four fine-grained classification datasets.

Guided Zoom: Zooming into Network Evidence to Refine Fine-Grained Model Decisions

Murino, V;
2021

Abstract

In state-of-the-art deep single-label classification models, the top-k (k = 2,3,4,....) accuracy is usually significantly higher than the top-1 accuracy. This is more evident in fine-grained datasets, where differences between classes are quite subtle. Exploiting the information provided in the top k predicted classes boosts the final prediction of a model. We propose Guided Zoom, a novel way in which explainabitity could be used to improve model performance. We do so by making sure the model has "the right reasons" fora prediction. The reason/evidence upon which a deep neural network makes a prediction is defined to be the grounding, in the pixel space, for a specific class conditional probability in the model output. Guided Zoom examines how reasonable the evidence used to make each of the top-k predictions is. Test time evidence is deemed reasonable if it is coherent with evidence used to make similar correct decisions at training time. This leads to better informed predictions. We explore a variety of grounding techniques and study their complementarity for computing evidence. We show that Guided Zoom results in an improvement of a model's classification accuracy and achieves state-of-the-art classification performance on four fine-grained classification datasets.
Explainable AI
grounding
saliency
fine-grained image classification
classification refinement
convolutional neural networks
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11562/1060737
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? 3
social impact