Objective Assessment of the Bias Introduced by Baseline Signals in XAI Attribution Methods
Dolci, Giorgio; Cruciani, Federica; Galazzo, Ilaria Boscolo; Menegaz, Gloria
2023-01-01
Abstract
This work is a first step towards a systematic analysis of how the choice of baseline signal affects baseline-dependent explainability methods for multi-modal and multi-dimensional data. It relies on single-input deep networks, with a view to generalization to multi-channel architectures. This point is critical for ensuring the soundness of the attribution values and enabling their subsequent validation through association studies. Two different CNNs were implemented to distinguish Alzheimer's disease patients from control subjects using structural Magnetic Resonance Imaging volumes and genetics data. The Integrated Gradients method was applied to both models for post-hoc attribution visualization with different baselines. In both modalities, the attribution maps differed from those obtained with the reference baseline, highlighting the importance of finding and using the 'optimal' baseline. We believe this work is highly relevant for the community in the framework of the validation of post-hoc XAI methods: it provides evidence that the choice of baseline affects the feature attribution values derived with the Integrated Gradients method, and hence the reliability of the outcomes, improving both users' awareness of the methods and their trust in them.
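To make the baseline dependence concrete, the following is a minimal sketch of Integrated Gradients on a toy logistic model. It is not the CNNs or the data used in this work. The weights `w`, the input `x`, the three baselines, and all function names are hypothetical choices for illustration only. The path integral is approximated with a midpoint Riemann sum.

```python
import numpy as np

def model(x, w):
    """Toy differentiable classifier (assumption): sigmoid of a linear score."""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

def model_grad(x, w):
    """Analytic gradient of the toy model with respect to the input x."""
    p = model(x, w)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, steps=200):
    """Approximate Integrated Gradients along the straight path baseline -> x."""
    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    # Interpolated inputs, one row per alpha.
    path = baseline + alphas[:, None] * (x - baseline)
    grads = np.array([model_grad(p, w) for p in path])
    # Average gradient along the path, scaled by the input difference.
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights and input, chosen only for illustration.
w = np.array([1.5, -2.0, 1.0])
x = np.array([1.0, 0.5, 2.0])

# Three candidate baselines (illustrative, not those of the paper).
baselines = {
    "zeros": np.zeros_like(x),
    "ones": np.ones_like(x),
    "mean": np.full_like(x, x.mean()),
}

attributions = {
    name: integrated_gradients(x, b, w) for name, b in baselines.items()
}

for name, attr in attributions.items():
    # Completeness: attributions sum to model(x) - model(baseline).
    gap = model(x, w) - model(baselines[name], w)
    print(name, attr, attr.sum(), gap)
```

In this toy setting the same input receives different attribution vectors under each baseline. A feature whose value equals its baseline value receives zero attribution regardless of its weight, which is one way a baseline can bias the resulting map.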