We introduce Iterated Integrated Attributions (IIA), a generic method for explaining the predictions of vision models. IIA employs iterative integration across the input image, the internal representations generated by the model, and their gradients, yielding precise and focused explanation maps. We demonstrate the effectiveness of IIA through comprehensive evaluations across various tasks, datasets, and network architectures. Our results show that IIA produces accurate explanation maps, outperforming other state-of-the-art explanation techniques.
We adopt the idea of integration from Integrated Gradients (IG) to create explanation maps that utilize both intermediate representations of the network and gradients. IIA differs from IG in three respects. First, IIA does not confine gradient computation to the input x. In fact, recent studies have suggested that gradients derived from internal activation maps can yield improved explanation maps. Second, IIA employs an iterated integral across multiple intermediate representations (such as activation or attention maps) generated during the network’s forward pass. This enables the iterative accumulation of gradients w.r.t. the representations of interest. Finally, unlike IG, IIA does not restrict the integrand to plain gradients, but encompasses a function of the entire set of representations produced by the network and their gradients.
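For reference, the plain IG integral that IIA builds on can be approximated with a Riemann sum along the straight path from a baseline to the input. The snippet below is a minimal sketch of standard IG only, not of IIA; `integrated_gradients`, `toy_model`, and `toy_grad` are hypothetical names, and the toy model F(x) = sum(x^2) is chosen only because its gradient is known in closed form.

```python
import numpy as np


def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Midpoint Riemann-sum approximation of Integrated Gradients.

    grad_fn(z) must return dF/dz with the same shape as z (assumed interface).
    """
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.zeros_like(x)
    for alpha in alphas:
        # Gradient evaluated at a point on the straight path baseline -> x.
        avg_grad += grad_fn(baseline + alpha * (x - baseline))
    avg_grad /= steps
    # Scale the path-averaged gradient by the input difference.
    return (x - baseline) * avg_grad


# Toy model (illustrative assumption): F(x) = sum(x^2), so dF/dx = 2x.
def toy_model(x):
    return float(np.sum(x ** 2))


def toy_grad(x):
    return 2.0 * x


x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)
attributions = integrated_gradients(toy_grad, x, baseline)
```

For this toy model the attributions sum to F(x) - F(baseline), which is the completeness property of IG. In practice `grad_fn` would be backed by an automatic-differentiation framework rather than a hand-written gradient.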
IIA is a generic approach that applies to both CNNs and ViTs. For CNNs, we use the activations of the network as the intermediate representations, while for ViTs we use the attention matrices.
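For the CNN case, one simple way to combine an activation tensor with its gradient is an element-wise (Hadamard) product reduced over channels. The following is a rough sketch under stated assumptions, not the IIA procedure itself: the function name `activation_gradient_map`, the ReLU, the channel sum, and the max-normalization are all illustrative choices, and the arrays are made-up values standing in for a real layer's output.

```python
import numpy as np


def activation_gradient_map(activations, gradients):
    """Combine activations and gradients, both shaped (channels, height, width)."""
    # Hadamard (element-wise) product of each activation with its gradient.
    product = activations * gradients
    # Keep positive contributions only and sum over the channel axis.
    heatmap = np.maximum(product, 0.0).sum(axis=0)
    # Scale to [0, 1] for display; leave an all-zero map untouched.
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap


# Made-up 2-channel, 2x2 example (values are illustrative only).
activations = np.array([[[1.0, 2.0], [0.0, 1.0]],
                        [[0.0, 1.0], [3.0, 1.0]]])
gradients = np.array([[[0.5, -1.0], [1.0, 0.0]],
                      [[1.0, 1.0], [1.0, -2.0]]])
heatmap = activation_gradient_map(activations, gradients)
```

The result is a single (height, width) map that could be upsampled to the input resolution and overlaid on the image.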
On CNNs, aggregating the activations is easily done with a Hadamard product. On ViTs, we must deal with multiple attention heads and blocks, so we use an approach called Gradient Rollout (based on the familiar Attention Rollout) to aggregate the attention information with its gradients.
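As background, plain Attention Rollout fuses the heads of each block, accounts for the residual connection by adding the identity, re-normalizes the rows, and multiplies the per-block matrices together. The sketch below shows only this baseline and does not reproduce Gradient Rollout, which additionally incorporates gradients; the name `attention_rollout`, head averaging as the fusion rule, and the random attention tensors are assumptions made for illustration.

```python
import numpy as np


def attention_rollout(attentions):
    """Plain Attention Rollout.

    attentions: list of arrays shaped (heads, tokens, tokens), ordered from
    the first block to the last, each row summing to 1.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for attn in attentions:
        fused = attn.mean(axis=0)                       # average over heads
        fused = fused + np.eye(num_tokens)              # residual connection
        fused = fused / fused.sum(axis=-1, keepdims=True)  # rows sum to 1 again
        rollout = fused @ rollout                       # compose with earlier blocks
    return rollout


# Random row-stochastic attention for 3 blocks, 4 heads, 5 tokens (illustrative).
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 5, 5))
weights = np.exp(logits)
weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
rollout = attention_rollout(list(weights))
```

If the first token is a class token (an assumption about the model), `rollout[0, 1:]` gives its aggregated attention to the remaining tokens, which can be reshaped into a patch grid.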
@InProceedings{Barkan_2023_ICCV,
author = {Barkan, Oren and Elisha, Yehonatan and Asher, Yuval and Eshel, Amit and Koenigstein, Noam},
title = {Visual Explanations via Iterated Integrated Attributions},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {2073-2084}
}