Visual Explanations via Iterated Integrated Attributions

Oren Barkan*¹, Yehonatan Elisha*¹, Yuval Asher², Amit Eshel², Noam Koenigstein²

¹The Open University, ²Tel Aviv University
ICCV 2023
*Indicates equal contribution

Illustration of the integration process of IIA.

Abstract

We introduce Iterated Integrated Attributions (IIA) - a generic method for explaining the predictions of vision models. IIA employs iterative integration across the input image, the internal representations generated by the model, and their gradients, yielding precise and focused explanation maps. We demonstrate the effectiveness of IIA through comprehensive evaluations across various tasks, datasets, and network architectures. Our results showcase that IIA produces accurate explanation maps, outperforming other state-of-the-art explanation techniques.

How does it work?

We embrace the idea of integration from Integrated Gradients (IG) to create explanation maps that utilize both intermediate representations of the network and gradients. IIA diverges from IG in several aspects. First, IIA does not confine gradient computation to the input x. In fact, recent studies have suggested that gradients derived from internal activation maps can yield improved explanation maps. Second, IIA employs an iterated integral across multiple intermediate representations (such as activation or attention maps) generated during the network’s forward pass. This enables the iterative accumulation of gradients w.r.t. the representations of interest. Lastly, unlike IG, IIA does not restrict the integrand to plain gradients, but encompasses a function of the entire set of representations produced by the network and their gradients.
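For readers unfamiliar with the IG baseline that IIA builds on, the following is a minimal sketch of plain Integrated Gradients on a toy function. It is not IIA itself: it integrates only input gradients along a straight path from a baseline, which is exactly the restriction IIA lifts. The toy model `f`, its weights `W`, the zero baseline, and the midpoint Riemann sum are all assumptions chosen for illustration.

```python
from typing import Callable, List

Vector = List[float]


def integrated_gradients(
    grad_f: Callable[[Vector], Vector],
    x: Vector,
    baseline: Vector,
    steps: int = 50,
) -> Vector:
    """Approximate IG with a midpoint Riemann sum along baseline -> x."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_f(point)):
            avg_grad[i] += g / steps
    # IG scales the path-averaged gradient by the input difference.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


# Hypothetical toy model: f(x) = (w . x)^2 with an analytic gradient.
W = [0.5, -1.0, 2.0]


def f(x: Vector) -> float:
    return sum(w * xi for w, xi in zip(W, x)) ** 2


def grad_f(x: Vector) -> Vector:
    s = sum(w * xi for w, xi in zip(W, x))
    return [2.0 * s * w for w in W]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)
```

A useful sanity check on any IG implementation is the completeness property: the attributions should sum to `f(x) - f(baseline)`, up to numerical error from the discretized integral.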

IIA is a generic approach that applies to both CNNs and ViTs. For CNNs, we use the activations of the network as the intermediate representations, while for ViTs we use the attention matrices.

While aggregating the activations on CNNs is easily done with a Hadamard product, ViTs involve multiple attention heads and blocks. We therefore use an approach called Gradient Rollout (based on the familiar Attention Rollout) to aggregate the attention information with its gradients.
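The two aggregation styles mentioned above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the authors' implementation: the function names `hadamard_map` and `gradient_rollout` are hypothetical, and the specific choices (summing over channels, clipping negatives, averaging over heads, adding the identity for residual connections, row normalization as in Attention Rollout) are assumptions that may differ from the paper.

```python
from typing import List

Matrix = List[List[float]]


def hadamard_map(activations: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """CNN case: elementwise (Hadamard) product of activations and gradients.

    Inputs are indexed [channel][row][col]. Assumption: sum the product
    over channels and clip negative values to zero.
    """
    rows, cols = len(activations[0]), len(activations[0][0])
    return [
        [
            max(sum(a[i][j] * g[i][j] for a, g in zip(activations, gradients)), 0.0)
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def gradient_rollout(
    attentions: List[List[Matrix]], gradients: List[List[Matrix]]
) -> Matrix:
    """ViT case: rollout over blocks of gradient-weighted attention.

    Inputs are indexed [block][head][token][token].
    """
    n = len(attentions[0][0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn_block, grad_block in zip(attentions, gradients):
        heads = len(attn_block)
        # Assumption: clip each head's attention * gradient at zero, then average heads.
        fused = [
            [
                sum(max(a[i][j] * g[i][j], 0.0) for a, g in zip(attn_block, grad_block)) / heads
                for j in range(n)
            ]
            for i in range(n)
        ]
        # Add the identity to account for residual connections, then row-normalize.
        fused = [[fused[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
        fused = [[v / sum(row) for v in row] for row in fused]
        rollout = matmul(fused, rollout)
    return rollout


# Toy inputs (made up): one channel of 2x2 activations, two blocks with one head each.
acts = [[[1.0, 2.0], [3.0, 4.0]]]
grads = [[[1.0, -1.0], [0.5, 0.0]]]
cam = hadamard_map(acts, grads)

attn = [[[[0.5, 0.5], [0.2, 0.8]]], [[[0.9, 0.1], [0.4, 0.6]]]]
attn_grads = [[[[1.0, 1.0], [1.0, 1.0]]], [[[1.0, 1.0], [1.0, 1.0]]]]
rollout = gradient_rollout(attn, attn_grads)
```

In a real model the activations, attention matrices, and their gradients would come from forward and backward hooks in a deep learning framework; plain nested lists are used here only to keep the sketch dependency-free.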

Qualitative Results: Explanation maps produced using ConvNext w.r.t. the classes (top to bottom): ‘accordion, piano accordion, squeeze box’, ‘warthog’, ‘alp’, and ‘trombone’.

Qualitative Results: Explanation maps produced using ViT-B w.r.t. the classes (top to bottom): ‘spoonbill’, ‘cello, violoncello’, ‘bucket, pail’, ‘snowmobile’, and ‘tiger shark’.

Ablation Study: Explanation maps produced using RN (rows 1,2) and ViT (rows 3,4) w.r.t. the classes (top to bottom): ‘bighorn, bighorn sheep, cimarron, Rocky Mountain bighorn, Rocky Mountain sheep, Ovis canadensis’, ‘Irish terrier’, ‘alp’, and ‘Egyptian cat’.

BibTeX

@InProceedings{Barkan_2023_ICCV,
    author    = {Barkan, Oren and Elisha, Yehonatan and Asher, Yuval and Eshel, Amit and Koenigstein, Noam},
    title     = {Visual Explanations via Iterated Integrated Attributions},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {2073-2084}
}