Authors:
Lucas David; Helio Pedrini and Zanoni Dias
Affiliation:
Institute of Computing, University of Campinas, Campinas, Brazil
Keyword(s):
Computer Vision, Multi-label, Explainable Artificial Intelligence.
Abstract:
The Class Activation Map (CAM) technique (and derivations thereof) has been broadly used in the literature to inspect the decision process of Convolutional Neural Networks (CNNs) in classification problems. However, most studies have focused on maximizing the coherence between the visualization map and the position, shape and size of a single object of interest, and little is known about the performance of visualization techniques in scenarios where multiple objects of different labels coexist. In this work, we conduct a series of tests that aim to evaluate the efficacy of CAM techniques over distinct multi-label sets. We find that techniques that were developed with single-label classification in mind (such as Grad-CAM, Grad-CAM++ and Score-CAM) will often produce diffuse visualization maps in multi-label scenarios, overstepping the boundaries of the objects they explain and spilling onto objects of different labels. We propose a generalization of the CAM technique, based on multi-label activation maximization/minimization, to create more accurate activation maps. Finally, we present a regularization strategy that encourages sparse positive weights in the classifying layer, producing cleaner activation maps and better multi-label classification scores.
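For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of the original CAM computation only, not of the generalization or the regularization strategy proposed in this work. It assumes a network whose last convolutional feature maps A_k feed a global average pooling layer followed by a linear classifier with weights w_{k,c}, so that the map for class c is M_c(i, j) = sum_k w_{k,c} A_k(i, j). The function names, the toy feature maps and the labels "dog" and "cat" are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List

Grid = List[List[float]]


def class_activation_map(features: List[Grid], weights: List[float]) -> Grid:
    """Baseline CAM for one class: M_c(i, j) = sum_k w_{k,c} * A_k(i, j).

    `features` holds K feature maps of shape (H, W); `weights` holds the K
    classifier weights that connect those maps to the class of interest.
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for w_k, a_k in zip(weights, features):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w_k * a_k[i][j]
    return cam


def normalize(cam: Grid) -> Grid:
    """Clip negative evidence to zero and scale the map to [0, 1] for display."""
    clipped = [[max(v, 0.0) for v in row] for row in cam]
    peak = max(max(row) for row in clipped)
    if peak == 0.0:
        return clipped
    return [[v / peak for v in row] for row in clipped]


# Toy input (hypothetical): two 2x2 feature maps and two labels in one image.
features = [
    [[1.0, 0.0], [0.0, 0.0]],  # map 0 fires in the top-left corner
    [[0.0, 0.0], [0.0, 2.0]],  # map 1 fires in the bottom-right corner
]
weights_per_class: Dict[str, List[float]] = {
    "dog": [1.0, -0.5],
    "cat": [0.0, 1.0],
}

maps = {
    label: normalize(class_activation_map(features, w))
    for label, w in weights_per_class.items()
}
```

In this toy case each label receives its own map from the same feature maps, which is the multi-label setting the abstract is concerned with: the negative weight for "dog" suppresses the region that supports "cat", so the two normalized maps do not overlap.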