the proposed method, has more than twice as many
concept detectors in the scene category as the basis
suggested by UIBE. Additionally, its concept detectors
detect 25 more scene concepts than the detectors of
UIBE. For a network that is trained to perform scene
classification, this suggestion looks highly plausible.
An even more prominent result concerns ResNet50.
Compared to UIBE, the proposed method suggests a basis
with 10 times more concept detectors for concepts in
the action category. Additionally, the concept detectors
of this basis detect 47 more action concepts than the
detectors of UIBE.
This suggestion aligns better with the goal of a net-
work that is trained to perform action recognition.
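The comparisons above rest on two different quantities per concept category: how many basis directions act as concept detectors, and how many distinct concepts those detectors cover. The following is a minimal sketch of how the two could be tallied, assuming each detector has already been assigned a concept label and a category. The data, the tuple layout, and the helper names `summarize` and `compare` are hypothetical and serve only as illustration; they are not results or code from this work.

```python
from collections import defaultdict

# Hypothetical toy data: (basis direction index, concept name, concept category).
# These labels are illustrative only and are NOT results from the paper.
baseline_detectors = [
    (0, "kitchen", "scene"),
    (1, "kitchen", "scene"),
    (2, "dog", "object"),
]
proposed_detectors = [
    (0, "kitchen", "scene"),
    (1, "beach", "scene"),
    (2, "street", "scene"),
    (3, "beach", "scene"),
    (4, "dog", "object"),
]


def summarize(detectors):
    """Return {category: (number of detectors, number of distinct concepts)}."""
    counts = defaultdict(int)
    concepts = defaultdict(set)
    for _, concept, category in detectors:
        counts[category] += 1
        concepts[category].add(concept)
    return {cat: (counts[cat], len(concepts[cat])) for cat in counts}


def compare(baseline, proposed, category):
    """Return (detector-count ratio, difference in distinct concepts) for a category."""
    b_n, b_c = summarize(baseline).get(category, (0, 0))
    p_n, p_c = summarize(proposed).get(category, (0, 0))
    # Assumption: a category with no baseline detectors yields an infinite ratio.
    ratio = p_n / b_n if b_n else float("inf")
    return ratio, p_c - b_c


ratio, extra = compare(baseline_detectors, proposed_detectors, "scene")
print(f"scene detectors ratio: {ratio:.1f}x, additional scene concepts: {extra}")
```

Keeping the two numbers separate matters because several detectors may respond to the same concept, so a larger detector count does not by itself imply broader concept coverage.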
In this work we proposed to complement previous
work (UIBE) with a novel loss term that exploits
the knowledge encoded in CNN image classifiers and
suggests more interpretable bases. The proposed
method demonstrates interpretability improvements of
up to 45.8% in the extracted bases when using optimal
hyper-parameters that were suggested for learning
a basis for a different classifier trained on
another task. Future work may study applications of
the proposed method to debug and improve model
This work has been supported by the EC-funded Horizon
Europe Framework Programme: CAVAA Grant
Agreement 101071178.
