Authors:
Alexandre Lambert 1,2,3; Aakash Soni 1; Assia Soukane 1; Amar Ramdane Cherif 2 and Arnaud Rabat 3
Affiliations:
1 LyRIDS, ECE Research Center Paris, France; 2 LISV Laboratory, Université de Versailles, Paris Saclay, Velizy, France; 3 Unité d’Ergonomie Cognitive des Situations Opérationnelles, IRBA, Brétigny sur Orge, France
Keyword(s):
Activations Explainability, Concept Extraction and Visualization, Clustering.
Abstract:
Despite significant advances in computer vision with deep learning models (e.g., classification, detection, and segmentation), these models remain complex, making it challenging to assess their reliability, interpretability, and consistency under diverse conditions. There is growing interest in methods for extracting human-understandable concepts from these models, but significant challenges persist. These include the difficulty of extracting concepts relevant to both a model’s parameters and its inference, while ensuring the concepts remain meaningful to individuals with varying levels of expertise, without requiring a panel of evaluators to validate them. To tackle these challenges, we propose concept extraction by clustering activations. Activations represent a model’s internal state as shaped by its training, and can be grouped to represent learned concepts. We propose two clustering methods for concept extraction, a metric for evaluating concept importance, and a visualization technique for concept interpretation. This approach can help identify biases in models and datasets.
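The abstract does not specify the two clustering methods, the importance metric, or the visualization technique, so the following is only a minimal sketch of the general idea it describes: collecting intermediate activations and grouping them so that each cluster stands for one candidate learned concept. K-means over pooled activations of a pretrained ResNet-18, and the `images` placeholder batch, are assumptions for illustration, not the authors’ method.

```python
# Minimal sketch: concept extraction by clustering activations.
# Assumptions (not from the paper): ResNet-18 backbone, layer4 activations,
# global average pooling, k-means with 5 clusters.
import torch
from torchvision.models import resnet18, ResNet18_Weights
from sklearn.cluster import KMeans

model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()

activations = []

def hook(module, inputs, output):
    # Pool spatial dims so each image yields one activation vector,
    # a summary of the model's internal state for that input.
    activations.append(output.mean(dim=(2, 3)).detach())

handle = model.layer4.register_forward_hook(hook)

# `images` is a hypothetical batch of preprocessed inputs (N, 3, 224, 224).
images = torch.randn(32, 3, 224, 224)
with torch.no_grad():
    model(images)
handle.remove()

feats = torch.cat(activations).numpy()

# Each cluster groups inputs that put the layer into a similar internal
# state; a cluster is then treated as one candidate learned concept.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(feats)
print("concept assignment per image:", kmeans.labels_)
```

Inspecting the images assigned to each cluster is one simple way to interpret the concept it represents and to spot dataset or model biases of the kind the abstract mentions.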