LLM-Generated Class Descriptions for Semantically Meaningful Image Classification

Simone Bertolotto; André Panisson; Alan Perotti

doi:10.5220/0013060800003886

2024

Abstract

Neural networks have become the primary approach for tackling computer vision tasks, but their lack of transparency and interpretability remains a challenge. Integrating neural networks with symbolic knowledge bases, which could provide valuable context for visual concepts, is not yet common in the machine learning community. In image classification, class labels are often treated as independent, orthogonal concepts, resulting in equal penalization of misclassifications regardless of the semantic similarity between the true and predicted labels. Previous studies have attempted to address this by using ontologies to establish relationships among classes, but such data structures are generally not available. In this paper, we use a large language model (LLM) to generate textual descriptions for each class label, aiming to capture the visual characteristics of the corresponding concepts. These descriptions are then encoded into embedding vectors, which are used as the ground truth for training the image classification model. By employing a cosine distance-based loss function, our approach considers the semantic similarity between class labels, encouraging the model to learn a more hierarchically structured internal feature representation. We evaluate our method on multiple datasets and compare its performance with existing techniques, focusing on classification accuracy, mistake severity, and the emergence of a hierarchical structure in the learned concept representations. The results suggest that semantic embedding representations extracted from LLMs have the potential to enhance the performance of image classification models and lead to more semantically meaningful misclassifications. A key advantage of our method, compared to those that leverage explicit hierarchical information, is its broad applicability to a wide range of datasets without requiring the presence of pre-defined hierarchical structures.
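The core idea in the abstract, training against class-description embeddings with a cosine distance-based loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the three-dimensional toy embeddings, and the nearest-embedding prediction rule are assumptions made here for illustration. In practice the class embeddings would come from encoding the LLM-generated descriptions with a text encoder, and the outputs from an image model.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_distance_loss(outputs, class_embeddings, labels):
    """Mean cosine distance (1 - cosine similarity) between each model output
    and the embedding of its true class."""
    distances = [
        1.0 - cosine_similarity(out, class_embeddings[y])
        for out, y in zip(outputs, labels)
    ]
    return sum(distances) / len(distances)

def predict(output, class_embeddings):
    """Hypothetical prediction rule: pick the class whose embedding is most
    similar to the model output."""
    sims = [cosine_similarity(output, emb) for emb in class_embeddings]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy class embeddings (assumed values, not from the paper): classes 0 and 1
# are semantically close, class 2 is unrelated.
class_embeddings = [
    [1.0, 0.0, 0.0],  # class 0, e.g. "cat"
    [0.9, 0.1, 0.0],  # class 1, e.g. "dog"
    [0.0, 0.0, 1.0],  # class 2, e.g. "truck"
]

# An output pointing in the direction of class 0.
output = [2.0, 0.0, 0.0]

loss_correct = cosine_distance_loss([output], class_embeddings, [0])
loss_near = cosine_distance_loss([output], class_embeddings, [1])
loss_far = cosine_distance_loss([output], class_embeddings, [2])
predicted = predict(output, class_embeddings)
```

In this toy setup the loss for a semantically close class (`loss_near`) is much smaller than for an unrelated one (`loss_far`), which is the property that distinguishes this kind of target from one-hot labels, where every wrong class is penalized equally.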

Paper Citation

in Harvard Style

Bertolotto S., Panisson A. and Perotti A. (2024). LLM-Generated Class Descriptions for Semantically Meaningful Image Classification. In Proceedings of the 1st International Conference on Explainable AI for Neural and Symbolic Methods - Volume 1: EXPLAINS; ISBN 978-989-758-720-7, SciTePress, pages 50-61. DOI: 10.5220/0013060800003886

in Bibtex Style

@conference{explains24,
author={Simone Bertolotto and André Panisson and Alan Perotti},
title={LLM-Generated Class Descriptions for Semantically Meaningful Image Classification},
booktitle={Proceedings of the 1st International Conference on Explainable AI for Neural and Symbolic Methods - Volume 1: EXPLAINS},
year={2024},
pages={50-61},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013060800003886},
isbn={978-989-758-720-7},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Explainable AI for Neural and Symbolic Methods - Volume 1: EXPLAINS
TI - LLM-Generated Class Descriptions for Semantically Meaningful Image Classification
SN - 978-989-758-720-7
AU - Bertolotto S.
AU - Panisson A.
AU - Perotti A.
PY - 2024
SP - 50
EP - 61
DO - 10.5220/0013060800003886
PB - SciTePress