Ontology-Driven Deep Learning Model for Multitask Visual Food Analysis

Daniel Ponte, Eduardo Aguilar, Mireia Ribera, Petia Radeva

2024

Abstract

Food analysis from images is a challenging task that has gained significant attention due to its many applications, especially in health and nutrition. Ontology-driven deep learning techniques have shown promising results in improving model performance. A food ontology can leverage domain-specific information to guide model learning and thus substantially enhance food analysis. In this paper, we propose a new ontology-driven multi-task learning approach for food recognition. To this end, we work with multi-modal information, text and images: from the text we extract the food ontology, which represents prior knowledge about the relationships among food concepts at different semantic levels (e.g. food groups and food names), and we apply this information to guide the learning of the multi-task model. The proposed method was validated on the public food dataset MAFood-121, specifically on dishes belonging to Mexican cuisine, where it outperformed the results obtained in single-label food recognition and multi-label food group recognition. Moreover, the proposed integration of the ontology into the deep learning framework yields more consistent results across the tasks.
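
The notion of consistency across tasks can be illustrated with a small sketch. It is not the authors' implementation: the toy ontology, the dish and group names, the 0.5 threshold, and the helper functions `predicted_groups`, `is_consistent`, and `consistency_rate` are all assumptions made for illustration, and none of the values come from MAFood-121. The sketch assumes that a prediction is consistent when the food groups predicted by the multi-label task match exactly the groups that the ontology associates with the dish predicted by the single-label task.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology: dish name -> food groups it contains.
# These entries are invented for illustration, not taken from MAFood-121.
ONTOLOGY: Dict[str, Set[str]] = {
    "taco": {"meat", "bread"},
    "guacamole": {"vegetables"},
}


def predicted_groups(group_scores: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Turn multi-label scores into a set of groups (threshold is an assumption)."""
    return {group for group, score in group_scores.items() if score >= threshold}


def is_consistent(dish: str, groups: Set[str], ontology: Dict[str, Set[str]]) -> bool:
    """True when the predicted groups equal the groups the ontology lists for the dish."""
    return groups == ontology[dish]


def consistency_rate(
    predictions: List[Tuple[str, Dict[str, float]]],
    ontology: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> float:
    """Fraction of (dish, group_scores) predictions that agree with the ontology."""
    if not predictions:
        return 0.0
    consistent = sum(
        is_consistent(dish, predicted_groups(scores, threshold), ontology)
        for dish, scores in predictions
    )
    return consistent / len(predictions)


if __name__ == "__main__":
    sample = [
        ("taco", {"meat": 0.9, "bread": 0.8, "vegetables": 0.1}),  # agrees with ontology
        ("taco", {"meat": 0.9, "bread": 0.2, "vegetables": 0.1}),  # "bread" is missed
    ]
    print(consistency_rate(sample, ONTOLOGY))
```

A measure of this kind only checks agreement after prediction. The paper instead integrates the ontology into the learning of the multi-task model, which this sketch does not attempt to reproduce.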

Paper Citation


in Harvard Style

Ponte D., Aguilar E., Ribera M. and Radeva P. (2024). Ontology-Driven Deep Learning Model for Multitask Visual Food Analysis. In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP; ISBN 978-989-758-679-8, SciTePress, pages 624-631. DOI: 10.5220/0012388200003660


in Bibtex Style

@conference{visapp24,
author={Daniel Ponte and Eduardo Aguilar and Mireia Ribera and Petia Radeva},
title={Ontology-Driven Deep Learning Model for Multitask Visual Food Analysis},
booktitle={Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP},
year={2024},
pages={624-631},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012388200003660},
isbn={978-989-758-679-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP
TI - Ontology-Driven Deep Learning Model for Multitask Visual Food Analysis
SN - 978-989-758-679-8
AU - Ponte D.
AU - Aguilar E.
AU - Ribera M.
AU - Radeva P.
PY - 2024
SP - 624
EP - 631
DO - 10.5220/0012388200003660
PB - SciTePress