Predicting Visible Terms from Image Captions using Concreteness and Distributional Semantics

Jean Charbonnier; Christian Wartena

doi:10.5220/0011351400003335

2022

Abstract

Image captions in scientific papers are usually complementary to the images. Consequently, the captions contain many terms that do not refer to concepts visible in the image. We conjecture that it is possible to distinguish between these two types of terms in an image caption by analysing the text only. To examine this, we evaluated different features. The dataset we used to compute tf.idf values, word embeddings and concreteness values contains over 700,000 scientific papers with over 4.6 million images. The evaluation was done with a manually annotated subset of 329 images. Additionally, we trained a support vector machine to predict whether a term is likely visible or not. We show that the concreteness of terms is a very important feature for identifying terms in captions and context that refer to concepts visible in images.
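
As a rough illustration of the kind of text-only features the abstract mentions, the following minimal sketch pairs a concreteness score with a tf.idf weight for each caption term. It is not the authors' implementation: the toy corpus, the concreteness values, the 1 to 5 scale, the smoothed idf variant, and the helper names `tf_idf` and `term_features` are all assumptions made for this example. Word embeddings and the support vector machine are left out.

```python
import math
from collections import Counter

# Hypothetical concreteness scores on a 1 (abstract) to 5 (concrete) scale.
# The values are invented for illustration only.
CONCRETENESS = {
    "microscope": 4.8,
    "cell": 4.5,
    "image": 4.3,
    "analysis": 2.0,
    "hypothesis": 1.6,
}

# Toy tokenised corpus standing in for a large collection of papers.
CORPUS = [
    ["image", "of", "a", "cell", "under", "the", "microscope"],
    ["analysis", "of", "the", "hypothesis"],
    ["image", "analysis", "of", "a", "cell"],
]


def tf_idf(term, doc_tokens, corpus):
    """Return a tf.idf weight for `term` in one tokenised document.

    Uses relative term frequency and a smoothed idf,
    log((1 + N) / (1 + df)) + 1, which is one common variant (assumption).
    """
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    df = sum(1 for doc in corpus if term in doc)
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0
    return tf * idf


def term_features(caption_tokens, corpus, concreteness):
    """Map each caption term with a known concreteness score
    to a (concreteness, tf.idf) feature pair."""
    features = {}
    for term in set(caption_tokens):
        if term in concreteness:
            features[term] = (
                concreteness[term],
                tf_idf(term, caption_tokens, corpus),
            )
    return features


caption = CORPUS[0]
features = term_features(caption, CORPUS, CONCRETENESS)
for term, (conc, weight) in sorted(features.items()):
    print(f"{term}: concreteness={conc:.1f}, tf.idf={weight:.3f}")
```

Feature pairs of this shape could then be passed, together with manually annotated visible/not-visible labels, to a classifier such as a support vector machine.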

Paper Citation

in Harvard Style

Charbonnier J. and Wartena C. (2022). Predicting Visible Terms from Image Captions using Concreteness and Distributional Semantics. In Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2022) - Volume 1: KDIR; ISBN 978-989-758-614-9, SciTePress, pages 161-169. DOI: 10.5220/0011351400003335

in BibTeX Style

@conference{kdir22,
author={Jean Charbonnier and Christian Wartena},
title={Predicting Visible Terms from Image Captions using Concreteness and Distributional Semantics},
booktitle={Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2022) - Volume 1: KDIR},
year={2022},
pages={161-169},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011351400003335},
isbn={978-989-758-614-9},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2022) - Volume 1: KDIR
TI - Predicting Visible Terms from Image Captions using Concreteness and Distributional Semantics
SN - 978-989-758-614-9
AU - Charbonnier J.
AU - Wartena C.
PY - 2022
SP - 161
EP - 169
DO - 10.5220/0011351400003335
PB - SciTePress