Clustering-based Model for Predicting Multi-spatial Relations in Images

Brandon Birmingham, Adrian Muscat

Abstract

Detecting spatial relations between objects in an image is a core task in image understanding and grounded natural language. In cognitive linguistics, this problem has been addressed by developing template and computational models from controlled experimental data based on 2D or 3D synthetic diagrams. The Computer Vision (CV) and Natural Language Processing (NLP) communities, in turn, have developed machine learning models for real-world images, mostly from crowd-sourced data. These latter models treat the task as single-label classification, whereas it is inherently a multi-label problem. In this paper, we learn a multi-label model based on computed spatial features. We implement the model using a clustering-based approach because, apart from predicting multiple labels for a given instance, it offers deeper insight into how spatial relations are related to each other. We report the results obtained with this model and compare it directly with a Random Forest single-label classifier. The results show that the proposed model generally outperforms the single-label classifier, even when the top four prepositions predicted by the single-label classifier are considered.
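
As a rough illustration of the clustering-based multi-label idea described in the abstract, the following Python sketch clusters toy spatial feature vectors with a plain k-means and predicts, for a new instance, every preposition that is frequent in its nearest cluster. It is not the authors' implementation: the two-dimensional features, the toy labels, the evenly spaced centroid initialisation, the 0.5 frequency threshold and the function names `kmeans`, `fit` and `predict` are all assumptions made for illustration.

```python
import math
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Assumption: centroids are initialised from evenly spaced training
    points so that the toy example is deterministic.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]  # keep the old centroid if a cluster is empty
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def fit(features, label_sets, k):
    """Cluster the features, then count how often each label occurs per cluster."""
    centroids = kmeans(features, k)
    counts = [Counter() for _ in range(k)]
    sizes = [0] * k
    for x, labels in zip(features, label_sets):
        c = nearest(x, centroids)
        sizes[c] += 1
        counts[c].update(labels)
    return centroids, counts, sizes


def predict(x, model, threshold=0.5):
    """Return every label whose relative frequency in the nearest cluster
    is at least `threshold` (a hypothetical decision rule)."""
    centroids, counts, sizes = model
    c = nearest(x, centroids)
    if sizes[c] == 0:
        return set()
    return {label for label, n in counts[c].items() if n / sizes[c] >= threshold}


# Toy data (hypothetical): (dx, dy) offset between two objects,
# each instance annotated with a set of prepositions.
FEATURES = [(0.0, -1.0), (0.1, -0.9), (-0.1, -1.1),
            (1.0, 0.0), (1.1, 0.1), (0.9, -0.1)]
LABEL_SETS = [{"above", "over"}, {"above"}, {"above", "over"},
              {"next to", "beside"}, {"next to"}, {"next to", "beside", "near"}]

model = fit(FEATURES, LABEL_SETS, k=2)
print(sorted(predict((0.05, -0.95), model)))  # ['above', 'over']
```

Lowering `threshold` returns more prepositions per instance, and the per-cluster label counts give a simple view of which prepositions tend to co-occur for similar spatial configurations.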

Paper Citation


in Harvard Style

Birmingham B. and Muscat A. (2019). Clustering-based Model for Predicting Multi-spatial Relations in Images. In Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-989-758-380-3, pages 147-156. DOI: 10.5220/0008123601470156


in Bibtex Style

@conference{icinco19,
author={Brandon Birmingham and Adrian Muscat},
title={Clustering-based Model for Predicting Multi-spatial Relations in Images},
booktitle={Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2019},
pages={147-156},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008123601470156},
isbn={978-989-758-380-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - Clustering-based Model for Predicting Multi-spatial Relations in Images
SN - 978-989-758-380-3
AU - Birmingham B.
AU - Muscat A.
PY - 2019
SP - 147
EP - 156
DO - 10.5220/0008123601470156