Classification, Localization and Captioning of Dangerous Situations using Inception-v3 Network and CAM

Sichen Zhang, Axel Heßler, Ming Zhang

2020

Abstract

Early situation assessment is an important part of emergency missions and provides useful information for fast decision making. However, many situations are dangerous and, because of their complexity, hard to analyze visually. Recent developments in artificial intelligence and computer vision open up a wide range of applications, including automatic situation detection. Many related works, however, have focused either on event captioning or on dangerous object detection. This paper therefore proposes a novel approach for the simultaneous recognition and localization of dangerous situations. Two different CNN architectures are used, and one of them, the Inception-v3, is modified to generate Class Activation Maps (CAM). With CAM it is possible to generate bounding boxes for recognized objects without explicitly training the network for localization. This eliminates the need for a large image dataset with manually annotated boxes. The information about the objects detected by both networks, their spatial relationships, and the severity of the situation is then analyzed in the situation detection module. Finally, the detected situation is summarized in a short description and made available to emergency managers to support fast decision making.
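
The idea of deriving bounding boxes from a Class Activation Map can be illustrated with a minimal sketch. It follows the general CAM formulation (a weighted sum of the last convolutional feature maps, using the classifier weights of the target class) and is not taken from the paper's implementation. The function names, the array shapes, and the 20% relative threshold are assumptions made for illustration, and the box is drawn around all above-threshold pixels rather than around a single connected region.

```python
import numpy as np


def class_activation_map(features, fc_weights, class_idx):
    """Compute a normalized CAM for one class.

    features:   (H, W, K) feature maps of the last conv layer (assumed layout).
    fc_weights: (K, C) weights of the layer after global average pooling.
    """
    # Weighted sum over the K channels, using the weights of the chosen class.
    cam = features @ fc_weights[:, class_idx]  # shape (H, W)
    # Shift and scale to [0, 1]; leave an all-constant map at zero.
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def bounding_box(cam, rel_threshold=0.2):
    """Return (x_min, y_min, x_max, y_max) around strongly activated pixels.

    rel_threshold is a hypothetical default: pixels at or above this fraction
    of the maximum activation count as part of the object.
    """
    peak = cam.max()
    if peak <= 0:
        return None  # nothing activated, so no box
    mask = cam >= rel_threshold * peak
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])
```

In practice the feature maps are much coarser than the input image, so the map (or the resulting box) would be upsampled to image resolution before use. No box annotations are involved at any point, which is what removes the need for manually annotated training data.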


Paper Citation


in Harvard Style

Zhang S., Heßler A. and Zhang M. (2020). Classification, Localization and Captioning of Dangerous Situations using Inception-v3 Network and CAM. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-395-7, pages 48-57. DOI: 10.5220/0008911800480057


in Bibtex Style

@conference{icaart20,
author={Sichen Zhang and Axel Heßler and Ming Zhang},
title={Classification, Localization and Captioning of Dangerous Situations using Inception-v3 Network and CAM},
booktitle={Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2020},
pages={48-57},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008911800480057},
isbn={978-989-758-395-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Classification, Localization and Captioning of Dangerous Situations using Inception-v3 Network and CAM
SN - 978-989-758-395-7
AU - Zhang S.
AU - Heßler A.
AU - Zhang M.
PY - 2020
SP - 48
EP - 57
DO - 10.5220/0008911800480057