Surgical Phase Recognition of Short Video Shots based on Temporal Modeling of Deep Features

Constantinos Loukas

Abstract

Recognizing the phases of a laparoscopic surgery (LS) operation from its video constitutes a fundamental step for efficient content representation, indexing and retrieval in surgical video databases. In the literature, most techniques focus on phase segmentation of the entire LS video using hand-crafted visual features, instrument usage signals, and, more recently, convolutional neural networks (CNNs). In this paper we address the problem of phase recognition of short video shots (10s) of the operation, without utilizing information about the preceding/forthcoming video frames, their phase labels or the instruments used. We investigate four state-of-the-art CNN architectures (AlexNet, VGG19, GoogleNet, and ResNet101) for feature extraction via transfer learning. Visual saliency was employed for selecting the most informative region of the image as input to the CNN. Video shot representation was based on two temporal pooling mechanisms. Most importantly, we investigate the role of ‘elapsed time’ (from the beginning of the operation), and we show that inclusion of this feature can increase performance dramatically (from 69% to 75% mean accuracy). Finally, a long short-term memory (LSTM) network was trained for video shot classification, based on the fusion of CNN features and ‘elapsed time’, increasing the accuracy to 86%. Our results highlight the prominent role of visual saliency, long-range temporal recursion and ‘elapsed time’ (a feature ignored so far) for surgical phase recognition.
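As a rough illustration of the shot representation described above, the following minimal sketch pools per-frame feature vectors over a shot and appends a normalized ‘elapsed time’ value. The abstract does not say which two pooling mechanisms were used, so mean and max pooling are assumptions here, as are the function names, the toy 3-dimensional features, and the normalization by a maximum operation duration. It is not the author's implementation.

```python
from typing import List, Sequence


def pool_shot_features(frame_features: Sequence[Sequence[float]],
                       mode: str = "mean") -> List[float]:
    """Collapse the per-frame feature vectors of one shot into one vector.

    The pooling modes ("mean", "max") are assumed for illustration only.
    """
    if not frame_features:
        raise ValueError("a shot needs at least one frame")
    # Transpose: one tuple per feature dimension, spanning all frames.
    columns = list(zip(*frame_features))
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


def fuse_elapsed_time(shot_vector: Sequence[float],
                      elapsed_s: float,
                      max_duration_s: float) -> List[float]:
    """Append elapsed time, scaled to [0, 1], as one extra feature.

    Scaling by a maximum duration is a hypothetical choice.
    """
    if max_duration_s <= 0:
        raise ValueError("max_duration_s must be positive")
    t = min(max(elapsed_s / max_duration_s, 0.0), 1.0)
    return list(shot_vector) + [t]


# Toy shot: two frames, each with a 3-dimensional feature vector.
frames = [[0.0, 2.0, 4.0], [2.0, 2.0, 0.0]]
mean_vec = pool_shot_features(frames, "mean")
max_vec = pool_shot_features(frames, "max")
# Shot starts 600 s into an operation capped at 2400 s.
fused = fuse_elapsed_time(mean_vec, elapsed_s=600.0, max_duration_s=2400.0)
```

In practice the frame vectors would come from a pretrained CNN layer and have thousands of dimensions; the fused vector would then be passed to a classifier.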


Paper Citation


in Harvard Style

Loukas C. (2019). Surgical Phase Recognition of Short Video Shots based on Temporal Modeling of Deep Features. In Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: BIOIMAGING, ISBN 978-989-758-353-7, pages 21-29. DOI: 10.5220/0007352000210029


in Bibtex Style

@conference{bioimaging19,
author={Constantinos Loukas},
title={Surgical Phase Recognition of Short Video Shots based on Temporal Modeling of Deep Features},
booktitle={Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: BIOIMAGING},
year={2019},
pages={21-29},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007352000210029},
isbn={978-989-758-353-7},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: BIOIMAGING
TI - Surgical Phase Recognition of Short Video Shots based on Temporal Modeling of Deep Features
SN - 978-989-758-353-7
AU - Loukas C.
PY - 2019
SP - 21
EP - 29
DO - 10.5220/0007352000210029