TempSeg-GAN: Segmenting Objects in Videos Adversarially using Temporal Information

Saptakatha Adak, Sukhendu Das

Abstract

This paper studies the problem of Video Object Segmentation, which aims to segment objects of interest throughout an entire video given an initial ground-truth annotation. Although a variety of works in this field use Convolutional Neural Networks (CNNs), adversarial training techniques have not been used, in spite of their effectiveness as a holistic approach. Our proposed architecture consists of a Generative Adversarial framework for foreground object segmentation in videos, coupled with loss functions based on Intersection-over-Union and temporal information for training the network. The main contribution of the paper lies in the formulation of two novel loss functions: (i) Inter-frame Temporal Symmetric Difference Loss (ITSDL) and (ii) Intra-frame Temporal Loss (IFTL), which not only enhance the segmentation quality of the predicted mask but also maintain temporal consistency between successive generated frames. Our end-to-end trainable network exhibits an impressive performance gain compared to the state-of-the-art model when evaluated on three popular real-world Video Object Segmentation datasets, viz. DAVIS 2016, SegTrack-v2 and YouTube-Objects.


Paper Citation


in Harvard Style

Adak S. and Das S. (2019). TempSeg-GAN: Segmenting Objects in Videos Adversarially using Temporal Information. In Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, ISBN 978-989-758-354-4, pages 221-232. DOI: 10.5220/0007254302210232


in Bibtex Style

@conference{visapp19,
author={Saptakatha Adak and Sukhendu Das},
title={TempSeg-GAN: Segmenting Objects in Videos Adversarially using Temporal Information},
booktitle={Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP},
year={2019},
pages={221-232},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007254302210232},
isbn={978-989-758-354-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP
TI - TempSeg-GAN: Segmenting Objects in Videos Adversarially using Temporal Information
SN - 978-989-758-354-4
AU - Adak S.
AU - Das S.
PY - 2019
SP - 221
EP - 232
DO - 10.5220/0007254302210232