Authors:
Saptakatha Adak and Sukhendu Das
Affiliation:
Visualization and Perception Lab, Indian Institute of Technology Madras, Chennai - 600036, India
Keyword(s):
Video Object Segmentation, Generative Adversarial Network (GAN), Deep Learning.
Related Ontology Subjects/Areas/Topics:
Computer Vision, Visualization and Computer Graphics; Image and Video Analysis; Motion, Tracking and Stereo Vision; Segmentation and Grouping; Tracking and Visual Navigation
Abstract:
This paper studies the problem of Video Object Segmentation, which aims at segmenting objects of interest throughout an entire video when provided with an initial ground-truth annotation. Although a variety of works in this field have utilized Convolutional Neural Networks (CNNs), adversarial training techniques have not been used despite their effectiveness as a holistic approach. Our proposed architecture consists of a Generative Adversarial framework for foreground object segmentation in videos, coupled with Intersection-over-Union and temporal-information-based loss functions for training the network. The main contribution of the paper lies in the formulation of two novel loss functions: (i) Inter-frame Temporal Symmetric Difference Loss (ITSDL) and (ii) Intra-frame Temporal Loss (IFTL), which not only enhance the segmentation quality of the predicted mask but also maintain temporal consistency between subsequently generated frames. Our end-to-end trainable network exhibits an impressive performance gain compared to the state-of-the-art model when evaluated on three popular real-world Video Object Segmentation datasets, viz. DAVIS 2016, SegTrack-v2 and YouTube-Objects.
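As a rough illustration of how an IoU-based segmentation term can be combined with a temporal-consistency term, the following is a minimal PyTorch sketch. It assumes a standard soft-IoU (Jaccard) loss and a generic symmetric-difference penalty between masks predicted for consecutive frames; the exact ITSDL and IFTL formulations and the loss weighting are defined in the paper and may differ from this placeholder.

import torch

def soft_iou_loss(pred, target, eps=1e-6):
    # Differentiable IoU (Jaccard) loss between a predicted soft mask and a
    # binary ground-truth mask, both of shape (N, 1, H, W).
    inter = (pred * target).sum(dim=(1, 2, 3))
    union = (pred + target - pred * target).sum(dim=(1, 2, 3))
    return (1.0 - (inter + eps) / (union + eps)).mean()

def temporal_consistency_loss(pred_t, pred_prev):
    # Generic temporal smoothness term: penalises the symmetric difference
    # between masks predicted for consecutive frames (illustrative only;
    # the paper's ITSDL and IFTL definitions may differ).
    sym_diff = pred_t * (1.0 - pred_prev) + pred_prev * (1.0 - pred_t)
    return sym_diff.mean()

# Example usage on a pair of consecutive frames; the 0.5 weight is a placeholder.
pred_prev = torch.rand(2, 1, 64, 64)              # mask predicted at frame t-1
pred_t = torch.rand(2, 1, 64, 64)                 # mask predicted at frame t
gt_t = (torch.rand(2, 1, 64, 64) > 0.5).float()   # ground-truth mask at frame t
loss = soft_iou_loss(pred_t, gt_t) + 0.5 * temporal_consistency_loss(pred_t, pred_prev)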