Learning to Predict Video Saliency using Temporal Superpixels

Anurag Singh, Chee-Hung Henry Chu, Michael A. Pratt


The visual saliency of a video sequence can be computed by combining the spatial and temporal features that draw a viewer’s attention to a group of pixels. We present a method that computes video saliency by integrating four such features: color dissimilarity, objectness measure, motion difference, and boundary score. We use temporal clusters of pixels, or temporal superpixels, to simulate the attention associated with a group of moving pixels in a video sequence. The features are combined using weights learned online by a linear support vector machine. The temporal linkage between superpixels is then used to find the saliency flow across the image frames. Experiments demonstrate the efficacy of the proposed method and show that it outperforms state-of-the-art methods.
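As an illustration of the feature-combination step, the sketch below scores a superpixel as a weighted sum of its four feature values, with the weights updated one sample at a time by a hinge-loss subgradient step. This is one common way to train a linear SVM online and is not necessarily the procedure used in this work. The function names, learning rate, regularization constant, and toy feature values are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Feature order assumed for this sketch (names taken from the abstract).
FEATURES = ("color_dissimilarity", "objectness", "motion_difference", "boundary_score")


def saliency_score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear combination of the feature values of one superpixel."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def online_svm_step(w: Sequence[float], b: float, x: Sequence[float], y: int,
                    lr: float = 0.1, reg: float = 0.01) -> Tuple[List[float], float]:
    """One hinge-loss subgradient update; y is +1 (salient) or -1 (not salient).

    lr and reg are hypothetical hyperparameters, not values from the paper.
    """
    margin = y * saliency_score(w, b, x)
    # The L2 penalty shrinks the weights on every step.
    new_w = [wi * (1.0 - lr * reg) for wi in w]
    # The hinge loss contributes only when the sample violates the margin.
    if margin < 1.0:
        new_w = [wi + lr * y * xi for wi, xi in zip(new_w, x)]
        b = b + lr * y
    return new_w, b


def train(samples: Sequence[Tuple[Sequence[float], int]],
          epochs: int = 1) -> Tuple[List[float], float]:
    """Process labeled superpixels one at a time, as an online learner would."""
    w, b = [0.0] * len(FEATURES), 0.0
    for _ in range(epochs):
        for x, y in samples:
            w, b = online_svm_step(w, b, x, y)
    return w, b


# Toy data (made up): one salient and one non-salient superpixel.
salient = [0.9, 0.8, 0.7, 0.9]
background = [0.1, 0.2, 0.1, 0.1]
weights, bias = train([(salient, +1), (background, -1)], epochs=1)
```

After this single pass over the toy data, `saliency_score(weights, bias, salient)` is larger than `saliency_score(weights, bias, background)`, which is the ordering that a per-superpixel saliency map relies on.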


