Small Vocabulary with Saliency Matching for Video Copy Detection

Huamin Ren, Thomas B. Moeslund, Sheng Tang, Heri Ramampiaro


The importance of copy detection has led to a substantial amount of research in recent years, in which Bag of visual Words (BoW) plays an important role because it handles occlusion and some minor transformations effectively. One crucial issue in BoW approaches is the size of the vocabulary. BoW descriptors under a small vocabulary can be both robust and efficient while maintaining a high recall rate compared with a large vocabulary. However, the high number of false positives that arises with a small vocabulary limits its application. To address this problem, we propose a novel matching algorithm based on salient visual word selection. More specifically, the variations of visual words across a given video are represented as trajectories, and those containing locally asymptotically stable points are selected as salient visual words. We then measure the similarity of two videos through saliency matching based solely on the selected salient visual words, in order to remove false positives. Our experiments show that a small codebook with saliency matching is quite competitive in video copy detection. With the proposed saliency matching incorporated, precision improves by 30% on average compared with the state-of-the-art technique. Moreover, our proposed method is capable of detecting severe transformations, e.g. picture in picture and post production.
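
To make the idea of salient visual word selection concrete, the following is a minimal sketch and not the algorithm evaluated in this paper. It assumes each video is given as a non-empty list of per-frame BoW histograms over the same vocabulary, and it substitutes a simple proxy for a locally asymptotically stable point: a word counts as salient if its trajectory stays nonzero and changes by at most `tol` for at least `min_run` consecutive frame transitions. The function names, the parameters `tol` and `min_run`, and the cosine score are all hypothetical choices made for illustration.

```python
from math import sqrt


def salient_words(hist, tol=0.05, min_run=3):
    """Flag words whose trajectory settles (illustrative proxy for stability).

    hist: list of frames, each a list of word frequencies (same length).
    """
    n_words = len(hist[0])
    run = [0] * n_words          # current length of the settled run per word
    salient = [False] * n_words
    for prev, cur in zip(hist, hist[1:]):
        for k in range(n_words):
            settled = cur[k] > 0 and abs(cur[k] - prev[k]) <= tol
            run[k] = run[k] + 1 if settled else 0
            if run[k] >= min_run:
                salient[k] = True
    return salient


def saliency_similarity(hist_a, hist_b, tol=0.05, min_run=3):
    """Cosine similarity of mean histograms, restricted to shared salient words."""
    sal_a = salient_words(hist_a, tol, min_run)
    sal_b = salient_words(hist_b, tol, min_run)
    shared = [k for k in range(len(sal_a)) if sal_a[k] and sal_b[k]]
    if not shared:
        return 0.0
    mean_a = [sum(f[k] for f in hist_a) / len(hist_a) for k in shared]
    mean_b = [sum(f[k] for f in hist_b) / len(hist_b) for k in shared]
    dot = sum(x * y for x, y in zip(mean_a, mean_b))
    norm = sqrt(sum(x * x for x in mean_a)) * sqrt(sum(y * y for y in mean_b))
    return dot / norm if norm > 0 else 0.0


# Toy video: word 0 is steady, word 1 flickers, word 2 never appears.
video = [[0.5, 0.0, 0.0],
         [0.5, 1.0, 0.0],
         [0.5, 0.0, 0.0],
         [0.5, 1.0, 0.0],
         [0.5, 0.0, 0.0]]
flags = salient_words(video)
score = saliency_similarity(video, video)
```

In this toy input only the steady word is flagged, so the flickering word, which a plain histogram comparison would still count, does not contribute to the score. Restricting the comparison in this way is the intuition behind using saliency matching to suppress false positives.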


