Authors:
Edoardo Ardizzone 1; Marco La Cascia 2 and Giuseppe Mazzola 1
Affiliations:
1 Università degli Studi di Palermo, Italy; 2 Università degli Studi di Palermo, Italy
Keyword(s):
Video Summarization, Keyframe Extraction, Automatic Speech Recognition, YouTube, Multimedia Collections.
Related Ontology Subjects/Areas/Topics:
Applications; Computer Vision, Visualization and Computer Graphics; Data Engineering; Image and Video Analysis; Image Understanding; Information Retrieval; Object Recognition; Ontologies and the Semantic Web; Pattern Recognition; Software Engineering; Video Analysis; Web Applications
Abstract:
Keyframe extraction methods aim to find the most significant frames in a video sequence, according to
specific criteria. In this paper we propose a new method to search a video database for frames that are
related to a given keyword, and to extract the best ones according to a proposed quality factor. We first
use a speech-to-text algorithm to extract automatic captions from all the videos in a domain-specific
database. We then select only those sequences (clips) whose captions include the given keyword, thus
discarding a large amount of information that is irrelevant for our purposes. Each retrieved clip is divided
into shots using a video segmentation method based on SURF descriptors and keypoints. The caption
sentence is projected onto the segmented clip, and we select the shot that includes the input keyword. The
selected shot is further inspected to find good-quality, stable parts, and the frame that maximizes a
quality metric is selected as the best and most significant frame. We compare the proposed algorithm
with another keyframe extraction method based on local features, in terms of Significance and Quality.
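
The caption-driven part of the pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Caption` record, all function names, the single-word keyword, and the assumption that words are evenly spaced in time within a caption are hypothetical choices made only for illustration. Shot boundaries and per-frame quality scores are taken as given inputs, since the abstract does not specify how the SURF-based segmentation or the quality metric are computed.

```python
import re
from dataclasses import dataclass


@dataclass
class Caption:
    """Hypothetical record for one automatic caption (times in seconds)."""
    video_id: str
    start: float
    end: float
    text: str


def clips_with_keyword(captions, keyword):
    """Keep only the captions whose text contains the keyword as a whole word."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [c for c in captions if pattern.search(c.text)]


def keyword_time(caption, keyword):
    """Estimate when a single-word keyword is spoken.

    Assumption (not from the paper): words are evenly spaced in time
    between the caption's start and end.
    """
    words = re.findall(r"\w+", caption.text.lower())
    idx = words.index(keyword.lower())  # first occurrence of the keyword
    frac = (idx + 0.5) / len(words)     # centre of that word's time slot
    return caption.start + frac * (caption.end - caption.start)


def shot_containing(shots, t):
    """Return the (start, end) shot that contains time t, or None."""
    for start, end in shots:
        if start <= t < end:
            return (start, end)
    return None


def best_frame(frame_scores):
    """Return the frame index with the highest (externally computed) quality score."""
    return max(frame_scores, key=frame_scores.get)


if __name__ == "__main__":
    captions = [
        Caption("v1", 10.0, 14.0, "this is the temple of Concordia"),
        Caption("v1", 14.0, 18.0, "we walk along the road"),
    ]
    hit = clips_with_keyword(captions, "temple")[0]
    t = keyword_time(hit, "temple")
    shot = shot_containing([(10.0, 12.0), (12.0, 14.0)], t)
    print(shot, best_frame({300: 0.2, 310: 0.9, 320: 0.5}))
```

Keeping the four steps as separate functions mirrors the stages named in the abstract, so a real shot detector or quality metric could replace the placeholder inputs without touching the caption filtering.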