tent and the corresponding logistic problems in the
recording of these contents and human supervision,
advertising agencies require the automation of
many of these processes, or at least the development
of tools that can lighten the expert's workload. Although
the final decision is taken by the human expert,
any application that can reduce the time dedicated by
the expert to the supervision is very useful.
In the field of TV commercials, there are two main
problems. One is the identification of the
commercial breaks. This is done by detecting
audio and video features such as blank screens,
changes in volume and other discriminating
characteristics. The literature offers many algorithms
to detect commercial breaks; see, e.g.,
(Lienhart et al., 1997), (Satterwhite and Marques,
2004) or (Zhang et al., 2012).
The second problem is the analysis of the commercial.
In this case, the pattern recognition problem
can be stated in different ways. The most basic one
is the detection (identification) of a known commercial
to ensure that the campaign is broadcast according
to the terms of the agreement with the publicity
agency (Duan et al., 2006). Another one is an unsupervised
classification approach, i.e., there is no
target advertisement to look for; instead, commercials are
grouped into clusters that represent different classes, e.g.,
according to their content (Hua et al., 2009).
In this paper, we address the detection problem, or
commercial retrieval, i.e., multimedia search in long
recordings of broadcast TV signals for a given commercial
query. We will assume that the commercial
breaks are correctly localized in the time domain and
that the goal is the identification of some known commercials
in the recorded segments in a fast and effective
way.
The key point, as in most pattern recognition problems,
is to find a domain where the classes are separable
or, in a more realistic approach, to find the
most discriminating features for the problem
under consideration. In the case of TV commercials,
we have two different kinds of signals: the audio
and the video information. Since we are interested
in a system that works rapidly and does not demand
too many resources in terms of memory and computational
load, we will explore in this paper a solution
based on the audio features.
2 DETECTION OF TV COMMERCIALS BASED ON AUDIO FEATURES
The system is composed of three stages: in the first
one, the recorded broadcast signal is preprocessed
in order to reduce its length and to obtain an input signal
that is composed only of commercial breaks; in
the second one, the descriptor of the query commercial
is calculated; and, in the third one, the detector is
applied.
2.1 Preprocessing
The data come from real recordings of broadcast
TV signals. The first task is the extraction of the commercials.
As we mentioned, there are many algorithms
to carry out this work. We will assume that it is
done in advance. In our case, we use Comskip (Comskip,
2012), a free MPEG commercial detector. It is a
Windows console application that reads an MPEG file
and, using information related to logos, black frames,
silences, changes in aspect ratio and so on, is able to
indicate the times at which a commercial break starts and
ends.
Comskip is configurable software, so some parameters
must be set beforehand by trial and error, depending
on the broadcaster, i.e., the TV channel. We use
a set of parameters chosen so that no commercial
fragment is missed. To further ensure this,
we add a minute of broadcasting before and after
every commercial break detected by Comskip. This
implies that the input data can contain some content
that does not correspond to advertising. This is a trade-off
between the speed of the detection procedure
and the overall system type II error or false negatives,
i.e., commercials that are not detected in the broadcasting.
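The one-minute margin described above can be illustrated with a minimal sketch. It assumes that the detected breaks are available as (start, end) pairs in seconds; the function name pad_and_merge and its parameters are hypothetical and are not part of Comskip. Padded breaks that overlap are merged so that no portion of the recording is included twice.

```python
def pad_and_merge(breaks, pad=60.0, total=None):
    """Extend each (start, end) break by `pad` seconds on both sides
    and merge the intervals that overlap after padding.

    breaks: iterable of (start, end) times in seconds (assumed input format).
    total:  optional recording length in seconds, used to clip the last break.
    """
    merged = []
    for start, end in sorted(breaks):
        s = max(0.0, start - pad)  # do not go before the recording start
        e = end + pad if total is None else min(total, end + pad)
        if merged and s <= merged[-1][1]:
            # Overlaps the previous padded break: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


# Hypothetical break times, in seconds, for a recording of 1130 s.
breaks = [(100.0, 200.0), (250.0, 400.0), (1000.0, 1100.0)]
segments = pad_and_merge(breaks, pad=60.0, total=1130.0)
# segments == [(40.0, 460.0), (940.0, 1130.0)]
```

A larger margin lowers the risk of missing a commercial fragment but lengthens the signal that the detector has to scan, which is the trade-off noted above.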
Finally, the different time periods of advertising
breaks are concatenated, obtaining a signal x[n] that is
composed mostly of commercials. We will use
only the audio part of the commercials, which reduces the
storage requirements and the computation time.
Some systems to detect and recognize commercials
using the video content can be found in the literature,
such as (Putpuek et al., 2010) or (Wu et al., 2010).
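As an illustration of how the audio of the retained segments could be concatenated into a single signal, the following sketch assumes that the audio track has already been decoded into a sequence of samples at a known sampling rate. The names concatenate_breaks, audio, fs and intervals are hypothetical and are used only for this example.

```python
def concatenate_breaks(audio, fs, intervals):
    """Build one signal from the audio samples of the given intervals.

    audio:     sequence of audio samples of the whole recording (assumed decoded).
    fs:        sampling rate in Hz.
    intervals: list of (start, end) times in seconds, sorted and non-overlapping.
    """
    x = []
    for start, end in intervals:
        first = int(round(start * fs))
        last = min(len(audio), int(round(end * fs)))  # clip to the recording
        x.extend(audio[first:last])
    return x


# Toy example: a 10 s "recording" sampled at 10 Hz, with two retained segments.
audio = list(range(100))
x = concatenate_breaks(audio, fs=10, intervals=[(1.0, 2.0), (5.0, 5.5)])
# x holds samples 10..19 followed by samples 50..54, i.e. 15 samples.
```

For long recordings an array library would be a more natural choice than Python lists; plain lists are used here only to keep the sketch self-contained.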
Note that the duration of commercials on TV
ranges from a few seconds to a minute or even more in
exceptional circumstances, depending on the country,
broadcaster, time of day or TV program. Considering
the audio content, the diversity we can find is
enormous. Some commercials are based on people talking;
others are based on music; a very few may even be
silent. This diversity means that,
SIGMAP 2013 - International Conference on Signal Processing and Multimedia Applications