been proposed to tackle the problem of automatically
building a training dataset in order to exploit the
large number of images recorded by cameras.
Semi-supervised methods are often part of the solutions
proposed in the literature, because they are designed to
use unlabeled data, in addition to labeled data,
directly in the training set.
The training dataset is called contextualized when
it contains a lot of specific information coming from
the scene. The data can be integrated into the
specialized classifier in several ways:
• collecting a large database to train a classifier
in one shot is the principle of offline methods;
• training the classifier as soon as new samples
are available is the principle of online methods.
Online methods were popularized in computer vision
by (Grabner and Bischof, 2006).
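The difference between the two principles can be illustrated with a minimal sketch. The toy one-dimensional nearest-mean classifier below is purely an assumption chosen for brevity (the name `NearestMeanClassifier` and the sample values are hypothetical, and it is not the detector used in this paper). For this toy model both modes end with the same parameters; what differs is when the training cost is paid.

```python
from dataclasses import dataclass, field


@dataclass
class NearestMeanClassifier:
    """Toy 1-D classifier: predicts the class whose mean is closest."""
    sums: dict = field(default_factory=dict)
    counts: dict = field(default_factory=dict)

    def update(self, x: float, label: int) -> None:
        """Online step: fold one labeled sample into the model."""
        self.sums[label] = self.sums.get(label, 0.0) + x
        self.counts[label] = self.counts.get(label, 0) + 1

    def fit(self, samples) -> "NearestMeanClassifier":
        """Offline training: consume the whole collected database at once."""
        for x, label in samples:
            self.update(x, label)
        return self

    def predict(self, x: float) -> int:
        means = {c: self.sums[c] / self.counts[c] for c in self.sums}
        return min(means, key=lambda c: abs(x - means[c]))


# Hypothetical labeled samples: (feature value, class label).
database = [(0.1, 0), (0.3, 0), (0.9, 1), (1.1, 1)]

# Offline: collect the database first, then train once before deployment.
offline = NearestMeanClassifier().fit(database)

# Online: the model is updated each time a new sample becomes available.
online = NearestMeanClassifier()
for x, label in database:
    online.update(x, label)
```

In the offline case all computation happens before the system is put into operation, which matches the constraint of keeping resources free for detection at run time.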
Our goal is to propose a new semi-supervised
method. Using an oracle makes it possible to
automatically build a classifier adapted to the
particular context of the scene. We choose to train
our detector with an offline method for two reasons.
Firstly, our procedure takes place when the camera
network is installed. At that stage we have all the
time needed to collect and process many examples, so we
prefer to avoid training an online classifier during
operation and to keep all computing resources for
detection. Secondly, even if some robust online methods
exist (Leistner et al., 2009), there is still a risk of
drift that seems incompatible with long-term use.
In this study, we focus on how to build an oracle.
After reviewing the most widely used semi-supervised
methods, we describe our strategy for creating the
oracle in Section 3. Section 4 presents an evaluation of
the proposed process, consisting of an analysis of the
behaviour of the oracle and a comparison with a
state-of-the-art classifier.
2 STATE OF THE ART
There are many families of semi-supervised methods.
The most common approaches are self-learning,
co-training and oracle-based methods.
The self-learning approach (Rosenberg et al., 2005)
consists in using the output of a classifier to
annotate a new example. If the classifier is very
confident about a sample, the sample is added to the
training base. This method lacks robustness because it
suffers from a drift problem. Mislabeled examples
disrupt the classifier and change its behaviour on the
following samples, which in turn makes the phenomenon
worse. Moreover, if the confidence threshold used to
separate classes is too low, a lot of false positives
are incorporated in the base. Conversely, if the
threshold is too high, only perfectly identified
samples, the ones containing little information, are
kept.
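The self-learning loop and the role of the confidence threshold can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of (Rosenberg et al., 2005): the two-class nearest-mean rule, the toy confidence score and all function names are hypothetical.

```python
def class_means(labeled):
    """Mean feature value per class from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_confidence(means, x):
    """Return (label, confidence) for a two-class nearest-mean rule.

    The confidence is a toy score: 0 when x is equidistant from both
    class means, 1 when x coincides with one of them.
    """
    d0, d1 = abs(x - means[0]), abs(x - means[1])
    label = 0 if d0 <= d1 else 1
    total = d0 + d1
    confidence = abs(d0 - d1) / total if total > 0 else 0.0
    return label, confidence


def self_learning(labeled, unlabeled, threshold, max_rounds=10):
    """Grow the labeled base with the classifier's own confident outputs."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        means = class_means(labeled)  # retrain on the current base
        accepted, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(means, x)
            if conf >= threshold:
                accepted.append((x, label))  # self-annotated sample
            else:
                remaining.append(x)
        if not accepted:
            break
        labeled.extend(accepted)
        pool = remaining
    return labeled, pool


seed = [(0.0, 0), (1.0, 1)]
unlabeled = [0.1, 0.45, 0.9]

strict_base, strict_pool = self_learning(seed, unlabeled, threshold=0.5)
loose_base, loose_pool = self_learning(seed, unlabeled, threshold=0.05)
```

With the stricter threshold the ambiguous sample 0.45 stays unlabeled, whereas the looser threshold absorbs it with whatever label the current classifier assigns. Nothing outside the classifier checks that label, which is the source of the drift described above.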
Co-training, introduced by (Blum and Mitchell, 1998),
is a formalism in which two classifiers are trained in
parallel. Each of them uses a different and independent
part of the data. For example, (Levin et al., 2003)
train two classifiers, one on the appearance signal and
the other on the background subtraction signal. The
co-training algorithm relies on the fact that an example
must receive the same label from both classifiers even
if they are not trained on the same data. If one of the
detectors labels a sample with confidence while the
other one is unsure, the sample is incorporated in the
base of the second classifier. During the training
phase, each classifier improves its performance thanks
to the confidence of the other one. In the end, we
obtain two well-trained detectors. Even if the
detectors are independent, the problem here is that, as
with self-learning, the outputs of the classifiers are
still used directly to label samples. Drift problems
are not completely excluded because the parts of the
data are seldom truly independent.
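The exchange of labels between the two classifiers can be sketched as follows. This is only an illustrative sketch, not the algorithm of (Blum and Mitchell, 1998) or (Levin et al., 2003): each view is reduced to a single number, the nearest-mean rule and the confidence score are toy assumptions, and all names are hypothetical.

```python
def fit_means(base):
    """Per-class mean of a list of (value, label) pairs (classes 0 and 1)."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in base if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict(means, x):
    """Nearest-mean label and a toy confidence score (0 = ambiguous)."""
    d0, d1 = abs(x - means[0]), abs(x - means[1])
    total = d0 + d1
    conf = abs(d0 - d1) / total if total > 0 else 0.0
    return (0 if d0 <= d1 else 1), conf


def co_training(base_a, base_b, unlabeled, threshold=0.6):
    """unlabeled holds (view_a, view_b) pairs describing the same samples."""
    base_a, base_b = list(base_a), list(base_b)
    for view_a, view_b in unlabeled:
        # Each classifier is refit on its own base and sees only its own view.
        label_a, conf_a = predict(fit_means(base_a), view_a)
        label_b, conf_b = predict(fit_means(base_b), view_b)
        if conf_a >= threshold and conf_b < threshold:
            base_b.append((view_b, label_a))  # A is confident, B learns
        elif conf_b >= threshold and conf_a < threshold:
            base_a.append((view_a, label_b))  # B is confident, A learns
    return base_a, base_b


seed_a = [(0.0, 0), (1.0, 1)]  # e.g. a score from the appearance view
seed_b = [(0.0, 0), (1.0, 1)]  # e.g. a score from the motion view
pairs = [(0.9, 0.55), (0.5, 0.1)]

new_a, new_b = co_training(seed_a, seed_b, pairs)
```

In this run the first sample is clear in view A and ambiguous in view B, so it enters the base of B; the second sample goes the other way. Every added label still comes from one of the two classifiers being trained, which is why drift is reduced but not ruled out.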
Methods based on an oracle use an external entity
to build a dataset. This entity annotates all examples
before they are added to the training data. The final
detector does not affect the outputs of the oracle,
which reduces the drift problem. The capacity of an
oracle to find good samples without error determines
the performance of the final classifier. If the oracle
does not work well on a video, the whole system is
useless.
Many different classifiers have already been proposed
as oracles. (Wu, 2008) uses a part-based classifier
applied to the appearance signal. If the oracle finds
some pedestrian parts, the sample is added to the
training data. One drawback of the method is that the
oracle is composed of only one classifier dealing with
only one signal. Another problem is the difficulty of
detecting pedestrian parts and merging them. To add
robustness, (Stalder et al., 2009) use an oracle with
several stages. The first stage consists in detecting
people in the picture. In a second stage, trackers are
initialized on these detections. The authors' goal is to
obtain some spatio-temporal continuity between oracle
detections in order to incorporate samples which have
not been detected. Contrary to Wu's approach, this makes
it possible to find some hard examples. A last stage
uses 3D information. The main drawback of this scheme is
its sequential structure: if a stage fails, its errors
are inevitably passed to the next one without any
possibility of correcting them.
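The decoupling between the oracle and the final detector described above can be sketched as follows. This is a hedged illustration only: `toy_oracle`, its thresholds and the one-dimensional final detector are hypothetical stand-ins, not the oracles of (Wu, 2008) or (Stalder et al., 2009), nor the one proposed in this paper.

```python
from typing import Callable, List, Optional, Tuple


def build_dataset(
    candidates: List[float],
    oracle: Callable[[float], Optional[int]],
) -> List[Tuple[float, int]]:
    """Keep only the candidates that the oracle is willing to annotate."""
    dataset = []
    for x in candidates:
        label = oracle(x)
        if label is not None:  # the oracle may abstain on unclear samples
            dataset.append((x, label))
    return dataset


def toy_oracle(x: float) -> Optional[int]:
    """Hypothetical oracle: labels clear cases and abstains in between."""
    if x <= 0.3:
        return 0
    if x >= 0.7:
        return 1
    return None


def train_final_detector(dataset):
    """Toy final detector: a threshold halfway between the class means."""
    def mean(label):
        values = [x for x, y in dataset if y == label]
        return sum(values) / len(values)

    boundary = (mean(0) + mean(1)) / 2

    def detector(x: float) -> int:
        return int(x >= boundary)

    return detector


candidates = [0.1, 0.2, 0.5, 0.8, 0.9]
dataset = build_dataset(candidates, toy_oracle)
detector = train_final_detector(dataset)
```

The detector is trained only after the dataset is complete and its outputs never flow back into `build_dataset`, so its mistakes cannot corrupt the labels. The price is that any error made by the oracle goes straight into the training data.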
We propose an oracle that works in a non-sequential
way in order to improve robustness.
VISAPP 2012 - International Conference on Computer Vision Theory and Applications