the process here has been described for image sequences,
the method is also able to process raw video data. In this
case, only one image per second is considered (i.e., one
frame out of every 25 at the standard video frame rate).
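The one-image-per-second sampling can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and its parameters are hypothetical, and the default of 25 frames per second is taken from the text above.

```python
from typing import List


def sampled_frame_indices(total_frames: int,
                          fps: float = 25.0,
                          period_s: float = 1.0) -> List[int]:
    """Return the indices of the frames kept when sampling one frame
    every `period_s` seconds from a video running at `fps`.

    All names here are hypothetical; only the "one image per second,
    one every 25 frames" rule comes from the paper.
    """
    if total_frames < 0 or fps <= 0 or period_s <= 0:
        raise ValueError("total_frames must be >= 0; fps and period_s must be > 0")
    # Number of frames between two retained images (25 for 25 fps, 1 s).
    step = max(1, round(fps * period_s))
    return list(range(0, total_frames, step))
```

For a 100-frame clip at 25 frames per second, the retained indices are 0, 25, 50 and 75.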
The proposed strategy leads to a significant reduction of
deformation accumulation over time, and allows for assembling
a large number of images. However, some deformation problems
remain; they are related to the limited functionalities of
the BoofCV API and to the use of a simple homography matrix.
We have thus considered OpenCV as a possible alternative solution.
2.3 Image Stitching with OpenCV
The OpenCV API (Bradski, 2000; Itseez, 2015) offers more
stitching options. It is thus possible to directly build a
mosaic from several images (i.e., more than two). Besides, a
wider range of warpers (plane, cylindrical, spherical, fisheye,
stereographic, etc.) is available. The homography matrix is
then replaced by a transformation function that is specific to
each warper. If several images are processed, deformations are
computed on all the images, thus limiting the cascaded
deformations of an overlapped image part.
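As an illustration of multi-image stitching, the sketch below uses the high-level Stitcher class of the OpenCV Python bindings. It rests on several assumptions: OpenCV 4.x is installed, the Python bindings are used (the paper does not state which bindings were used), and the wrapper function and its mode parameter are hypothetical. The stitching mode and warper actually used by the authors are not specified in the text.

```python
from typing import Sequence


def stitch_images(images: Sequence, mode: str = "panorama"):
    """Build a single mosaic from two or more images (NumPy arrays as
    returned by cv2.imread). Hypothetical helper, assuming OpenCV 4.x.
    """
    if len(images) < 2:
        raise ValueError("at least two images are needed to build a mosaic")

    import cv2  # imported lazily so the argument check works without OpenCV

    # PANORAMA and SCANS are the two presets offered by cv2.Stitcher;
    # which one suits aerial images best is left open here.
    cv_mode = cv2.Stitcher_SCANS if mode == "scans" else cv2.Stitcher_PANORAMA
    stitcher = cv2.Stitcher_create(cv_mode)

    # All images are registered and warped jointly in a single call.
    status, mosaic = stitcher.stitch(list(images))
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status code {status}")
    return mosaic
```

Passing the whole image list in one call lets the library register all images jointly rather than chaining pairwise transformations.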
Figure 11 (bottom) shows that performing image
stitching with OpenCV provides correct results when
assembling aerial images. However, some assembling
problems remain (see Fig. 3).
Figure 3: From left to right, a zoom on two images to be
stitched, and the assembling error with double occurrences
of humans (highlighted with white frames).
3 SHELLFISH GATHERER DETECTION
3.1 Human Detection
3.1.1 Related Work
Human detection is a widely addressed computer vision
problem, often associated with video monitoring or pedestrian
detection. Detecting humans gathering shellfish shares many
similarities with the well-known problem of pedestrian
detection. However, some differences remain that make
fishermen detection a challenging issue for which there is no
available solution yet (to the best of our knowledge). Among
the most important ones are the wide range of body positions
that can be observed among the fishermen during their
activity, as well as the unconstrained acquisition conditions
(photos manually taken from an airplane). Nevertheless, we
rely here on a standard object detection scheme combining
image description with machine learning.
Much work has been devoted to pedestrian detection or, more
generally, object detection. Probably the most popular
solution is the face detector introduced by Viola and Jones
(Viola and Jones, 2001), which combines Haar wavelet features
with an AdaBoost classifier. As described in several survey
papers on pedestrian detection (Dollár et al., 2009; Benenson
et al., 2014), improving the detection rate requires improving
both feature detection and machine learning algorithms. The
HOG descriptor (Dalal and Triggs, 2005) has been one of the
major advances on the feature description side for human
detection. It has been implemented in the OpenCV API with a
linear SVM classifier, and is a main feature used in many
detectors. When available, complementary description sources,
e.g., related to motion or stereo-vision information, can also
benefit the detection.
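The HOG and linear SVM combination available in OpenCV can be sketched as follows, assuming the OpenCV 4.x Python bindings. The wrapper function and its parameter values are hypothetical and chosen for illustration only. Note that the pretrained detector shipped with OpenCV targets upright pedestrians, so it is not expected to handle the body positions of shellfish gatherers without adaptation.

```python
from typing import List, Tuple


def detect_people(image,
                  win_stride: Tuple[int, int] = (8, 8),
                  scale: float = 1.05) -> List[Tuple[int, int, int, int, float]]:
    """Run OpenCV's default HOG + linear SVM people detector on an image
    (NumPy array). Returns (x, y, width, height, score) tuples.
    Hypothetical helper, assuming OpenCV 4.x.
    """
    if image is None:
        raise ValueError("image is None (was the file read correctly?)")

    import cv2  # imported lazily so the argument check works without OpenCV
    import numpy as np

    hog = cv2.HOGDescriptor()  # default 64x128 detection window
    # Linear SVM coefficients pretrained for upright pedestrians.
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    rects, weights = hog.detectMultiScale(image, winStride=win_stride, scale=scale)
    # The score array shape varies across versions; flatten it first.
    scores = np.asarray(weights, dtype=float).reshape(-1)
    return [(int(x), int(y), int(w), int(h), float(s))
            for (x, y, w, h), s in zip(rects, scores)]
```

A smaller window stride or scale step increases the number of windows evaluated, trading speed for recall.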
While many approaches have been introduced to solve the
pedestrian detection problem, only a few papers have tackled
it from aerial images. These two problems show significant
differences, especially since pedestrian detection is often
achieved through near-horizontal cameras, while aerial
detection rather considers either vertical (top-down) or
oblique images. This prevents a straightforward transfer of
the rich state of the art in pedestrian detection methods.
Nevertheless, a few works on human detection from aerial
imagery have been published. For instance, a shadow detector
is presented in (Reilly et al., 2010), but its application is
limited by the weather conditions (it requires sun
illumination, and imposes constraints on the camera
viewpoint). A part-based model for victim detection from UAV
is described in (Andriluka et al., 2010). Finally, human
detection from a UAV view is explored in (Blondel and Potelle,
2014), where the optimal acquisition angle to improve
detection from aerial images is discussed. The authors also
introduce an adaptation of HOG parameters to human detection
and a saliency map to increase the detection speed. The focus
is mainly on real-time detection, and there is no quantitative
evaluation of the detection accuracy.
3.1.2 The Case of Shellfish Gatherers
We perform human detection here on each single aerial image
or mosaic, and thus we do not rely on any motion information.
In this context, detection of
shellfish gatherers is a challenging problem (see Fig. 4
VISAPP 2016 - International Conference on Computer Vision Theory and Applications