Table 2: Comparison to SOA methods.
method      POM     3DMPP   Our method
Precision   87.20   97.5    91.28
Recall      95.56   95.5    95.01
evaluated on EPFL and PETS sequences (Utasi and Benedek, 2011), 395 frames with 1554 objects
evaluated on EPFL sequence, 179 frames with 661 objects
Table 3: Average processing time of steps (4 views, single-
threaded implementation, 2.4 GHz Core 2 Quad CPU).
foreground detection   51.2 ms   32.5 ms
forming cones          3.43 ms   4.87 ms
matching/detection     907 µs    618 µs
threaded implementation.
However, foreground detection and cone forming can be
done independently for each view, on multicore platforms
or even on smart cameras. Matching and detection require
the cone information of all views but are extremely fast,
so real-time processing would still be possible with
more views.
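The split between per-view work and joint work can be sketched as follows. This is a minimal illustration only, not the implementation measured in Table 3: the functions `detect_foreground`, `form_cones` and `match_cones` are hypothetical placeholders for the three steps, and the thread pool merely stands in for separate cores or smart cameras.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_foreground(frame):
    # Placeholder (assumption): keep pixels brighter than a fixed threshold.
    return [(x, y)
            for y, row in enumerate(frame)
            for x, value in enumerate(row)
            if value > 128]


def form_cones(view_id, foreground):
    # Placeholder (assumption): one "cone" record per foreground pixel.
    return [(view_id, x, y) for (x, y) in foreground]


def process_view(view_id, frame):
    # Per-view stage: uses only this view's frame, so it has no
    # dependency on the other views and can run on any core or camera.
    return form_cones(view_id, detect_foreground(frame))


def match_cones(cones_per_view):
    # Joint stage placeholder (assumption): the only step that needs
    # the cones of all views at once. Here it just merges them.
    return [cone for cones in cones_per_view for cone in cones]


def process_frame_set(frames):
    # Run the per-view stage concurrently, then the joint stage once.
    with ThreadPoolExecutor(max_workers=len(frames)) as pool:
        cones_per_view = list(
            pool.map(process_view, range(len(frames)), frames))
    return match_cones(cones_per_view)
```

In CPython, threads do not speed up CPU-bound pure-Python code; the pool is used here only to show where the per-view work could be distributed.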
Many methods, including POM and 3DMPP, project parts of
the foreground masks or whole masks to planes, which is
computationally expensive, and the computation cannot be
distributed because of data dependencies.
We proposed a multiview detection algorithm that extracts
the 3D positions of people using multiple calibrated and
synchronized views. Unlike other algorithms, our method
allows non-planar ground. This is done by modeling the
possible positions of feet with 3D primitives (cones in
scene space) and searching for intersections of these cones.
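As a simplified illustration of the geometric idea, the sketch below replaces each cone by its axis (a 3D line through the camera center) and computes the point closest to all axes in the least-squares sense. This is a stand-in under stated assumptions, not the cone intersection test of the method itself; the function names and the example camera positions are hypothetical.

```python
def det3(m):
    # Determinant of a 3x3 matrix given as nested lists.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    # Solve the 3x3 system a * p = b with Cramer's rule.
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("lines are parallel; no unique closest point")
    solution = []
    for k in range(3):
        a_k = [[b[i] if j == k else a[i][j] for j in range(3)]
               for i in range(3)]
        solution.append(det3(a_k) / d)
    return solution


def closest_point_to_lines(origins, directions):
    # Least-squares point nearest to the lines origin + t * direction.
    # Each line contributes the projector (I - u u^T) onto the plane
    # perpendicular to its unit direction u:
    #   sum(I - u u^T) * p = sum((I - u u^T) * origin)
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for origin, direction in zip(origins, directions):
        norm = sum(c * c for c in direction) ** 0.5
        u = [c / norm for c in direction]
        for i in range(3):
            for j in range(3):
                m = (1.0 if i == j else 0.0) - u[i] * u[j]
                a[i][j] += m
                b[i] += m * origin[j]
    return solve3(a, b)


# Hypothetical example: two cameras 3 units above the ground, both
# with a cone axis passing through the same foot position (2, 1, 0).
foot = closest_point_to_lines(
    origins=[(0.0, 0.0, 3.0), (4.0, 0.0, 3.0)],
    directions=[(2.0, 1.0, -3.0), (-2.0, 1.0, -3.0)],
)
```

Because the estimate is not constrained to a ground plane, the same computation applies when the feet lie at different heights, which is the case that non-planar ground introduces.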
For good precision, the height map of the ground should
be known. Our method can compute the height map on the
fly, reaching high precision after a startup time.
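One way to picture a height map that improves during a startup period is a grid of running averages, where each accepted foot detection refines the ground height of the cell it falls into. The sketch below is an illustration under assumptions, not the estimation procedure of the method: the class name `HeightMap`, the cell size, the z-up convention and the averaging rule are all hypothetical.

```python
import math


class HeightMap:
    """Grid of per-cell average ground heights (assumes the z axis is up)."""

    def __init__(self, cell_size=0.5):
        self.cell_size = cell_size  # cell edge length in scene units
        self.sums = {}              # cell -> sum of observed heights
        self.counts = {}            # cell -> number of observations

    def _cell(self, x, y):
        # Index of the grid cell that contains ground position (x, y).
        return (math.floor(x / self.cell_size),
                math.floor(y / self.cell_size))

    def update(self, x, y, z):
        # Add one detected foot position to the estimate of its cell.
        key = self._cell(x, y)
        self.sums[key] = self.sums.get(key, 0.0) + z
        self.counts[key] = self.counts.get(key, 0) + 1

    def height(self, x, y, default=0.0):
        # Average observed height of the cell, or `default` if unseen.
        key = self._cell(x, y)
        if key not in self.counts:
            return default
        return self.sums[key] / self.counts[key]


# Hypothetical usage: two detections fall into the same 1 x 1 cell.
height_map = HeightMap(cell_size=1.0)
height_map.update(0.2, 0.3, 0.10)
height_map.update(0.8, 0.9, 0.30)
```

Cells without observations fall back to the default height, which mirrors the idea that the estimate is only reliable after enough detections have been collected.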
After height map detection, we measured precision and
recall values comparable to SOA methods on a commonly
used data set. Our algorithm also worked well on the test
videos we made to demonstrate its capability of handling
non-planar ground.
In the future we plan to examine tracking people by their
leaning leg positions (Havasi et al., 2007).
