ing zoom level between the 12th and 14th seconds.
In particular, the decreasing confidence score (eq. 4)
triggers a motion that centers the target (Figure 4.D2),
unlike the Kalman-based strategy (Figure 4.D1). Then,
once the trajectory analyser detects that this unexpected
motion is a target stop, the zoom level rises again
thanks to the exponential speed-based term.
However, the system cannot zoom closer until a pan-tilt
motion allows zoom control (green plot on the second
graph of Figure 5, between the 15th and 20th seconds).
This is the main drawback of our method, as we chose
to preserve tracking continuity over target resolution.
Furthermore, the
IMM KF-based strategy is also better than the one
based on (Varcheie and Bilodeau, 2011), as shown
in Table 2. Precision is increased by over 30 percentage
points while centralization is also increased by
5 percentage points, thus reducing tracking failure
and increasing framerate (Fps). The main drawback
of the Varcheie-based third strategy is its motion trigger,
which leads to small accumulated motions that decrease
the framerate. For instance, many small motions are
triggered as the target stops between Figures 4.C3 and
4.D3, while the first strategy triggers none and the second
only one, to adjust the zoom parameter after detecting
the target stop. Furthermore, camera view angle and
scene context may quickly change the target appearance
during the 4th scenario, preventing the motion trigger in
the third strategy and decreasing both (P) and (C).
The target also goes out of the FoV when the trigger
condition is not met (Figures 4.B3 and 4.E3), increasing
fragmentation (TF). Finally, the average speed prediction
may drive the PTZ in the wrong direction because of
a distractor detection when the target leaves the
FoV, as in Figure 4.C3.
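The last failure mode can be illustrated with a minimal sketch. It assumes, hypothetically, that the aim point is obtained by averaging the frame-to-frame velocity over a short window and extrapolating it over the camera delay. The function name, the window content and the delay value are illustrative assumptions and are not taken from (Varcheie and Bilodeau, 2011).

```python
import numpy as np


def average_speed_prediction(positions, dt, delay):
    """Extrapolate the last position by the mean velocity over the window.

    positions: (N, 2) image positions in pixels (hypothetical input).
    dt: time between observations (s); delay: camera motion delay (s).
    """
    positions = np.asarray(positions, dtype=float)
    velocities = np.diff(positions, axis=0) / dt   # frame-to-frame velocities
    mean_velocity = velocities.mean(axis=0)
    return positions[-1] + mean_velocity * delay


# Target moving right by 10 px per observation (dt = 0.1 s).
clean = [(100, 50), (110, 50), (120, 50), (130, 50)]
# Same track, but the last detection jumps to a distractor on the left.
distracted = [(100, 50), (110, 50), (120, 50), (40, 50)]

aim_clean = average_speed_prediction(clean, dt=0.1, delay=0.5)
aim_distracted = average_speed_prediction(distracted, dt=0.1, delay=0.5)
print("aim point, clean track:     ", aim_clean)
print("aim point, distracted track:", aim_distracted)
```

In this toy case a single distractor detection reverses the sign of the averaged velocity, so the extrapolated aim point moves away from the true target direction.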
5 CONCLUSIONS
Only a few state-of-the-art systems track a person
with a single IP PTZ camera. This device is subject
to large and variable motion delays, especially off-the-shelf
PTZ cameras that cannot be entirely modeled. These
delays slow down the algorithm and increase the risk of
losing the target during camera motion. Our approach
focuses on managing these delays through a
perception-prediction-action strategy relying on three
innovative features. First, an improved prediction step
updates and anticipates the target position such that the
camera is centered on the target at the end of its motion.
We improved prediction performance with an
Interacting Multiple Model Kalman filter, which is
more resilient to abrupt motion changes, improving
pan-tilt control accuracy. This prediction filter also
gives a probabilistic estimate of the prediction reliability,
which allows a trajectory-enhanced zoom control.
The camera motion order is therefore more accurate,
and can be corrected by an interruption module that
takes advantage of camera control latency. Furthermore,
this strategy can be used with most tracking
algorithms that return a target position probability,
and requires almost no computational time to process.
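As an illustration of how an Interacting Multiple Model Kalman filter yields both a prediction and a reliability estimate, the following is a minimal one-dimensional sketch. It is not the implementation evaluated in this paper: the two motion models (constant velocity and stopped target), the noise values, the mode transition matrix and all function names are assumptions chosen for readability.

```python
import numpy as np

DT = 0.1                                   # assumed observation period (s)
H = np.array([[1.0, 0.0]])                 # only the position is measured
R = np.array([[4.0]])                      # assumed measurement noise (px^2)
# Model 0: constant velocity. Model 1: stopped target (velocity reset to 0).
F = [np.array([[1.0, DT], [0.0, 1.0]]),
     np.array([[1.0, 0.0], [0.0, 0.0]])]
Q = [np.diag([0.1, 1.0]), np.diag([0.1, 0.1])]   # assumed process noise
PI = np.array([[0.95, 0.05],
               [0.05, 0.95]])              # assumed mode transition matrix


def imm_step(xs, Ps, mu, z):
    """One IMM cycle: mixing, per-model Kalman filtering, mode update."""
    n = len(F)
    c = PI.T @ mu                          # predicted mode probabilities
    w = PI * mu[:, None] / c[None, :]      # w[i, j] = P(was mode i | now mode j)
    new_xs, new_Ps, lik = [], [], np.zeros(n)
    for j in range(n):
        # Mixed initial state and covariance for model j.
        x0 = sum(w[i, j] * xs[i] for i in range(n))
        P0 = sum(w[i, j] * (Ps[i] + np.outer(xs[i] - x0, xs[i] - x0))
                 for i in range(n))
        # Kalman prediction and update with model j.
        x_pred = F[j] @ x0
        P_pred = F[j] @ P0 @ F[j].T + Q[j]
        resid = z - H @ x_pred
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        new_xs.append(x_pred + K @ resid)
        new_Ps.append((np.eye(2) - K @ H) @ P_pred)
        # Gaussian likelihood of the innovation (scalar measurement).
        lik[j] = (np.exp(-0.5 * resid[0] ** 2 / S[0, 0])
                  / np.sqrt(2.0 * np.pi * S[0, 0]))
    mu_new = lik * c / np.sum(lik * c)     # updated mode probabilities
    return new_xs, new_Ps, mu_new


def run_demo():
    """Noise-free toy track: 30 steps at 5 px per step, then 5 steps stopped."""
    xs = [np.zeros(2), np.zeros(2)]
    Ps = [np.eye(2) * 1e3, np.eye(2) * 1e3]
    mu = np.array([0.5, 0.5])
    track = [5.0 * (k + 1) for k in range(30)] + [150.0] * 5
    history = []
    for z in track:
        xs, Ps, mu = imm_step(xs, Ps, mu, np.array([z]))
        history.append(mu)
    return history[29], history[-1]


mu_moving, mu_stopped = run_demo()
print("mode probabilities at the end of the motion:", np.round(mu_moving, 3))
print("mode probabilities 5 steps after the stop:  ", np.round(mu_stopped, 3))
```

In this toy run the mode probabilities favour the constant-velocity model while the target moves and shift toward the stopped model shortly after it halts. Such probabilities are one possible form of the reliability estimate mentioned above. A delay-compensated aim point could then be sketched as the probability-weighted position plus the probability-weighted velocity multiplied by the expected camera delay.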
The experiments we conducted demonstrate that our strategy
performs well in typical tracking situations. In particular,
our IMM KF-based prediction is more efficient
than the one based on a Kalman filter and fails less often
in case of unexpected trajectory breaks.
We also show that our innovations improve robustness
to context and motion changes compared to
the state-of-the-art method (Varcheie and Bilodeau,
2011), which shares a similar perception-prediction-action
strategy. Further investigations will focus
on increasing zoom control performance, in particular
to increase reactivity to target behaviour. We will then
apply our monocular approach to a collaborative
PTZ network with partially common FoV.
REFERENCES
Ahmed, J., Ali, A., and Khan, A. (2012). Stabilized active
camera tracking system. In Journal of Real-Time
Image Processing.
Al Haj, M., Bagdanov, A., Gonzalez, J., and Roca, F.
(2010). Reactive object tracking with a single PTZ camera.
In Pattern Recognition (ICPR), 2010 20th International
Conference on, pages 1690–1693.
Bellotto, N., Sommerlade, E., Benfold, B., Bibby, C., Reid,
I., Roth, D., Fernandez, C., Van Gool, L., and Gonzalez,
J. (2009). A distributed camera system for multi-resolution
surveillance. In Distributed Smart Cameras,
2009. ICDSC 2009. Third ACM/IEEE International
Conference on, pages 1–8.
Bernardin, K. and Stiefelhagen, R. (2008). Evaluating multiple
object tracking performance: the CLEAR MOT metrics.
J. Image Video Process., 2008:1:1–1:10.
Chang, F., Zhang, G., Wang, X., and Chen, Z. (2010). PTZ
camera target tracking in large complex scenes. In Intelligent
Control and Automation (WCICA), 2010 8th
World Congress on.
Choi, H., Park, U., Jain, A., and Lee, S. (2011). Face tracking
and recognition at a distance: A coaxial & concentric
PTZ camera system. In IEEE Transactions on
Circuits and Systems for Video Technology.
Dalal, N. and Triggs, B. (2005). Histograms of oriented gradients
for human detection. In Computer Vision and
Pattern Recognition, IEEE Computer Society Conference on.
Dinh, T., Qian, Y., and Medioni, G. (2009). Real time
tracking using an active pan-tilt-zoom network camera.
In IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS).