one detection step, which is less time-consuming.
Indeed, our tracker is initialized by a detection step
and the models are never updated, which demonstrates
the efficiency of the method. We also show that our
tracking can be embedded into a complete framework
dedicated to text removal. In future work, we plan to
overcome the limitations of our approach and to handle
stronger transformations. A simple solution is to add
frequent detections to update the model(s). Another
solution is to extend the state space (by adding scale
parameters, for example). Usually, enlarging the state
space means increasing the number of particles needed
to achieve good tracking performance. However, because
the tangent distance can also handle small
transformations, the state space can be sampled more
coarsely, and thus with fewer particles. We therefore
expect that, even if we increase the state space
dimension, fewer particles will be needed to achieve
lower tracking errors, which would also reduce
computation times.
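The claim that a tangent distance absorbs small transformations can be illustrated with a minimal sketch. It assumes a one-sided tangent distance restricted to translation tangents, approximated by finite-difference image gradients. The function names `shift_tangents` and `one_sided_tangent_distance` are hypothetical and are not taken from our implementation.

```python
import numpy as np

def shift_tangents(template):
    """Approximate tangent vectors for horizontal and vertical translation.

    Assumption of this sketch: the tangents are the finite-difference
    image gradients, flattened into columns.
    """
    gy, gx = np.gradient(template.astype(float))  # d/rows, d/cols
    return np.stack([gx.ravel(), gy.ravel()], axis=1)  # (n_pixels, 2)

def one_sided_tangent_distance(template, observation, tangents):
    """Squared distance from `observation` to the tangent plane of `template`.

    Solves min_a || template + tangents @ a - observation ||^2 by
    linear least squares and returns (squared distance, coefficients a).
    """
    residual = (observation.astype(float) - template.astype(float)).ravel()
    coeffs, *_ = np.linalg.lstsq(tangents, residual, rcond=None)
    remainder = residual - tangents @ coeffs
    return float(remainder @ remainder), coeffs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    template = rng.random((8, 8))
    tangents = shift_tangents(template)
    candidate = rng.random((8, 8))
    td, _ = one_sided_tangent_distance(template, candidate, tangents)
    euclid = float(((candidate - template) ** 2).sum())
    # The tangent distance never exceeds the squared Euclidean distance,
    # because a = 0 is always a feasible choice of coefficients.
    print(td <= euclid)
```

Under these assumptions, an observation that differs from the template only along the tangent directions is at distance close to zero, which is why a coarser sampling of the corresponding state dimensions may be sufficient.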
VISAPP 2015 - International Conference on Computer Vision Theory and Applications