struction from video footage (Schöning, 2015) and
for video inpainting. For other applications, such as the analysis of gaze data (Kurzhals et al., 2014b), which mostly depend on a rectangular description of the AOI, this rectangle can easily be derived from the polygon description.
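Such a rectangular AOI can be obtained as the axis-aligned bounding box of the polygon vertices. The following minimal sketch illustrates this; the list-of-(x, y)-tuples representation of the polygon is an assumption for illustration, not iSeg's internal data structure.

```python
# Minimal sketch (not iSeg's actual code): derive a rectangular AOI from a
# polygonal one by taking the axis-aligned bounding box of its vertices.
# The (x, y) tuple representation of the polygon is assumed here.

def polygon_to_rect(polygon):
    """Return the bounding rectangle (x, y, width, height) of a polygon
    given as a list of (x, y) vertex coordinates."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x_min, y_min = min(xs), min(ys)
    x_max, y_max = max(xs), max(ys)
    return (x_min, y_min, x_max - x_min, y_max - y_min)

if __name__ == "__main__":
    aoi_polygon = [(12, 30), (48, 22), (60, 55), (25, 64)]
    print(polygon_to_rect(aoi_polygon))  # (12, 22, 48, 42)
```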
To overcome the remaining restrictions of our implementation, we will extend the activity diagram (Fig. 3) with components that detect deformation and rotation. This will significantly improve the affine transformation of the AOI.
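One possible way to realize such a component is sketched below under the assumption of an OpenCV-based pipeline; this is not the published iSeg implementation. The idea is to estimate a rotation- and scale-aware transform from ORB keypoint matches (Rublee et al., 2011) between consecutive frames and to apply it to the polygon vertices of the AOI.

```python
# Hedged sketch (assumes OpenCV; not the actual iSeg component): estimate a
# rotation/scale-aware partial affine transform between consecutive frames
# from ORB keypoint matches and apply it to the AOI polygon.
import cv2
import numpy as np

def update_aoi(prev_frame, next_frame, polygon):
    """polygon: float32 array of shape (N, 1, 2) with AOI vertex coordinates."""
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(prev_frame, None)
    kp2, des2 = orb.detectAndCompute(next_frame, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # Robust RANSAC estimate of translation, rotation, and uniform scale.
    M, _ = cv2.estimateAffinePartial2D(src, dst)
    if M is None:
        return polygon  # fall back to the unchanged AOI
    return cv2.transform(polygon, M)
```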
The current prototype of iSeg is licensed under the GPLv3 and is available online (source code and binaries for Ubuntu, Mac OS X, and Windows: https://ikw.uos.de/~cv/projects/iSeg).
ACKNOWLEDGEMENTS
This work was funded by the German Research Foundation (DFG) as part of the Scalable Visual Analytics Priority Program (SPP 1335).
REFERENCES
Alt, H. and Guibas, L. J. (1996). Discrete geometric shapes: Matching, interpolation, and approximation: A survey. Technical report, Handbook of Computational Geometry.
Boykov, Y., Veksler, O., and Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell., 23(11):1222–1239.
Caselles, V., Kimmel, R., and Sapiro, G. (1997). Geodesic active contours. Int J Comput Vision, 22(1):61–79.
Cobos, F. and Peetre, J. (1991). Interpolation of compact operators: the multidimensional case. Proc. Lond. Math. Soc., 3(2):371–400.
Dasiopoulou, S., Giannakidou, E., Litos, G., Malasioti, P., and Kompatsiaris, Y. (2011). A survey of semantic image and video annotation tools. Lect Notes Comput Sc, pages 196–239.
Doermann, D. and Mihalcik, D. (2000). Tools and techniques for video performance evaluation. International Conference on Pattern Recognition, 4:167–170.
Ferryman, J. and Shahrokni, A. (2009). PETS2009: Dataset and challenge. IEEE International Workshop on Performance Evaluation of Tracking and Surveillance.
Gotsman, C. and Surazhsky, V. (2001). Guaranteed intersection-free polygon morphing. Comput Graph, 25(1):67–75.
Höferlin, B., Höferlin, M., Heidemann, G., and Weiskopf, D. (2015). Scalable video visual analytics. Inf Vis, 14(1):10–26.
Kurzhals, K., Bopp, C. F., Bässler, J., Ebinger, F., and Weiskopf, D. (2014a). Benchmark data for evaluating visualization and analysis techniques for eye tracking for video stimuli. Workshop on Beyond Time and Errors: Novel Evaluation Methods for Visualization.
Kurzhals, K., Heimerl, F., and Weiskopf, D. (2014b). ISeeCube: Visual analysis of gaze data for video. Symposium on Eye Tracking Research and Applications, pages 43–50.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. Int J Comput Vision, 60(2):91–110.
Muja, M. and Lowe, D. G. (2009). Fast approximate nearest neighbors with automatic algorithm configuration. International Conference on Computer Vision Theory and Applications, 2:331–340.
Multimedia Knowledge and Social Media Analytics Laboratory (2015). Video image annotation tool. http://mklab.iti.gr/project/via.
Rother, C., Kolmogorov, V., and Blake, A. (2004). “GrabCut”: Interactive foreground extraction using iterated graph cuts. ACM Trans Graph, 23(3):309–314.
Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011). ORB: An efficient alternative to SIFT or SURF. IEEE International Conference on Computer Vision, pages 2564–2571.
Schöning, J. (2015). Interactive 3D reconstruction: New opportunities for getting CAD-ready models. In Imperial College Computing Student Workshop, volume 49, pages 54–61. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
Schöning, J., Faion, P., and Heidemann, G. (2015). Semi-automatic ground truth annotation in videos: An interactive tool for polygon-based object annotation and segmentation. In International Conference on Knowledge Capture, pages 17:1–17:4. ACM, New York.
Schroeter, R., Hunter, J., and Kosovic, D. (2003). Vannotea - A collaborative video indexing, annotation and discussion system for broadband networks. Workshop on Knowledge Markup & Semantic Annotation, pages 1–8.
Shneiderman, B. (1984). Response time and display rate in human performance with computers. ACM Comput Surv, 16(3):265–285.
Tanisaro, P., Schöning, J., Kurzhals, K., Heidemann, G., and Weiskopf, D. (2015). Visual analytics for video applications. it - Information Technology, 57:30–36.
Wu, S., Zheng, S., Yang, H., Fan, Y., Liang, L., and Su, H. (2014). SAGTA: Semi-automatic ground truth annotation in crowd scenes. IEEE International Conference on Multimedia and Expo Workshops.
Yao, A., Gall, J., Leistner, C., and Van Gool, L. (2012). Interactive object detection. International Conference on Pattern Recognition, pages 3242–3249.
YouTube (2015). Statistics - YouTube. https://www.youtube.com/yt/press/statistics.html.