ning of the first large red block in the tracking view.
Hence, we know that the jump is caused by a tracking-
lost event (which is expected and correct). If, how-
ever, the jump corresponded to a tracking state (blue
in the model state view), this would have shown se-
vere tracking problems, as the tracking would have
created jumps (not in line with our knowledge of the
studied phenomenon) and would have marked these
as valid tracked frames.
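This consistency check can also be run offline. The following Python sketch (the array names and the jump threshold are our own assumptions for illustration, not part of the described tool) flags frames where a large positional jump of the tracked model is not accompanied by a tracking-lost state:

```python
import numpy as np

def find_suspicious_jumps(positions, tracking_lost, jump_threshold=0.05):
    """Flag frames where the tracked teat model jumps by more than
    jump_threshold (in scene units) while the tracker still reports
    a valid state; such frames would indicate a real tracking problem.

    positions     : (N, 3) array of estimated model positions per frame
    tracking_lost : (N,) boolean array, True where tracking was lost
    """
    # Distance moved by the model between consecutive frames.
    step = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    suspicious = []
    for i, d in enumerate(step, start=1):
        # Jumps that coincide with a tracking-lost event are expected;
        # jumps on frames marked as validly tracked are not.
        if d > jump_threshold and not tracking_lost[i]:
            suspicious.append(i)
    return suspicious
```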
The frame data view shows the amplitude, depth,
and point cloud data acquired from the ToF camera
for the frame selected in the other views, as well as
numerical statistics on this frame (number of matches
and values of the model metrics). These ‘details on
demand’ allow refining the insight obtained from the
overviews. All views are linked by interactive selec-
tion – clicking on a time-instant or position in the
overviews shows details of the selected frame in the
frame data view. For instance, the frame data in Fig. 7
corresponds to the moment C discussed above. As
visible in the amplitude image, the two back teats
are now connected to the suction cups of the milk-
ing robot. In such cases, tracking is expected to be
lost (due to the robot being too close to the udder).
Hence, we have explained that the tracking-lost event
observed in the TTS and model-state views is ex-
pected and not due to a tracker problem.
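As an illustration of how such linked selection can be wired up, the minimal sketch below connects a click on an overview's time axis to a callback that redraws the frame-data view; the function and parameter names are hypothetical and only suggest one possible implementation on top of matplotlib.

```python
def link_overview_to_frame_view(fig, overview_ax, frames, show_frame):
    """Connect clicks on the overview's time axis to the frame-data view.

    frames     : per-frame records (amplitude, depth, point cloud, statistics)
    show_frame : callback that renders one such record in the detail view
    """
    def on_click(event):
        # Ignore clicks outside the overview axes.
        if event.inaxes is not overview_ax or event.xdata is None:
            return
        idx = int(round(event.xdata))            # x axis assumed to be the frame index
        idx = max(0, min(idx, len(frames) - 1))  # clamp to the valid frame range
        show_frame(frames[idx])                  # 'details on demand'
        fig.canvas.draw_idle()

    fig.canvas.mpl_connect('button_press_event', on_click)
```

Here, `show_frame` would update the amplitude, depth, and point-cloud plots together with the per-frame statistics.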
The analysis tool allows browsing a video either frame by frame or playing it in real time, so that correlations between tracking performance and algorithm variables can be easily seen. Using this tool, we
have been able to refine our proposed detection-and-
tracking algorithms, fine-tune their parameters, and
also validate the end-to-end tracking performance of
our system. Overall, we have evaluated it on over 15 real-life videos, each several minutes long, acquired in actual stables in a production environment and covering a wide range of camera-to-subject distances, angles,
and motion paths. Average tracking performance
amounts to over 90% of the frames being success-
fully tracked. This clearly exceeds the documented
performance of comparable systems (LMI Technolo-
gies, 2012; Scott Milktech Ltd., 2013; MESA Imag-
ing, 2014; Westberg, 2009; Hunt, 2006).
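For reference, the tracking statistic quoted above is simply the fraction of frames the tracker reports as valid, averaged over the test videos. A small Python sketch of that bookkeeping (the dictionary layout is an assumption for illustration) is:

```python
def tracking_success_rates(per_video_flags):
    """per_video_flags maps a video name to a list of booleans,
    True for every frame the tracker reports as successfully tracked."""
    rates = {name: (sum(flags) / len(flags) if flags else 0.0)
             for name, flags in per_video_flags.items()}
    overall = sum(rates.values()) / len(rates) if rates else 0.0
    return rates, overall

# Example: two short hypothetical videos with a few lost frames each.
rates, overall = tracking_success_rates({
    "stable_A_take1": [True] * 95 + [False] * 5,   # 95% tracked
    "stable_B_take3": [True] * 90 + [False] * 10,  # 90% tracked
})
```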
6 CONCLUSIONS
We present an end-to-end system for the detection of cow teats for automatic milking devices (AMDs) in the dairy industry. We describe several techniques and algorithms that make this detection robust and fully automated when using a very low-resolution time-of-flight camera, which renders classical computer vision algorithms inapplicable. By combining depth
and point cloud information analysis with observed
model priors, we achieve a simple and robust imple-
mentation that can successfully track over 90% of the
frames present in typical AMD videos, which exceeds
the performance of all known competitive solutions
in the area. In contrast to these solutions, our pro-
posal is also fully automated, allows large relative
camera-subject motions and orientation changes, and
accounts for occlusions. We present a visual analytics
tool that allows tracker refinement and result valida-
tion.
Several extension directions are possible. First, different teat detectors can be designed to find teats more accurately under extreme zoom-out conditions, e.g. based on 3D template matching. Second, using a more complex model that includes both the teats and the udder shape should further improve tracking performance under heavy occlusion. Such refine-
ments will lead to a more effective solution for the
next generation of AMD robots for the dairy industry.
REFERENCES
Agarwal, A. and Triggs, B. (2006). Recovering 3D hu-
man pose from monocular images. IEEE TPAMI,
28(1):44–58.
Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. (2008).
Speeded-up robust features (SURF). CVIU, 110(3):346–359.
Chen, D., Farag, A., Falk, R., and Dryden, G. (2009). A
variational framework for 3D colonic polyp visualiza-
tion in virtual colonoscopy. In Proc. IEEE ICIP, pages
2617–2620.
Dey, T. and Goswami, S. (2004). Provable surface recon-
struction from noisy samples. In Proc. SCG, pages
428–438.
Dey, T., Li, K., Ramos, E., and Wenger, R. (2009). Isotopic
reconstruction of surfaces with boundaries. CGF,
28(5):1371–1382.
Distante, C., Diraco, G., and Leone, A. (2010). Active range
imaging dataset for indoor surveillance. Ann. BMVA,
21(3):1–16.
Dorrington, A., Payne, A., and Cree, M. (2010). An
evaluation of time-of-flight range cameras for close
range metrology applications. ISPRS J. Photogramm.,
38(5):201–206.
Hoppe, H., DeRose, T., Duchamp, T., McDonald, J., and
Stuetzle, W. (1992). Surface reconstruction from un-
organized points. Proc. ACM SIGGRAPH, 26(2):71–
78.
Hovinen, M., Aisla, A., and Pyörälä, S. (2005). Visual de-
tection of technical success and effectiveness of teat
cleaning in two automatic milking systems. J. Dairy
Sci., 88:3354–3362.
Hunt, A. (2006). Teat detection for an auto-
matic milking system. In MSc thesis, Univ.