1.2 Paper Outline
This paper starts by introducing the related work in person tracking and personal informatics in Section 2. The application to office ergonomics and the proposed vision system are defined in Section 3. The multi-camera recording setup and the data used in the experiments are presented in Section 4. Section 5 presents the results on single-camera ergonomics analysis and on fusion experiments on general mobility. The paper concludes with a discussion in Section 6.
2 RELATED WORK
Gaze tracking has been used for many applications, from analyzing the impact of advertisements in marketing studies to developing innovative interfaces for HCI (Hansen and Ji, 2010). The most widely used methods are based on video devices, because they are unobtrusive and cheap. Much work has been done to improve their performance, e.g., by using prior knowledge about the scene under a saliency framework (Valenti et al., 2012), or by incorporating multiple cameras (Chen and Aghajan, 2011). In this paper the estimation of gaze is simplified into a common head tracking problem.
For eye blink detection, (Chau and Betke, 2005) proposed an approach in which the eye location is detected from a temporal difference image when the user blinks, and templates for open eyes are created on-line. Local template matching tracks the eye location, and blinks are detected by thresholding the correlation score. A blink detector using GPU-based SIFT tracking was proposed in (Lalonde et al., 2007). In this paper the eye locations are given by the tracked head, and blinks are detected adaptively based on the accumulated pixel differences at the estimated locations.
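As an illustration of this last step, the minimal sketch below accumulates pixel differences over an eye region and applies an adaptive threshold derived from running statistics of that score. The class interface, parameter values, and the exact adaptation rule are assumptions for illustration, not the system's actual implementation.

```python
# Sketch: adaptive blink detection from accumulated pixel differences
# at the eye locations provided by the head tracker.
import numpy as np

class BlinkDetector:
    def __init__(self, k=3.0, alpha=0.05):
        self.k = k          # sensitivity: blink if score > mean + k * std
        self.alpha = alpha  # update rate of the running statistics
        self.mean = 0.0
        self.var = 1.0
        self.prev = None

    def update(self, eye_patch_gray):
        """eye_patch_gray: grayscale eye region cropped around the eye
        location given by the tracked head."""
        cur = eye_patch_gray.astype(np.float32)
        if self.prev is None or self.prev.shape != cur.shape:
            self.prev = cur
            return False
        # accumulated pixel difference over the eye region (normalized)
        score = float(np.abs(cur - self.prev).sum()) / cur.size
        self.prev = cur
        # adaptive threshold from the running mean and variance of the score
        is_blink = score > self.mean + self.k * np.sqrt(self.var)
        self.mean = (1.0 - self.alpha) * self.mean + self.alpha * score
        self.var = (1.0 - self.alpha) * self.var + self.alpha * (score - self.mean) ** 2
        return is_blink
```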
The detection and tracking of people is a necessity for many applications, but it poses challenging problems due to cluttered environments with occlusions, moving background objects, and multiple people. For example, a framework that exploits both detection and tracking methods with an articulated body model for tracking multiple people has been proposed in (Andriluka et al., 2008). In this paper we apply a combination of image segmentation and template matching, as sketched below, because the interest is not in the specific posture of a person, but in the mobility of the tracked person. Person tracking can also help to alleviate privacy concerns by focusing the analysis on the specific person only, and thus ignoring individuals who want to remain anonymous.
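The following minimal sketch, using OpenCV, illustrates one way such a combination could look: foreground segmentation restricts the template search to moving regions, and the frame-to-frame displacement of the match serves as a simple mobility measure. The function names, thresholds, and parameter values are assumptions for illustration, not the paper's exact implementation.

```python
# Sketch: person tracking by combining foreground segmentation with
# template matching; mobility is the displacement of the match between frames.
import cv2
import numpy as np

bg_sub = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def track_step(frame_gray, template, prev_xy):
    """frame_gray: current grayscale frame; template: grayscale patch of the
    tracked person; prev_xy: previous (x, y) location of the match.
    Returns the new location and the frame-to-frame displacement."""
    fg_mask = bg_sub.apply(frame_gray)
    # suppress the static background so the template is matched on moving regions
    masked = cv2.bitwise_and(frame_gray, frame_gray, mask=fg_mask)
    scores = cv2.matchTemplate(masked, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(scores)
    if max_val < 0.5:  # weak match: keep the previous position
        return prev_xy, 0.0
    displacement = float(np.hypot(max_loc[0] - prev_xy[0],
                                  max_loc[1] - prev_xy[1]))
    return max_loc, displacement
```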
Gathering comprehensive personal information has recently become possible with the advent of ubiquitous sensors and computing power. A survey of how personal information is collected through ubiquitous sensors and reflected upon can be found in (Li et al., 2010). For example, the generation of a daily activity summary for triggering bad-posture alarms was proposed in (Jaimes, 2005). In this paper we are interested in gathering specific properties related to office ergonomics, such as head mobility and rest breaks.
Detecting body posture and interactions with other people is essential for improving wellbeing. A 20-year study (Shirom et al.) found a strong link between higher levels of peer social support and lowered risk of mortality. (Chen and Aghajan, 2011) described methods for estimating the locations and head orientations of multiple users. Based on these two attributes, an interaction detector was trained to identify social events. The influence of these social events on behavior was studied in (Chen et al., 2011). In this paper we propose to compare the inferred office behavior to official ergonomic guidelines, and to use these comparisons to drive the adaptive recommendation system.
3 PROPOSED VISION SYSTEM
In the proposed vision system, there are two main categories of cameras: the personal webcam and the ambient cameras. Additionally, an ambient camera that observes only the area of a person's desk is referred to as a dedicated camera.
3.1 Application to Office Ergonomics
Ergonomics guidelines usually provide only high-level recommendations that are generic to a specific industry or task, and do not take personal preferences and habits into account. Therefore, warnings that strictly adhere to the guidelines might become annoying to the users, and could even jeopardize work efficiency and productivity. To address this problem, a multi-camera supported system that learns personal habits and preferences is proposed. An overview of the proposed system is illustrated in Figure 2.
The frontal personal camera above the user's computer screen extracts ergonomics-related attributes.
The ambient cameras monitor the entire office and
record how multiple users utilize the office space.
Data extracted by these cameras is sent to a central
processing unit. The attributes are first combined
by a data fusion process and then used to learn the