of the associated algorithms (e.g., Viola-Jones
face detector, Support Vector Machine classifier).
• Easily portable to different platforms. The ap-
plication must run on a mobile device (e.g.,
Android-based smartphone) mounted on the ve-
hicle’s dashboard. Ideally, it should be easily
portable to other (e.g., iOS-based) mobile devices
of comparable size and computational capabili-
ties.
• Computationally non-intensive. Since (near) real-time performance is required, algorithms must be optimized to ensure continuous monitoring of the driver's state without excessively burdening the device's main processor. As a side benefit, battery consumption is reduced as well.
• Accuracy. One of the main challenges of design-
ing such a system is related to the fact that both
type I and type II errors are highly undesirable,
for different reasons: type I errors (false positives)
will annoy the driver and reduce their willingness
to use the system (due to excessive false alarms),
whereas type II errors (false negatives) can have
literally catastrophic consequences and defeat the
purpose of the entire system.
• Robustness. The system must be tolerant to modest lighting variations, relative camera motion (e.g., due to poor road conditions), changes in the driver's visual appearance (even in the course of a session, e.g., by wearing/removing a hat or sunglasses), differences in camera resolution and frame rate, and the different computational capabilities of the device's processor.
Some of the anticipated constraints and limitations
faced by the proposed system include:
• Lighting conditions. Frequent and drastic changes in the darkness or brightness of a scene (or part of it), which may occur even during the shortest driving intervals, have proven to be a significant challenge for many computer vision algorithms.
• Camera motion. Poor road conditions, as well as a more aggressive driving style, can introduce a significant amount of vibration into the driving experience. These vibrations can be transferred to the camera and cause distortion in the captured images, which can significantly skew the results and decrease the overall performance of the system.
• Relative positioning of device. The camera must be positioned within a certain range from the driver and within a certain viewing angle. Every computer vision algorithm has a “comfort zone” in which it performs best and most reliably; outside that comfort zone, performance can drop significantly.
• Hardware and software limitations. Typical mobile devices have one or two processor cores, less working memory, and lower clock frequencies than their desktop counterparts. These compromises reduce energy consumption, but they create a significant obstacle in designing this type of system.
• Driver cooperation. Last, but certainly not least,
all driver drowsiness detection systems assume a
cooperative driver, who is willing to assist in the
setup steps, keep the monitoring system on at all
times, and take proper action when warned by the
system of potential risks due to detected drowsi-
ness.
2.2 System Architecture
Our driver drowsiness detection system consists of four main stages (Figure 1).
Figure 1: Four stages of Drowsiness Detection System
2.2.1 Detection Stage
This is the initialization stage of the system. Every
time the system is started it needs to be set up and op-
timized for current user and conditions. The key step
in this stage is successful head detection (Figure 2). If
the driver’s head is correctly located we can proceed
to extract the features necessary for setting up the sys-
tem. Setup steps include: (i) extracting driver’s skin
color and using that information to create custom skin
color model and (ii) collecting a set of open/closed
eyes samples, along with driver’s normal head posi-
tion.
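The sketch below illustrates how this initialization step could look in code. It uses Python with OpenCV, which the paper does not prescribe; the Haar-cascade file, the hue/saturation histogram, and the function names are our own illustrative assumptions, with a Viola-Jones detector chosen because it is cited above as an example of the associated algorithms.

```python
import cv2

# Hypothetical initialization-stage helpers; OpenCV and the HSV histogram
# are illustrative assumptions, not prescribed by the system description.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_head(frame_bgr):
    """Locate the driver's head with a Viola-Jones (Haar cascade) detector."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                          minNeighbors=5, minSize=(60, 60))
    # Return the largest detection, assumed to be the driver.
    return max(faces, key=lambda r: r[2] * r[3]) if len(faces) else None

def build_skin_model(frame_bgr, face_rect):
    """Build a custom skin-color model as a hue/saturation histogram
    sampled from the central portion of the detected face region."""
    x, y, w, h = face_rect
    roi = frame_bgr[y + h // 4: y + 3 * h // 4, x + w // 4: x + 3 * w // 4]
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    return hist  # later usable with back-projection to segment skin regions
```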
To help achieve these goals, user interaction might be required. The driver might be asked to sit comfortably in their normal driving position so that the system can determine the upper and lower thresholds needed for detecting potential nodding. The driver might also be asked to hold their eyes closed and then open for a few seconds each time. This is enough to get the system started. Over time, the system will expand the dataset of obtained images and become more error-resistant and more robust overall.
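A minimal sketch of this user-assisted calibration is shown below, in the same Python register. The class name, the 15% nodding margin, and the way eye samples are stored are hypothetical illustrations; the paper only states that upper/lower head-position thresholds and a set of open/closed eye samples are collected.

```python
import numpy as np

class DriverCalibration:
    """Hypothetical sketch of the user-assisted setup: derive nodding
    thresholds from the driver's normal head position and collect labeled
    open/closed eye samples to seed the eye-state classifier."""

    def __init__(self, nod_margin=0.15):
        self.nod_margin = nod_margin   # assumed fraction of head height
        self.eye_samples = []          # list of (image_patch, label) pairs
        self.upper_threshold = None
        self.lower_threshold = None

    def set_normal_head_position(self, head_rects):
        """head_rects: (x, y, w, h) head boxes gathered while the driver
        sits comfortably in their normal driving position."""
        ys = np.array([r[1] for r in head_rects], dtype=float)
        hs = np.array([r[3] for r in head_rects], dtype=float)
        baseline_y, head_h = ys.mean(), hs.mean()
        # Vertical bounds used later to flag potential nodding.
        self.upper_threshold = baseline_y - self.nod_margin * head_h
        self.lower_threshold = baseline_y + self.nod_margin * head_h

    def add_eye_samples(self, patches, label):
        """Store eye patches recorded while the driver was asked to keep
        the eyes 'open' or 'closed' for a few seconds."""
        self.eye_samples.extend((p, label) for p in patches)
```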