2 STATE OF THE ART AND REQUIREMENTS
Although motion capture (MoCap) systems have traditionally been used to animate virtual actors in Virtual Reality applications, with the growth of Augmented Reality (AR) these systems are increasingly used to track a real-world user who interacts with virtual objects in the augmented scene.
One of the earliest examples found is the ALIVE system (Maes, 1997), where a video camera captures images of a person both to detect the user's movements and to integrate the real image into a virtual world. This can be considered a precursor of AR, where video images are integrated with 3D objects (such as a dog) that are activated by the user's movements.
Several AR systems make use of markers (usually geometric figures) to detect positions (Mulloni, 2009). However, these systems do not provide enough accuracy to move a virtual character and all of its joints. In some cases, optical systems with markers (Dorfmuller, 1999) are used to detect positions; in others, more complex systems based on ultrasound or inertial sensors are used to detect movements over a wide operating range (Foxlin, 1998; Vlasic, 2007).
Regarding our system requirements, it is necessary to track the positions of different body parts in order to move an avatar representing the user. Moreover, taking into account the intended users, who frequently experience sensory difficulties (Bogdashina, 2003), it is preferable that they do not have to wear complex devices. Thus, mechanical, electromagnetic, and other invasive devices were discarded from the outset.
The release of PrimeSense OpenNI for programming the Kinect in December 2010 opened up new possibilities for us. Kinect incorporates an RGB camera (640x480 pixels at 30 Hz) and a depth sensor. This makes it especially suitable for an AR system, where images from the real world are needed. The depth sensor provides distance information that is useful for placing objects correctly in the augmented scene.
Some researchers have started to use it in their applications to track people (Kimber, 2011), but we have found no references to the use of this device in Augmented Reality applications to control avatars.
PrimeSense OpenNI (OpenNI, 2011) provides the positions and orientations of a number of skeleton joints, which can be used to control our virtual puppet. The first tests performed with this system were satisfactory with regard to motion capture. However, it has a drawback: calibration requires the user to remain still for a few seconds in a specific posture in front of the camera. Since one of the system's final educational objectives is to train individuals with autism to match postures, it makes no sense to require them to copy a posture just to enter the game. The release of Microsoft's Kinect SDK solved this calibration problem, which led us to choose it for our system.
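As an illustration, the following minimal sketch shows how tracked skeleton joints can be read with the native (C++) API of the Microsoft Kinect SDK; it is not the system's actual code, and the puppet-update step is only indicated by a comment. Note that, unlike OpenNI at the time, no calibration pose is required before tracking starts.

```cpp
#include <Windows.h>
#include <NuiApi.h>

int main()
{
    // Initialize the sensor for skeleton tracking only.
    if (FAILED(NuiInitialize(NUI_INITIALIZE_FLAG_USES_SKELETON)))
        return 1;
    NuiSkeletonTrackingEnable(NULL, 0);

    NUI_SKELETON_FRAME frame = {0};
    for (int f = 0; f < 300; ++f)   // demo: roughly 10 s at 30 fps
    {
        // Wait up to 100 ms for the next skeleton frame.
        if (FAILED(NuiSkeletonGetNextFrame(100, &frame)))
            continue;

        for (int i = 0; i < NUI_SKELETON_COUNT; ++i)
        {
            const NUI_SKELETON_DATA& s = frame.SkeletonData[i];
            if (s.eTrackingState != NUI_SKELETON_TRACKED)
                continue;

            // Joint positions come in meters, in camera coordinates.
            Vector4 head = s.SkeletonPositions[NUI_SKELETON_POSITION_HEAD];
            // ... map 'head' (and the other joints) onto the puppet ...
            (void)head;
        }
    }

    NuiShutdown();
    return 0;
}
```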
The result is a system that captures user positions under varying lighting conditions, with no special clothing or markers required.
3 SYSTEM DESCRIPTION
After comparing the pros and cons of different motion capture devices, a Pictogram Room based on Kinect was built. The system consists of a visualization screen measuring 3 x 2 meters, a projection or retro-projection system (depending on the room where it is installed), a PC, a Kinect device, and speakers. Kinect is equipped with two cameras, an infrared camera and a video camera, with a resolution of 640x480 pixels and a capture rate of 30 fps. Kinect therefore provides not only a standard video stream but also a stream of depth images.
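A hedged sketch of how both streams can be opened with the Kinect SDK's native API follows; buffer handling and error paths are omitted, and this is not the system's actual code.

```cpp
#include <Windows.h>
#include <NuiApi.h>

HANDLE colorStream = NULL;
HANDLE depthStream = NULL;

void openStreams()
{
    // Ask the runtime for both the RGB and the depth stream.
    NuiInitialize(NUI_INITIALIZE_FLAG_USES_COLOR |
                  NUI_INITIALIZE_FLAG_USES_DEPTH);

    // 640x480 color at 30 fps, double-buffered.
    NuiImageStreamOpen(NUI_IMAGE_TYPE_COLOR, NUI_IMAGE_RESOLUTION_640x480,
                       0, 2, NULL, &colorStream);
    // Depth stream; each pixel encodes a distance in millimeters.
    NuiImageStreamOpen(NUI_IMAGE_TYPE_DEPTH, NUI_IMAGE_RESOLUTION_640x480,
                       0, 2, NULL, &depthStream);
}
```

Individual frames are then fetched each tick with NuiImageStreamGetNextFrame and returned to the runtime with NuiImageStreamReleaseFrame.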
On the screen, the images captured by Kinect are displayed mixed with virtual information, creating an augmented mirror in which users can see themselves integrated into the augmented scene.
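One way this mixing can work per pixel is sketched below, under the assumption that the camera image and depth map are already aligned and that the virtual scene is rendered with a matching camera; all names and buffer layouts are illustrative, not taken from the actual system.

```cpp
#include <cstdint>

// Composite the "augmented mirror": for each pixel, show the virtual
// object only where it is closer to the camera than the real surface,
// so real bodies naturally occlude virtual elements behind them.
void composite(const uint8_t* cameraRGB, const uint16_t* cameraDepthMM,
               const uint8_t* virtualRGB, const uint16_t* virtualDepthMM,
               uint8_t* outRGB, int width, int height)
{
    for (int i = 0; i < width * height; ++i) {
        // Depth value 0 means "no measurement / nothing rendered".
        bool virtualWins = virtualDepthMM[i] != 0 &&
                           (cameraDepthMM[i] == 0 ||
                            virtualDepthMM[i] < cameraDepthMM[i]);
        const uint8_t* src = virtualWins ? virtualRGB : cameraRGB;
        outRGB[3 * i + 0] = src[3 * i + 0];
        outRGB[3 * i + 1] = src[3 * i + 1];
        outRGB[3 * i + 2] = src[3 * i + 2];
    }
}
```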
The system is designed for two users playing different roles: child and educator. Each user is represented by a differently colored virtual puppet. Both users stand next to each other and are tracked in the same space.
The educator's task is to select the exercises and activities to be carried out by the child and then to give him/her appropriate explanations. To this end, the teacher is provided with a menu system displayed on the screen, which can be operated by using the hand as a pointer. Once an exercise is selected, the child's and the teacher's movements are captured and used as the interface for interacting with the system.
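A minimal sketch of the hand-as-pointer idea using the Kinect SDK's native API is shown below; the choice of the right hand and the selection policy (e.g., dwelling over a menu item) are illustrative assumptions, not details of the actual implementation.

```cpp
#include <Windows.h>
#include <NuiApi.h>

// Project the tracked right-hand joint onto the 640x480 image plane
// so it can drive an on-screen cursor over the menu.
void handCursor(const NUI_SKELETON_DATA& skeleton, LONG& cx, LONG& cy)
{
    USHORT depth = 0;
    NuiTransformSkeletonToDepthImage(
        skeleton.SkeletonPositions[NUI_SKELETON_POSITION_HAND_RIGHT],
        &cx, &cy, &depth, NUI_IMAGE_RESOLUTION_640x480);
}
```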
The system has been implemented as a set of subsystems that deal with different tasks. On the one hand, the input system allows activities to be chosen and captures the users' actions and movements. Depending on the activity and the users' actions, the output system creates an augmented environment by integrating images from the camera with the virtual elements.
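One possible shape for that split, as a hedged interface sketch (class and method names are illustrative, not the system's actual code):

```cpp
// Illustrative subsystem interfaces; empty bodies stand in for the
// actual logic described in the text.
struct InputSystem {
    void captureSkeletons() { /* read both users' joints from Kinect */ }
    int  pollMenuSelection() { return -1; /* educator's hand pointer */ }
};

struct OutputSystem {
    void composeScene() { /* mix camera image with virtual elements */ }
};

int main()
{
    InputSystem input;
    OutputSystem output;
    for (;;) {                         // one iteration per sensor frame
        input.captureSkeletons();
        if (input.pollMenuSelection() >= 0) { /* switch activity */ }
        output.composeScene();
    }
}
```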