2 RELATED WORK
Paper-based activities and practices are widespread
and intrinsic to our daily lives. Particular cases such
as therapeutic and educational procedures, which
rely strongly on paper-based activities, assume
special importance due to their critical content.
However, given the limitations of the underlying
medium, some of these activities fall short of their
goals. The ability to introduce digital data and
multiple modalities can enhance these activities and
simplify users' lives. Accordingly, mobile
multimodal applications have been emerging as
technological evolution begins to enable their support.
Several systems that combine different
interaction modalities on mobile devices have
already been developed. The approaches vary in
their combination of modalities, which generally
suits different but specific purposes, addressing
users' needs and surrounding environments.
Studies on multimodal mobile systems have
shown improvements over their unimodal versions
(Lai, 2004), and several multimodal systems have
been introduced in different domains. For instance,
mobile systems that combine different interaction
modalities to support and extend specific
paper-based activities have been used successfully
in art festivals (Signer, 2006) and museums
(Santoro, 2007). The latter also supports interaction
by visually impaired users. Still, both are extremely
specific, targeting activities that occur in particular,
controlled environments.
Other approaches focus mainly on combining
interaction modalities to eliminate ambiguities
inherent to a specific modality, such as speech
recognition (Hurtig, 2006; Lambros, 2003).
However, once again, they focus on specific
domains and use the different modalities only as
complements to each other.
Closer to our goals, ACICARE (Serrano, 2006)
provides a good example of a framework created to
enable the development of multimodal mobile
phone applications. These rely on command-based
speech recognition and the keypad for input, and on
a visual display for output. The framework allows
rapid and easy development of multimodal
interfaces, providing automatic usage capture that is
used in the evaluation of the multimodal interface.
However, these interfaces cannot be created or
analysed graphically, which prevents users without
programming experience from taking advantage of
the tool. Moreover, modalities relying on video are
not considered, and the definition of behaviour that
responds to user interaction, navigation, or external
events falls outside its scope.
Finally, none of the work found in the available
literature enables users without programming
experience to create, distribute, analyse and
manipulate multimodal artefacts that suit different
purposes, users and environments. Furthermore,
most existing multimodal mobile applications rely
on a server connection to perform their tasks,
limiting their mobility and pervasive use.
3 MOBILE MULTIMODAL
ARTEFACT FRAMEWORK
The original framework was developed to enable the
creation and manipulation of mobile artefacts that
support and extend paper-based procedures and
activities. As use of the framework evolved, we
faced new challenges that clearly pointed to its
extension through the inclusion of multimodalities.
Four main tools compose the framework: the
Creation Tool allows users to create multimodal
interactive/proactive artefacts (e.g., role-play games,
dynamic questionnaires and activity guides); the
Manipulation Tool enables the instantiation and
manipulation of the artefacts (e.g., playing the
games, filling the questionnaires and registering
activities); the Analysis Tool, actually a set of tools,
provides mechanisms to analyse and annotate
artefact manipulation and results (e.g., seeing how
and when a game was played or a questionnaire was
filled); and the Synchronization Tool handles the
transfer of artefacts and results between devices. All
tools are available for Microsoft's operating systems
on Desktop/Tablet PCs and PDAs/Smartphones and
were developed in C#. A simpler J2ME version,
tested on PalmOS Garnet 5.4, is also available.
In this paper, we focus on the Creation and
Manipulation Tools, since these were the main
targets of the multimodal extensions; the Analysis
and Synchronization Tools required only minor
modifications.
3.1 Mobile Multimodal Artefacts
Artefacts are abstract entities composed of an
ordered set of pages and a set of rules. Pages contain
one or more elements. These are the interaction
building blocks of artefacts (e.g., labels, selectors)
and are arranged in space and time within a page.
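The structure described above can be sketched in code. The following is a minimal illustration, not the framework's actual implementation (which is in C#); all class and method names here are hypothetical, and the skip rule mirrors the <skip to page X> example:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an artefact is an ordered set of pages plus a
// set of rules; each page holds one or more interaction elements.
class Element {
    final String kind;   // e.g. "label", "selector"
    final String value;
    Element(String kind, String value) { this.kind = kind; this.value = value; }
}

class Page {
    final List<Element> elements = new ArrayList<>();
    void add(Element e) { elements.add(e); }
}

// A rule may override the default page sequence, e.g. <skip to page X>.
interface Rule {
    int nextPage(int currentPage);
}

class SkipRule implements Rule {
    final int from, to;
    SkipRule(int from, int to) { this.from = from; this.to = to; }
    public int nextPage(int current) {
        return current == from ? to : current + 1;
    }
}

class Artefact {
    final List<Page> pages = new ArrayList<>();
    final List<Rule> rules = new ArrayList<>();

    // Advance through the ordered pages, letting a rule override
    // the default next-page order when it applies.
    int next(int current) {
        for (Rule r : rules) {
            int n = r.nextPage(current);
            if (n != current + 1) return n;  // a rule overrode the order
        }
        return current + 1;
    }
}
```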
Rules can alter the sequence of pages (e.g., <skip
to page X>) or determine their characteristics (e.g.,