Designing Audio-Augmented Environments
Andreas Zimmermann and Lucia Terrenghi
Fraunhofer Institute for Applied Information Technology, Schloss Birlinghoven,
53754 Sankt Augustin, Germany
Abstract. This paper deals with the design of an intelligent user interface that augments the user experience in the museum domain by providing an immersive audio environment. We highlight the potential of augmenting the visual real environment in a personalized way, thanks to context modeling techniques. The LISTEN project, a system for an immersive audio-augmented environment applied in the art exhibition domain, provides an example of modeling and personalization methods affecting the audio interface in terms of content and organization. In addition, the outcomes of the preliminary tests are reported here.
1 Introduction
One core idea of ubiquitous computing [13] is that items of our daily environments
will acquire computational capacities: by sensing data in the context, processing
them, and adapting the system to the context of interaction, they will provide new
functionalities and enable new activities. The physical reality will be overlaid by an
additional virtual layer (Mixed Reality), and by naturally moving in space and/or by
manipulating physical objects in our surroundings we will act upon information in the
virtual layer. This enables a new, augmented user experience and lays the foundation
for new challenges in user interface design.
The LISTEN project [7], conducted by the Fraunhofer Institute for Media Communication,
is an attempt to make use of the inherent "everyday" integration of aural and
visual perception [3]. In October 2003 the system was deployed for visitors of the
August Macke art exhibition at the Kunstmuseum in Bonn, in the context of the
"Macke Labor" [12]. The users of the LISTEN system move in the physical space
wearing wireless headphones, which are able to render 3-dimensional sound, and
listen to audio sequences emitted by virtual sound sources placed in the environment.
The visitors of the museum experience personalized audio information about exhibits
through their headphones. The audio presentation is adapted to the users’ contexts
(i.e. interests, preferences, and motion), providing an intelligent audio-based envi-
ronment [10].
Museums and exhibitions as domains have already been explored in several research
projects. Many of them focus on providing a museum guide including content
concerning the artworks [8], on immersing the user in a virtual augmented environment
built in virtual museums [1], or on providing orientation and routing functionalities
[2]. The LISTEN project takes up the challenge of providing a personalized immersive
augmented environment, which goes beyond the guiding purpose and is based on the
combination of aural and visual perception. The viewers become inter-actors as soon
as they move in real space. Thus, the physical behavior of the user determines the
interaction. This paper focuses on the design and adaptation of this user inter-
face, and serves as a proof of concept for our approach.
2 The Combination of Content and Context
While using the LISTEN system, users automatically navigate an acoustic informa-
tion space designed as a complement or extension of the real space. The selection,
presentation and adaptation of the content of this information space take into account
the user’s current context. For a flexible and easy combination of the appropriate
content and the current context, we chose to provide a centralized Java-based context
and user modeling server. In addition, we use XML as the modeling language, which
allows for an intuitive design, definition, and configuration of the system components.
Since this design tool will be used in various follow-up projects, we introduced a new
model for the design of context-aware systems (cf. Figure 1).
Fig. 1. The Context-Aware Architecture of the LISTEN System
In the following we describe the four steps of this abstracted model: information
collection, modeling, controlling and rendering. In addition, we show the instantiation
of this model by the implementation of the LISTEN system at the museum (cf. Figure 2).
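To make these four steps more tangible, the following minimal Java sketch shows how such a pipeline could be wired together. All type names (SensorCollector, Modeler, Controller, Renderer, ContextEvent, Command, UserModel) are our own assumptions for illustration and do not correspond to the actual LISTEN implementation.

// Hypothetical sketch of the four-step pipeline: information collection,
// modeling, controlling and rendering. All names are illustrative assumptions.
import java.util.List;

record ContextEvent(String userId, long timestamp, String type, String value) {}
record Command(String name, String argument) {}
class UserModel { /* attribute-value pairs, visit history, stereotype, ... */ }

interface SensorCollector {                     // 2.1 information collection
    List<ContextEvent> collect();               // raw events from the sensor network
}
interface Modeler {                             // 2.2 modeling
    UserModel interpret(List<ContextEvent> events, UserModel current);
}
interface Controller {                          // 2.3 controlling
    List<Command> decide(UserModel model);      // domain-independent commands
}
interface Renderer {                            // 2.4 rendering
    void apply(List<Command> commands);         // map commands to domain actions
}

class ContextAwareLoop {
    void step(SensorCollector c, Modeler m, Controller ctl, Renderer r, UserModel u) {
        List<ContextEvent> events = c.collect();    // one iteration per sensor update
        UserModel enriched = m.interpret(events, u);
        r.apply(ctl.decide(enriched));              // adjust the environment
    }
}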
2.1 Information Collection
A sensor network placed in the environment monitors variable parameters and recog-
nizes changes within the environment. A sensor server receives all incoming events
(i.e. user interactions) sent by the application. These event descriptions are pushed
into a database. Thus, an event history for every user is saved and an implicit user
profile is recorded. The LISTEN system monitors only three sensor values: the user’s
spatial position, head orientation [4], and the time.
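For illustration, an event description pushed into the history database could be as simple as the record sketched below; the field names and the storage interface are assumptions, and only the three monitored values (position, head orientation, time) are taken from the description above.

// Hypothetical event record for the per-user event history (names assumed).
import java.time.Instant;
import java.util.List;

record SensorEvent(
        String userId,
        Instant timestamp,            // the time
        double x, double y, double z, // spatial position in the exhibition hall
        double yaw, double pitch      // head orientation
) {}

interface EventHistory {
    void append(SensorEvent event);           // push an incoming event into the database
    List<SensorEvent> forUser(String userId); // implicit user profile per visitor
}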
Fig. 2. The Instantiation of the Abstract Model
2.2 Modeling
In a second step the incoming sensor values are interpreted by several algorithms (e.g.
machine learning or data mining algorithms) based on different models (e.g. overlay,
statistical, etc.). Thus, semantically enriched information is extracted and more
significant knowledge is gained about the user’s behavior.
A location model, which segments the exhibit hall into object zones and near fields
[5], allows the LISTEN system to interpret the user’s position (i.e. location) and head
orientation (i.e. focus). The most important variable within the user modeling process
is the time. The analysis of the user’s spatial position over time yields the speed at
which the user moves. In addition, combining meta-information about the paintings
with the time allows an assumption about the user’s interests (e.g. the more time the
visitor spends with a specific exhibit, the more s/he likes it) [9].
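A minimal sketch of this dwell-time heuristic, assuming invented method names and a saturation threshold of two minutes, could look as follows; the actual LISTEN algorithms are more elaborate and are not reproduced here.

// Illustrative dwell-time heuristic: the longer the visitor's focus stays on
// an exhibit, the higher the inferred interest (names and threshold assumed).
import java.util.HashMap;
import java.util.Map;

class InterestModel {
    private final Map<String, Long> dwellMillis = new HashMap<>();

    // Called whenever the location model reports that the user's focus
    // stayed on an exhibit for another 'millis' milliseconds.
    void addDwellTime(String exhibitId, long millis) {
        dwellMillis.merge(exhibitId, millis, Long::sum);
    }

    // Normalised interest in [0, 1], saturating after two minutes of attention.
    double interestIn(String exhibitId) {
        long t = dwellMillis.getOrDefault(exhibitId, 0L);
        return Math.min(1.0, t / 120_000.0);
    }
}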
Furthermore, we chose to employ stereotypes to define the user’s observation type.
In contrast to [8], the user’s classification into one stereotype is done manually by the
user and cannot be changed automatically at runtime. Additionally, the personal-
ization engine of the LISTEN application is not able to perform an automated cluster-
ing and derivation of new stereotypes.
The complete model of the context we use in LISTEN is represented by a set of
attribute-value pairs.
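As an illustration, such a flat attribute-value representation could look as follows; the concrete attribute names are assumptions chosen only to mirror the values discussed above.

// Illustrative attribute-value context model (attribute names assumed).
import java.util.HashMap;
import java.util.Map;

class ContextModel {
    private final Map<String, String> attributes = new HashMap<>();

    void set(String attribute, String value) { attributes.put(attribute, value); }
    String get(String attribute)             { return attributes.get(attribute); }
}

// Possible content for one visitor:
//   zone           = "object_zone_12"
//   focusedExhibit = "painting_07"
//   speed          = "0.3 m/s"
//   stereotype     = "fact_oriented"
//   interest       = "0.8"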
2.3 Controlling
In a third step a controlling component is necessary to decide what consequences
must follow when certain conditions in the user’s context and in the individual user
model configuration appear together. Based on these information sources, the control
component assembles a sequence of commands in order to adjust certain variable
properties of the environment. Different sequences of commands lead to different
kinds of information presentation.
An XML-configurable rule system controls the sound presentation of the LISTEN
scenery. First, a pre-filtering of sound-items is performed based on the user’s current
location and focus. From this list the best-suited sound-item is selected according to
the user’s visit history, stereotype and interests.
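The two-stage selection could be sketched roughly as below; the predicate and scoring functions are placeholders introduced for the example, whereas the real rule system is configured in XML and not reproduced here.

// Illustrative two-stage selection: pre-filter sound-items by location/focus,
// then pick the best-suited one. Field names and scoring are assumptions.
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

record SoundItem(String id, String exhibitId, String style, int depthLevel) {}

class SoundSelector {
    Optional<SoundItem> select(List<SoundItem> all,
                               String focusedExhibit,            // from the location model
                               String stereotype,                // from the user model
                               Map<String, Double> interests) {  // exhibitId -> interest
        return all.stream()
                .filter(s -> s.exhibitId().equals(focusedExhibit))    // stage 1: pre-filter
                .max(Comparator.comparingDouble(                      // stage 2: best match
                        s -> score(s, stereotype, interests)));
    }

    private double score(SoundItem s, String stereotype, Map<String, Double> interests) {
        double styleMatch = s.style().equals(stereotype) ? 1.0 : 0.0;
        return styleMatch + interests.getOrDefault(s.exhibitId(), 0.0);
    }
}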
2.4 Rendering
Rendering means handling the connection back to the domain. This engine trans-
lates the assembled sequence of domain-independent commands into domain-
dependent commands changing variable parameters of the domain. Thus, the deci-
sions taken by the controlling component are mapped to real world actions.
Concerning the LISTEN system, rendering means changing the audio augmentation.
Basically this denotes a certain sound-item being played from a specific sound
source. Moreover, this sophisticated auditory rendering process takes into account
the current position and orientation of the user’s head, in order to seamlessly integrate
the virtual scene with the real one [6].
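A rough sketch of this translation step is given below; the command vocabulary and the spatial audio interface are invented for the example and do not correspond to the rendering engine described in [6].

// Illustrative translation of domain-independent commands into domain-dependent
// audio actions (command names and audio API are assumptions).
record RenderCommand(String name, String argument) {}   // e.g. ("PLAY", "sound_42")

interface SpatialAudioEngine {                           // assumed audio back end
    void playFromSource(String soundItemId, String sourceId);
    void stopSource(String sourceId);
}

class AudioRenderer {
    private final SpatialAudioEngine engine;
    AudioRenderer(SpatialAudioEngine engine) { this.engine = engine; }

    void apply(RenderCommand cmd, String sourceId) {
        switch (cmd.name()) {
            case "PLAY" -> engine.playFromSource(cmd.argument(), sourceId);
            case "STOP" -> engine.stopSource(sourceId);
            default     -> { /* ignore commands this renderer cannot handle */ }
        }
    }
}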
3 Adaptation Means
The basis for every kind of adaptation in LISTEN is the presentation of the sound
space. Besides the selection of which sound item is to be played, other dimensions
influence the sound presentation: for instance when, with which character (e.g. vol-
ume), from which direction, with which motion and how long a sound is played. With
combinations of these possibilities, a wide range of adaptability is accomplished.
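These dimensions can be pictured as a simple parameter set per sound-item, as in the sketch below; the field names are assumptions that merely enumerate the dimensions mentioned above.

// Illustrative parameter set describing one sound presentation (names assumed).
import java.time.Duration;

record SoundPresentation(
        String soundItemId,
        Duration startDelay,      // when the sound starts playing
        double volume,            // character of the sound (e.g. volume)
        double azimuthDegrees,    // direction the sound appears to come from
        String motionPath,        // how the virtual sound source moves, if at all
        Duration playTime         // how long the sound is played
) {}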
Additionally, other means of adaptation are realized or planned.
3.1 Adaptation of the Space Model
In LISTEN users enter and leave zones in virtual space (cf. Section 2.2 and [11]).
Some users want to step back and look at an object from a different viewpoint.
Because the user still shows interest in this specific object, the associated zone should
adapt to the user and expand up to a predefined limit (zone breathing). The user can
then continue to listen to the sound.
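Zone breathing could be approximated as in the following sketch; the radii, the expansion factor and the interest threshold are assumptions introduced only to make the idea concrete.

// Illustrative "zone breathing": a zone grows while its user keeps showing
// interest, up to a predefined limit (all numbers are assumed).
class BreathingZone {
    private final double baseRadius;   // normal radius of the object zone (metres)
    private final double maxRadius;    // predefined expansion limit
    private double radius;

    BreathingZone(double baseRadius, double maxRadius) {
        this.baseRadius = baseRadius;
        this.maxRadius = maxRadius;
        this.radius = baseRadius;
    }

    // Called on every update with the user's distance to the exhibit and the
    // interest inferred by the user model (0..1).
    boolean contains(double distance, double interest) {
        if (distance > radius && interest > 0.5) {
            radius = Math.min(maxRadius, radius * 1.1);   // expand towards the user
        } else if (distance <= baseRadius) {
            radius = baseRadius;                          // shrink back when the user is close
        }
        return distance <= radius;
    }
}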
3.2 Adaptation to Social Context
By interpreting the time and the location of several users, the LISTEN system is able
to identify clusters of people (spatially and temporally similar users). These people
might want to receive the same audio information (e.g. a family). Conversely,
breaking up clusters of people is also possible, which would lead to a better
distribution of visitors among the exhibits.
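A crude way to detect such clusters from the users' positions and timestamps is sketched below; the record fields and the distance and time thresholds are invented for the example.

// Illustrative detection of spatially and temporally similar users
// (record fields and thresholds are assumptions).
import java.util.ArrayList;
import java.util.List;

record UserPosition(String userId, double x, double y, long timestampMillis) {}

class ClusterDetector {
    private static final double MAX_DISTANCE_METRES = 2.0;
    private static final long   MAX_TIME_GAP_MILLIS = 10_000;

    // Returns the users that currently form a cluster with the given user.
    List<String> clusterOf(UserPosition user, List<UserPosition> others) {
        List<String> cluster = new ArrayList<>();
        for (UserPosition other : others) {
            boolean near = Math.hypot(user.x() - other.x(), user.y() - other.y()) <= MAX_DISTANCE_METRES;
            boolean sameTime = Math.abs(user.timestampMillis() - other.timestampMillis()) <= MAX_TIME_GAP_MILLIS;
            if (near && sameTime && !other.userId().equals(user.userId())) {
                cluster.add(other.userId());
            }
        }
        return cluster;
    }
}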
3.3 Adaptation to the Level of Immergence
In LISTEN, interest in objects is expressed by the time a user’s focus lingers on these
objects. The complexity, the amount, and the style of the information already received
about one object correspond to the level of interest. If the currently viewed exhibit
complies with the user’s interests, the sound presentation directly starts at the
adequate level of information depth and style.
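One simple way to step into the appropriate level could be a mapping from inferred interest to a depth level, as sketched below; the three levels and their boundaries are assumptions used only to make the idea concrete.

// Illustrative mapping from inferred interest (0..1) to information depth.
class DepthSelector {
    enum Depth { OVERVIEW, DETAIL, IN_DEPTH }

    Depth depthFor(double interest) {
        if (interest < 0.3) return Depth.OVERVIEW;   // short introductory sound
        if (interest < 0.7) return Depth.DETAIL;     // richer description
        return Depth.IN_DEPTH;                       // background and anecdotes
    }
}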
3.4 Adaptation to Movement and Reception Styles
People walking through the environment show different kinds of common behavior
(e.g. clockwise in museums [8]). Special attractor sounds emitted from different
sound sources are used to draw the user’s attention to certain objects. Thus, entire
predefined tours through the environment can be recommended (by the artist, by the
curator, or by personal interest) and dynamically adapted to the stereotypical type of
movement and perception style.
4 Conclusions and Evaluation
In this paper we attempt to highlight the potential of tailored audio augmented envi-
ronments design, a research field that needs to be further explored and still presents
critical issues. The project here described provides a good example of the complexity
of the modeling and design process of a tailored audio augmented environment.
The outcomes of the first test at the Kunstmuseum in Bonn enabled a preliminary
evaluation: impressions and refinement issues were gathered from 15 visitors, art
curators, sound designers, and artists.
As a first positive result all the visitors enjoyed the combination of audio-visual
perception and felt an augmentation of the environment. At the same time, both cura-
tors appreciated the possibility to deliver content concerning the artworks in an inno-
vative, enriched, and less descriptive way.
As critical points, eight testers mentioned problems in “feeling” the interaction with
the environment, and four even reported that they felt forced to approach the artworks
very closely. These statements show the importance of a more dynamic space model
and the necessity of landmarks for navigating the virtual environment, both of which
we are currently working on.
In addition, five users could not tell whether the changes in the virtual audio
environment were due to their movements in the space or were part of the audio
sequence. This misinterpretation of the user’s spatial position in combination with the
focus is currently under investigation.
Four users missed an explicit control channel for switching between different types
of sound presentation. Since the LISTEN system aims at a complete immersion of
the users solely through wireless headphones, this issue will not be pursued further.
A main criticism in the evaluation was the use of stereotypes for content adaptation.
People do not like to be clustered and classified by their personal information
consumption habits (fact-oriented, emotional or overview-oriented). The refined
system will provide stereotypes that represent the user’s moving styles, such as
sauntering, goal-driven or standing still, because these stereotypes have less social
impact and are easier to detect and to revise.
References
1. Chittaro, L./Ranon, R./Ieronutti, L.: “Guiding Visitors of Web3D Worlds through Automatically Generated Tours”. In: Proceedings of Web3D 2003: 8th International Conference on 3D Web Technology (2003).
2. Ciavarella, C./Paternò, F.: “Supporting Access to Museum Information for Mobile Visitors”. In: 10th International Conference on Human-Computer Interaction. Crete, Greece (2003).
3. Eckel, G.: “LISTEN – Augmenting Everyday Environments with Interactive Soundscapes”.
In: Proceedings of the I3 Spring Days Workshop "Moving between the physical and the
digital: exploring and developing new forms of mixed reality user experience", Porto, Por-
tugal (2001).
4. Goiser, AMJ.: Handbuch der Spread-Spectrum Technik. Springer-Verlag Wien New York
(1998).
5. Goßmann, J., Specht, M.: “Location Models for Augmented Environments“. In: the Work-
shop Proceedings of Location Modelling for Ubiquitous Computing, Ubicomp (2001), 94-
99.
6. Jot, Jean-Marc: Real-time spatial processing of sounds for music, multimedia and interactive human-computer interfaces. Multimedia Systems 7. Springer (1999).
7. LISTEN: http://listen.imk.fraunhofer.de. The LISTEN Website (2000).
8. Oppermann, R., Specht, M.: “A Context-sensitive Nomadic Information System as an Exhi-
bition Guide”. In: Proceedings of the Handheld and Ubiquitous Computing Second Interna-
tional Symposium, HUC 2000, Bristol, UK (2000), 127-142.
9. Pazzani M. J./Billsus D.: “Learning and Revising User Profiles: The Identification of Inter-
esting Web Sites”. In: Machine Learning 27 (1997), 313-331.
10. Sawhney, N./Murphy, A.: “Designing Audio Environments – Not Audio Interfaces”. Paper
submitted to ASSETS '96, The 2nd ACM/SIGCAPH Conference on Assistive Technologies
(1996).
11. Tramberend, H.: “Avango: A Distributed Virtual Reality Framework”. In: Proceedings of the IEEE Virtual Reality ’99 Conference, Houston, Texas, USA (1999).
12. Unnützer, P.: “LISTEN im Kunstmuseum Bonn”. KUNSTFORUM International, Vol. 155 (2001), pp. 469-470.
13. Weiser, M.: “The Computer for the Twenty-First Century”. Scientific American (1991), pp. 94-104.