Audio-visual Cues for Cloud Service Monitoring
David Bermbach and Jacob Eberhardt
Information Systems Engineering Research Group, Technische Universität Berlin, Berlin, Germany
Keywords:
Cloud Services, Monitoring, Quality of Service.
Abstract:
When monitoring their systems’ states, DevOps engineers and operations teams alike, today, have to choose
whether they want to dedicate their full attention to a visual dashboard showing monitoring results or whether
they want to rely on threshold- or algorithm-based alarms which always come with false positive and false
negative signals. In this work, we propose an alternative approach which translates a stream of cloud moni-
toring data into a continuous, normalized stream of score changes. Based on the score level, we propose to
gradually change environment factors, e.g., music output or ambient lighting. We do this with the goal of
enabling developers to subconsciously become aware of changes in monitoring data while dedicating their
full attention to their primary task. We evaluate this approach through our proof-of-concept implementation
AudioCues, which gradually adds dissonances to music output, and an empirical study with said prototype.
1 INTRODUCTION
Modern cloud-based enterprises increasingly follow
the “you build it, you run it” paradigm where small
teams of software engineers no longer only develop
an application and then hand it over to other teams for
testing and operation. Instead, the developers are also
responsible for running the application, i.e., deploy-
ing and maintaining the system, so that their respon-
sibility shifts from providing a piece of code to pro-
viding a system with strict SLAs to their customers.
This is also referred to as DevOps (Bass et al., 2015).
While this has many advantages, it also confronts
developers with tasks that traditionally never were
theirs to do. A good example of this is monitor-
ing: While a traditional enterprise may have a ded-
icated operations team that devotes its full work-
force to closely observe and, where necessary, man-
age application state, this suddenly becomes a side
task for application developers, a burden they are ill-equipped to handle. Some of this complexity
can be alleviated through automation: threshold-
based mechanisms (e.g., Amazon Autoscaling, aws.amazon.com/autoscaling) or
machine-learning based approaches can take auto-
matic action to resolve issues (e.g., by spawning ad-
ditional VMs) or notify developers through alarms.
Still, automatic action cannot fully replace human
oversight, and alarms are inherently limited by their binary state (on or off), leading to a trade-off be-
tween false positive and false negative alarm states.
In this paper, we propose a framework and ap-
proach that leverages the ability of the human subcon-
scious to detect deviations from a “normal” state. For
this purpose, we use cloud monitoring results to con-
trol various aspects of the developers’ environment,
thus, enabling them to subconsciously become aware
of faulty system states, e.g., through color changes
in ambient lighting. In contrast to traditional alarms,
these environment factors can often be changed grad-
ually so that, for lack of a binary decision, false pos-
itives and negatives become a thing of the past. Fur-
thermore, the likelihood of a signal moving from the
subconscious to a state of awareness highly depends
on the intensity and regularity of the signal as well
as the person’s current level of concentration (Vick-
ers, 2011), i.e., in periods of full concentration developers will only become aware of critical system states, whereas in periods of low concentration they will also become aware of smaller issues.
For this purpose, we propose the following contri-
butions:
- MultiSense, a high-level framework and architecture of a system that uses monitoring data to control environmental parameters.
- AudioCues, an instantiation of MultiSense that uses monitoring data to manipulate and create unobtrusive music which people may listen to while working.
This paper is structured as follows: In sect. 2,
we will discuss basic literature on leveraging the sub-
conscious for presenting information to users. Then,
in sect. 3, we present MultiSense and discuss envi-
ronmental factors that could be controlled through it.
Next, in sect. 4, we present AudioCues and its proto-
typical implementation, describe how we use it to cre-
ate lounge-like music, and outline future extensions.
Afterwards, in sect. 5, we describe our evaluation, be-
fore discussing related work in sect. 6.
2 BACKGROUND
In this section, we will discuss, based on the literature,
why audio as a non-disruptive information channel is
a good choice for observing cloud monitoring data as
we did in our AudioCues prototype (sect. 4).
Generally, monitoring of processes (not limited to “technical”, OS-like processes) can be done
in three distinct ways (Vickers, 2011): Direct moni-
toring (the focus of attention lies on monitoring a pro-
cess), Peripheral monitoring (the focus of attention
lies on another task, monitoring of a process is per-
formed passively and attention shifts in case of critical
system states), and Serendipitous-peripheral monitor-
ing (non-critical information is passively monitored,
the focus is on another task).
Direct monitoring requires the user’s attention at
all times and is thus a pull-based approach. Visual
dashboards and comparable data representations tied
to a screen are examples for technologies of this cat-
egory. Peripheral as well as serendipitous-peripheral
monitoring, in contrast, are push-based approaches that draw a user’s attention to the monitored process
only when necessary. Technologies enabling periph-
eral awareness, e.g., ambient lights or AudioCues, are
often referred to as peripheral displays.
Peripheral displays that use audio as primary
transmission mechanism while extending traditional,
non-peripheral monitoring systems, are called audi-
tory displays; the process of translating data into au-
dio signals is called sonification. They provide two
distinct advantages: First, information can be trans-
mitted to users without being disruptive or obtru-
sive (Jenkins, 1985; Weiser and Brown, 1997; Tran
and Mynatt, 2000). Second, auditory displays can in-
crease the bandwidth of computer-human interaction
by providing an additional channel for information
transmission (Vickers, 2011). This can also be seen
in the results of Barra et al. (Barra et al., 2001) who
could show that peripheral audio monitoring allows
users to extract meaningful information while not get-
ting distracted from their primary task, as opposed to
visual displays where this is not the case (Maglio and
Campbell, 2000).
As an auditory display, we can use speech, music,
sound effects, or any combination thereof. Speech, in
most contexts, carries foreground information and re-
quires more attention than non-speech audio (Mynatt
et al., 1998). Sound effects, e.g., traffic noise or bird-
song, are used by various systems (Barra et al., 2001;
Liechti et al., 1999). These sound effects are very
suitable to signal the occurrence of a (binary) event,
e.g., when a server has failed, or to provide unobtru-
sive ambient noise. The main advantage of sound ef-
fects is that users can identify the source of the event
if the acoustic cue is semantically related, e.g., the
sound of pouring a drink instead of a progress bar.
However, mapping a stream of values, e.g., as pro-
vided by cloud monitoring, to sound effects seems to
be difficult (see also our discussion of control targets in sect. 3). This is why we chose music as a medium
for our initial prototype: It offers more options for
manipulation than sound effects and can still enable
users to recognize the source of the event if an expla-
nation of the mapping of monitoring data to music is
provided (Lucas, 1994).
3 MULTISENSE
In this section, we will briefly recapitulate MultiSense
and its main components (Bermbach and Eberhardt,
2016) as well as discuss which environmental param-
eters can be controlled in which way.
Architecture and Components: Generally, MultiSense follows a sensor-actuator model and comprises three
parts, two of which are external services that are inte-
grated through adapter mechanisms.
The first part (fig. 1 on the left) comprises our
metric producers. These can be any monitoring services that produce data points for one or more metrics. Examples include open source systems, e.g., Ganglia (ganglia.info), custom solutions, or cloud services, e.g., Amazon CloudWatch (aws.amazon.com/cloudwatch). Any data source accessible through pull or push mechanisms can be used.
Within the second part, metric consumers peri-
odically poll their metric producer adapters for re-
cent monitoring data from the underlying services and
transform this stream of data points into a stream of
standard monitoring events. The stream is passed on
to metric monitors, which serve as a kind of information hub. (Different metric monitors have no interdependencies, as each is responsible for different metrics and parts of the monitored system; this allows MultiSense to scale.)
Figure 1: High-Level Architecture of MultiSense (metric producers, e.g., for CPU utilization, memory utilization, disk I/O, request latency, or failure rate, feed metric consumers and metric monitors with their metric analyzers, which publish normalized [0; 100] scores via a pub/sub system to the control targets).
For each registered metric, they have
a single metric analyzer which normalizes the events
of the stream on a range from 0 (normal) to 100 (crit-
ical). MultiSense does not make any assumptions re-
garding the implementation of metric analyzers – they
may implement everything from linear interpolation
to complex machine learning-based techniques. Nor-
malized scores are then sent to a pub/sub system.
The third part, the control targets, comprises phys-
ical and software systems, which are able to affect the
developer’s environment on a (preferably) continuous
scale, as well as the controllers which register for top-
ics of the pub/sub system and then send control com-
mands to their respective control targets based on re-
ceived events.
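To make the interplay of these three parts more concrete, the following minimal Java sketch shows one plausible set of interfaces for the pipeline; all type and method names are illustrative assumptions on our part and not prescribed by MultiSense.

```java
import java.util.List;

/** A single monitoring event for one metric (illustrative data holder). */
class MonitoringEvent {
    final String metric;
    final long timestampMillis;
    final double value;

    MonitoringEvent(String metric, long timestampMillis, double value) {
        this.metric = metric;
        this.timestampMillis = timestampMillis;
        this.value = value;
    }
}

/** Adapter around a metric producer such as Ganglia or Amazon CloudWatch. */
interface MetricProducerAdapter {
    List<MonitoringEvent> poll(String metric);
}

/** Normalizes raw monitoring events to a score between 0 (normal) and 100 (critical). */
interface MetricAnalyzer {
    int score(MonitoringEvent event);
}

/** Subscribes to normalized scores and drives one control target (music, lighting, ...). */
interface Controller {
    void onScore(String metric, int score);
}
```

In such a design, a metric consumer would periodically call poll(), hand the events to the metric monitor’s analyzers, and publish the resulting scores to a pub/sub topic per metric.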
Control Targets: All control targets share some
basic commonalities: they do not have a binary state
(in fact, a continuum is preferable) and can be con-
trolled electronically. Furthermore, they all affect
how developers feel due to impressions on different
senses. The for our purposes most important senses
are sight and hearing but the sensory capacity of our
skin and the olfactory sense can also be leveraged. In
the following, we will briefly discuss which control
targets affect which senses in which way.
Hearing: Many people like to listen to music or
ambient noise at work. For music, we can have an
audio stream that we can manipulate, e.g., when lis-
tening to MP3 files, or we may also control the gener-
ation of the music, e.g., via MIDI signals and virtual
instruments as in our AudioCues prototype.
In case of audio streams, there are three basic tun-
ing knobs that we can use: Overall playback volume,
equalizer settings which are basically per-frequency
volumes (a simple way of adjusting would be to
gradually introduce a high-pass or low-pass filter),
and various kinds of audio effects, e.g., reverb, de-
lay, flanger, phaser, distortion, etc., which can be
mixed into the audio signal. The latter are typically
used with a dry/wet parameter which describes the
(volume-based) percentage of the signal that is routed
through the effect generator. In MultiSense, we could
assign a different effect for each input metric so that
developers will not only notice that something is “off”
but also where the problem is coming from.
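As an illustration of this idea, the sketch below maps each metric’s normalized score to the dry/wet level of a dedicated effect; the EffectUnit interface is a hypothetical stand-in for whatever audio effect API is actually used and not part of our prototype.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical wrapper around one insert effect (reverb, delay, flanger, ...). */
interface EffectUnit {
    void setDryWet(double wet); // 0.0 = fully dry, 1.0 = fully wet
}

/** Maps each metric's score in [0, 100] to the dry/wet level of "its" effect. */
class EffectMapper {
    private final Map<String, EffectUnit> effectPerMetric = new HashMap<>();

    void register(String metric, EffectUnit effect) {
        effectPerMetric.put(metric, effect);
    }

    void onScore(String metric, int score) {
        EffectUnit effect = effectPerMetric.get(metric);
        if (effect != null) {
            // Clamp defensively and scale linearly; other mapping curves are possible.
            double wet = Math.max(0, Math.min(100, score)) / 100.0;
            effect.setDryWet(wet);
        }
    }
}
```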
When generating music through MIDI signals,
there are additional tuning knobs as we actually con-
trol the output signal. As new parameters, we can
introduce dissonances which we can vary in volume,
kind of dissonance (e.g., minor vs. major second),
or the number of concurrent dissonances (e.g., minor
second vs. minor second and tritone). As we control
individual channels, we can also adjust their respec-
tive volume, i.e., change the overall mix, change the
instrumentation, detune channels through pitch bend
signals, or add imprecision by shifting entire channels
or individual notes slightly in time. On a more global
level, we can change the tempo of playback, e.g., in-
crease tempo if the system is in a stress state, or affect
the “style” of music by changing the way in which we
assemble patterns and sounds.
Sight: Usually, people only notice things that they
are looking at. Still, sudden changes in the peripheral
vision, e.g., movement (Weiser and Brown, 1997),
will cause instant awareness as they are still processed
by the unconscious and, thus, move “from the periph-
ery of our attention, to the center, and back.” (Weiser
and Brown, 1997). For our purposes, movement is a
poor control target as there is no continuous “scale”
of movement: we become instantly aware of move-
ments in our peripheral vision or do not notice them
at all. This leaves us with lighting and the look and
feel of what is happening on the developer’s screen:
For lighting, we can adjust brightness, e.g.,
through dimmable lamps. Ambient lighting is par-
ticularly useful where we can control colors of in-
dividual or all lighting modules, the speed of color
changes, or the continuity of color changes (gradually
fading vs. sudden changes). In terms of program look
and feel, we can gradually change background colors
or text color of program windows and their title bars.
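As a sketch of how such a mapping could look, the following controller fades ambient lighting from green (score 0) to red (score 100); the LightModule interface is a hypothetical abstraction over a concrete smart-lighting API rather than an existing library.

```java
/** Hypothetical handle to an ambient lighting module, e.g., a smart bulb. */
interface LightModule {
    void setColor(int red, int green, int blue); // each component in 0..255
}

/** Gradually fades ambient lighting from green (normal) to red (critical). */
class AmbientLightController {
    private final LightModule light;

    AmbientLightController(LightModule light) {
        this.light = light;
    }

    void onScore(int score) {
        int s = Math.max(0, Math.min(100, score));
        int red = (int) Math.round(255 * (s / 100.0));
        int green = 255 - red;
        light.setColor(red, green, 0); // continuous green-to-red transition
    }
}
```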
Feeling: Modern smart home appliances can eas-
ily be controlled over standard IT protocols. For in-
stance, we can affect temperature, air flow, or humid-
ity through heating systems, air conditioning, or fans;
air can be scented. There are virtually no limitations
to the range of usable devices.
4 AudioCues
In this section, we will give an overview of Au-
dioCues as an instantiation of MultiSense with spe-
cific auditory control targets. We will start by describ-
ing the basic functionality of AudioCues in sect. 4.1
before discussing the state of the implementation and
its limitations in sect. 4.2.
Figure 2: Sample Music Graph for AudioCues (nodes: Intro, Bridge, and variants of Part A and Part B).
4.1 Basic Functionality
AudioCues focuses on the sense of hearing and is designed to produce pattern-oriented music; likely candidates include lounge or other electronic music, but Bach-inspired music could also be realized. For this
purpose, we use a “music graph”, i.e., a directed graph
where each node contains information on the notes
and their instrumentation for short subsequences (typ-
ically patterns) of a piece of music. A player compo-
nent randomly iterates over the graph and schedules
the contents of the respective nodes for playback at
the appropriate time. To reduce the number of very
similar repetitions, our player implementation avoids
going “backwards”, i.e., playing the node sequence
A-B-A. See fig. 2 for an example of a music graph.
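The following Java sketch illustrates such a random walk over a music graph, including the rule that avoids immediately returning to the previous node; node contents are reduced to a pattern identifier for brevity, so this is an illustration of the idea rather than the prototype's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** One node of the music graph: a short pattern plus its possible successors. */
class MusicNode {
    final String patternId;                    // stands in for notes and instrumentation
    final List<MusicNode> successors = new ArrayList<>();

    MusicNode(String patternId) {
        this.patternId = patternId;
    }
}

/** Randomly walks the music graph while avoiding the node sequence A-B-A. */
class GraphPlayer {
    private final Random random = new Random();

    MusicNode nextNode(MusicNode current, MusicNode previous) {
        List<MusicNode> candidates = new ArrayList<>(current.successors);
        if (previous != null && candidates.size() > 1) {
            candidates.remove(previous);       // do not go "backwards" if avoidable
        }
        return candidates.get(random.nextInt(candidates.size()));
    }
}
```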
Beyond the information on notes and instrumenta-
tion, each node also contains information on the cor-
responding dissonance sequence. This dissonance se-
quence is scheduled for playback along with all other
entries of the node but a special flag informs the MIDI
scheduler to set its volume based on up-to-date infor-
mation from the corresponding metric monitor.
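A minimal sketch of this volume adjustment using the standard javax.sound.midi API is shown below; the class name and the choice of MIDI controller 7 (channel volume) are our own illustration and not necessarily what the prototype does internally.

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.MidiSystem;
import javax.sound.midi.MidiUnavailableException;
import javax.sound.midi.Receiver;
import javax.sound.midi.ShortMessage;

/** Adjusts the volume of the dissonance channel based on the latest normalized score. */
class DissonanceVolumeControl {
    private static final int CHANNEL_VOLUME_CC = 7; // MIDI control change no. 7: channel volume
    private final Receiver receiver;
    private final int dissonanceChannel;

    DissonanceVolumeControl(int dissonanceChannel) throws MidiUnavailableException {
        this.receiver = MidiSystem.getReceiver();    // receiver of the default MIDI output device
        this.dissonanceChannel = dissonanceChannel;
    }

    /** Maps a score in [0, 100] to a MIDI volume in [0, 127] and sends it immediately. */
    void applyScore(int score) throws InvalidMidiDataException {
        int clamped = Math.max(0, Math.min(100, score));
        int volume = (int) Math.round(clamped * 127.0 / 100.0);
        ShortMessage msg = new ShortMessage();
        msg.setMessage(ShortMessage.CONTROL_CHANGE, dissonanceChannel, CHANNEL_VOLUME_CC, volume);
        receiver.send(msg, -1);                      // -1: no time stamp, deliver right away
    }
}
```

With a score of 0, the dissonance track is effectively muted; as the score rises, the dissonances become increasingly audible on top of the otherwise unchanged music.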
4.2 Limitations and Discussion
In terms of control targets, we currently support only volume changes for dissonances and play MIDI
signals. The same principle (and much of the proto-
type’s code), though, could be used for WAVE-based
signals. Additionally, using any of the discussed
MIDI-based control targets (e.g., instrument changes; see sect. 3) should not require adding more than
a few lines of code to our implementation so that we
consider AudioCues a rather complete instantiation of
MultiSense for MIDI-based control targets.
Our implementation comprises all components
listed in figure 1 apart from the pub/sub system. The
prototype currently polls input data from Amazon
CloudWatch or a local generator component for test-
ing purposes; other monitoring solutions like Ganglia
or Nagios could be added by simply implementing a
client stub. We have currently implemented two metric analyzers that are parameterized with minimum and maximum values, which are mapped to scores of 0 and 100, respectively, and which interpolate linearly or quadratically in between.
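Such a linear analyzer could look roughly like the sketch below; the quadratic variant only changes the final mapping step, and parameter names are ours, so the actual prototype code may differ.

```java
/** Maps raw metric values to a score in [0, 100] via linear interpolation between min and max. */
class LinearMetricAnalyzer {
    private final double min; // raw value that corresponds to a score of 0
    private final double max; // raw value that corresponds to a score of 100

    LinearMetricAnalyzer(double min, double max) {
        this.min = min;
        this.max = max;
    }

    int score(double rawValue) {
        double normalized = (rawValue - min) / (max - min);
        double clamped = Math.max(0.0, Math.min(1.0, normalized));
        // A quadratic analyzer would return (int) Math.round(clamped * clamped * 100) instead.
        return (int) Math.round(clamped * 100);
    }
}
```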
Figure 3: Generated Metric Data and Resulting Score used in our Evaluation (x-axis: time [s]; y-axis: metric value and score).
The main disadvantage we see in AudioCues is that building the music graphs necessary for a full
workday, so as to avoid boredom and fatigue, is non-
trivial and time-consuming. Using one of the audio
stream-based control targets instead might be a more
feasible approach. Still, we found the event-based in-
teraction model of MIDI rather convenient and flex-
ible to use so that it is definitely worth a thought to
(mis)use MIDI as communication protocol for non-
MIDI control targets as well. Another solution could
be to analyze existing MIDI files for repetitive se-
quences and to generate the necessary instructions for
building a music graph from that information. Auto-
matically generating the dissonance track would be
relatively easy based on such an analyzed MIDI file.
Furthermore, the system may need to be calibrated
individually, i.e., where personal thresholds of awareness lie is likely to depend on the individual. To our knowledge, this has not been answered in a
general way yet (Ishii et al., 1998) and, thus, requires
individual adjustments for a production system.
Finally, the metric analyzer components require
some insight into “normal” system states, i.e., what
are the values that correspond to scores of zero and
one hundred for each metric, respectively. An estimate for what is normal can be obtained through benchmarking, e.g., (Cooper et al., 2010; Bermbach and Tai, 2014; Müller et al., 2014), but doing so obviously adds further effort to the system setup.
5 EVALUATION
Our evaluation comprises three parts: First, we have
implemented AudioCues as a proof-of-concept and
published its code as open source (sect. 5.1). Sec-
ond, we have recorded a sample output of this sys-
tem and published the recording on SoundCloud and
YouTube, so that readers can easily verify themselves
that AudioCues in fact works (sect. 5.2). Third, we
asked a number of people to listen to that recording
and answer a few questions based on that to gain in-
sight into their perception of dissonances while work-
ing on a primary task (sect. 5.3).
5.1 Proof-of-Concept Implementation
We implemented AudioCues as a research prototype
and made it available as open source (github.com/dbermbach/audiocues). Our proto-
type is implemented in Java 8, uses the AWS SDK
to connect to CloudWatch, and creates MIDI signals
through standard Java. AudioCues also plots each
observed metric over time using JFreeChart (www.jfree.org/jfreechart). This
proof-of-concept implementation shows that it is in-
deed possible to create varying MIDI output based on
the current system state.
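For readers who want to reproduce the polling step, the sketch below shows how average CPU utilization could be fetched from CloudWatch with the AWS SDK for Java 1.x; the metric namespace, period, and class names are example choices of ours and not necessarily those used in the prototype.

```java
import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.Datapoint;
import com.amazonaws.services.cloudwatch.model.Dimension;
import com.amazonaws.services.cloudwatch.model.GetMetricStatisticsRequest;

import java.util.Date;
import java.util.List;

/** Polls the average CPU utilization of one EC2 instance over the last five minutes. */
class CloudWatchPoller {
    private final AmazonCloudWatch cloudWatch = AmazonCloudWatchClientBuilder.defaultClient();

    List<Datapoint> pollCpuUtilization(String instanceId) {
        long now = System.currentTimeMillis();
        GetMetricStatisticsRequest request = new GetMetricStatisticsRequest()
                .withNamespace("AWS/EC2")
                .withMetricName("CPUUtilization")
                .withDimensions(new Dimension().withName("InstanceId").withValue(instanceId))
                .withStartTime(new Date(now - 5 * 60 * 1000))
                .withEndTime(new Date(now))
                .withPeriod(60)                // one data point per minute
                .withStatistics("Average");
        return cloudWatch.getMetricStatistics(request).getDatapoints();
    }
}
```

The returned data points would then be passed through a metric analyzer (see sect. 4.2) before the resulting scores drive the MIDI output.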
5.2 Recorded Output
To create a deterministic sample music output, we
have implemented a music graph where every node
has exactly one follower. Depending on current mon-
itoring state, the sample music adds tritone disso-
nances in different volumes which are easy to notice.
For the recording, we routed the MIDI output signals
of our tool through VST instruments of Cubase 5 (www.steinberg.de/en/products/cubase/start.html).
We had already seen in test runs that our sys-
tem works with CloudWatch. To ensure reproducibility in our evaluation, we thus artificially created a stream of cloud monitoring data with metric values between 10 and 110 (lower values are better, as is, e.g., the case for latency). These fake metric values were
then translated into normalized scores through linear
interpolation: 100 was mapped to a score of 100, 10
to a score of 0. Fig. 3 shows the created metric stream
and the resulting scores which we used for our eval-
uation. You can find a video with a recording of our
sample music graph based on the metric input from
figure 3 and the corresponding chart on YouTube (youtu.be/gWJtGZOp3K0).
The interested reader can use the video to easily
verify that the output of our research prototype is able
to convey different system states through varying de-
grees of dissonance in its audio output.
5.3 Empirical Study
We did a small empirical study to better understand
whether AudioCues works not only for the authors but
also for other people, and whether the degree of con-
centration of people indeed influences how fast they
become aware of the dissonances in our test music.
For this purpose, we asked a group of 18 col-
leagues to complete a questionnaire on the audio file
from sect. 5.2. Specifically, test persons were given
the following instructions: Read some basic informa-
tion on the goal of our research project. Start working
on something else while playing the audio file. When-
ever you notice a change in the audio signal, answer
the following three questions: (i) When you noticed
the change, what was your estimated level of concen-
tration on a scale from one (not concentrated at all) to
ten (fully concentrated on something else), (ii) when
did you first notice the change, and (iii) when, in your
opinion, is the system again in a stable state? Test
persons were allowed to pause playback but were ex-
plicitly asked not to fast-forward or rewind the audio
file. We also chose not to tell our test persons that
there would be three “events” in the audio file.
Afterwards, we, who were familiar with the audio file and the shape of figure 3, also listened to the audio file to identify the periods during which the
dissonances were audible. We identified the dissonance intervals [25; 97], [183; 281], and [368; 451], each given in seconds of playback. We did this to calcu-
late the “awareness delay”, i.e., the time between the
interval start and the value reported by a test person.
Expected Results. We expected that sudden
changes (e.g., the first event in our test) will have
a lower awareness delay than slower, continuous
changes (e.g., the last two events in our test). We
also expected that the awareness delay for continu-
ous changes increases strongly with the test person's level of concentration and that the awareness delay for sudden changes also increases with the test person's level of concentration but is affected less than in the case of continuous changes. (To avoid confusion: a high level of concentration means that the test person is paying no attention to our audio signal, instead devoting her full attention to some other task.)
Actual Results. In our study, we observed the expected results: Indeed, sudden changes had a lower
awareness delay than slower, continuous changes.
The first event had an average awareness delay of 5.5
seconds, whereas the second and third event had an
average awareness delay of 15.7 seconds. The only point for critique we see here is the small sample size (18 test persons). Regarding our other expectations,
the awareness delay indeed increased with the test
person’s level of concentration on some other task.
We cannot (and do not intend to) quantify this effect
but the results indicate that this seems to be the case.
Figures 4 and 5 show the awareness delay as a func-
tion of the individually estimated level of concentra-
tion for sudden changes (fig. 4) and slow, continuous
changes (fig. 5). We believe that the results are valid
even though the level of concentration was individ-
ually estimated since both full concentration and no
concentration are individual values as well. This also
conforms to the findings of (Vickers, 2011).
All in all, these findings indicate that AudioCues
in fact works: Depending on the individual level of
concentration, people become aware of changes in
dissonance volumes after different awareness delays
and sudden changes create instant awareness. We be-
lieve that this is very useful in practice as DevOps
engineers will in periods of high concentration only
be interrupted by important events (i.e., sudden or ex-
tensive changes) whereas they are likely to instantly
become aware of small changes during periods of low
concentration. This, of course, needs further analysis, which, however, is beyond the scope of this paper.
Figure 4: Awareness Delay as a Function of Concentration: Sudden Changes (x-axis: concentration; y-axis: time [s]).
Figure 5: Awareness Delay as a Function of Concentration: Slow, Continuous Changes (x-axis: concentration; y-axis: time [s]).
6 RELATED WORK
In this section, we will discuss related work starting
with approaches that directly relate to AudioCues be-
fore broadening the scope of the discussion.
AudioCues and Sonification: Sonification-based
process monitoring is used in many areas of appli-
cation, e.g., health care, industrial plants, environ-
mental awareness, or home monitoring (Fitch and
Kramer, 1994; Gaver et al., 1991; Rauterberg and
Styger, 1994; Hermann et al., 2003; Bakker et al.,
2010; Schmandt and Vallejo, 2003; Tran and Mynatt,
2000). In the early days of computing, auditory out-
put of computers was also used for monitoring of
CPU activity (Vickers and Alty, 2003).
Later, some efforts were made to sonify log
data (Dzielak, 2014; Tarbox, 2008) and to enhance
debugging through an acoustic component (Jame-
son, 1994b; Jameson, 1994a; Finlayson and Mellish,
2005). These systems all transform static files, e.g.,
logs or source code, into a sequence of sounds which
does not qualify as background music.
Other approaches focus on live sonification of
data streams comparable to our MultiSense and Au-
dioCues: The ShareMon system (Cohen, 1994) raises
awareness of file sharing by using audio to notify
users of related events. The Peep system (Gilfix and
Couch, 2000) plays natural sounds to sonify network
state; this enables peripheral real-time network mon-
itoring. In the context of web servers, others propose
model-based sonification of HTTP requests using the
SuperCollider programming language (Ballora et al.,
2010; Hermann and Ritter, 1999) or the playback of user-defined sound effects to notify web site creators of
visitors (Liechti et al., 1999). All of these approaches
trigger discrete sounds based on discrete input events.
They are disruptive due to their event-based character
and do not support a continuous output “scale”.
More closely related to our approach is Web-
Melody (Barra et al., 2001) which proposes to com-
bine sounds triggered by web server events with user-
selected music to support peripheral monitoring with-
out fatigue. Unlike AudioCues, however, which trans-
lates continuous changes in a data stream into grad-
ual changes of music, the WebMelody system trig-
gers predefined binary sound events and plays them
along with the music stream. WebMelody could en-
hance AudioCues by adding highly noticeable binary
alarms for select critical system states.
All in all, while there are alternative approaches
proposing auditory displays for system monitoring,
none of these systems uses sonification techniques similar to those of AudioCues or offers non-binary output.
Furthermore, all other approaches target a very spe-
cific data source for sonification and are, thus, miss-
ing the broader scope proposed through MultiSense.
MultiSense and Peripheral Displays: To our
knowledge, there is no broad platform comparable to MultiSense (early ideas on the MultiSense architecture have been published as a poster (Bermbach and Eberhardt, 2016)), neither as an architectural concept and framework as in MultiSense nor as a prototypical implementation. However, there is some work on pe-
ripheral visual displays which have been proposed to
enable monitoring for a particular use case, e.g., Live Wire (Weiser and Brown, 1997), Water Lamp, or Pin-
wheels (Dahley et al., 1998). These could be used as
alternative control targets in MultiSense.
Miscellaneous: There is a lot of ongoing research
on cloud monitoring; the focus, however, seems to be
on collection and analysis of data rather than on pre-
senting the information to developers, e.g., (Kanstrén et al., 2015; Bermbach and Tai, 2014; Alcaraz Calero
and Gutierrez Aguado, 2015). There is a plethora of
real-time dashboards, e.g., Grafana (grafana.org), which require
the user’s attention to observe the graphically pre-
sented data. To our knowledge, there is no dashboard
yet that considers using the subconscious or periph-
eral perception to transmit information to the user.
Alternatives to MultiSense are systems that autonomously decide when an alarm should be raised.
Threshold-based approaches, e.g., Amazon Au-
toScale, notify or take action as soon as a metric ex-
ceeds a specified threshold value. More advanced
techniques, e.g., based on machine learning (Islam
et al., 2012) or linear prediction models (Dinda and
O’Hallaron, 2000), reduce false positive or negative
alarms and are better equipped to deal with random
fluctuations around a static threshold. However, all
these approaches still force the developer to trade off the timeliness of an alarm against the rate of
false negatives due to the binary nature of alarms.
Benchmarking, e.g., (Cooper et al., 2010;
Bermbach and Tai, 2014; Müller et al., 2014; Bermbach and Wittern, 2016), can quantify the quality of
a cloud system before deployment of an application.
While it cannot replace monitoring, it may be useful
for calibrating MultiSense and AudioCues.
7 CONCLUSION
In this paper, we have proposed a framework and ap-
proach that leverages the ability of the human sub-
conscious to detect deviations from a “normal” state.
To reach this goal, we use cloud monitoring data to
control various aspects of developers’ environments,
thus, enabling them to subconsciously become aware
of faulty system states, e.g., through color changes
in the ambient lighting or dissonances in the music
output. Existing approaches, in contrast, can only express binary state changes (alarms) or require developers to dedicate their full attention to observing
monitoring data.
For this purpose, we started with a discussion of
the foundations of sonification, i.e., the process of
translating input data into audible signals, and came
to the conclusion that music is, indeed, a very suitable
transport medium for our purposes. Next, we intro-
duced MultiSense, a high-level architecture concept
and framework for manipulating various control tar-
gets in the surrounding environment of developers.
We also discussed three groups of different control
targets (hearing, sight, and feeling) and the differ-
ent ways in which these can be used to transmit a con-
tinuous stream of information on various cloud mon-
itoring metrics to a DevOps engineer. As the generic
MultiSense is, so far, only an architectural concept
lacking an implementation, we then presented Au-
dioCues. AudioCues is an instantiation of MultiSense
and continuously changes music output based on the
current state of cloud services. We also discussed the
AudioCues prototype which adds dissonances in dif-
ferent volumes to the music output depending on, e.g.,
Amazon CloudWatch data.
Afterwards, we evaluated MultiSense and, espe-
cially, its instantiation AudioCues in three different
ways: First, we demonstrated through our proof-of-
concept implementation that it is indeed possible to
build a system that adds dissonances in different vol-
umes to music output based on monitoring data. This
prototype is available on GitHub. Second, we cre-
ated a sample music track and had it played by our
prototype. Based on artificially injected monitoring
data, we recorded this output for readers to verify
themselves that they can distinguish different volume
settings for the dissonance track and, thus, gradually
become aware of changes in the system state. This
recording is available as an audio file on SoundCloud
and as a video on YouTube. Third, we asked a num-
ber of test persons to listen to this sample recording
while working on something else. Our results show
that people notice dissonances, albeit later if they are
fully concentrated on another task. This demonstrates
that our overall approach works: DevOps engineers
can use AudioCues for background monitoring and
will become aware of severe changes right away while
noticing minor changes only in periods of low con-
centration. Finally, we discussed a comprehensive
list of related approaches, all of which support only
binary event-based output and mostly come from dif-
ferent application domains.
In future work, we aim to extend our AudioCues
prototype to manage additional control targets, which are becoming increasingly available through the IoT.
REFERENCES
Alcaraz Calero, J. and Gutierrez Aguado, J. (2015). Mon-
paas: An adaptive monitoring platformas a service for
cloud computing infrastructures and services. IEEE
TSC.
Bakker, S., Van Den Hoven, E., and Eggen, B. (2010). Ex-
ploring interactive systems using peripheral sounds.
In Haptic and Audio Interaction Design. Springer.
Ballora, M., Panulla, B., Gourley, M., and Hall, D. L.
(2010). Preliminary steps in sonifying web log data.
International Community for Auditory Display, Wash-
ington, DC.
Barra, M., Cillo, T., De Santis, A., Petrillo, U. F., Negro,
A., Scarano, V., Matlock, T., and Maglio, P. P. (2001).
Personal webmelody: Customized sonification of web
servers. Proc. of ICAD 2001.
Bass, L., Weber, I., and Zhu, L. (2015). DevOps: A Soft-
ware Architect’s Perspective. Addison-Wesley Profes-
sional.
Bermbach, D. and Eberhardt, J. (2016). Towards audio-
visual cues for cloud infrastructure monitoring. In
Proc. of IC2E 2016. IEEE.
Bermbach, D. and Tai, S. (2014). Benchmarking eventual
consistency: Lessons learned from long-term experi-
mental studies. In Proc. of IC2E 2014. IEEE.
Bermbach, D. and Wittern, E. (2016). Benchmarking web
api quality. In Proc. of ICWE. Springer.
Cohen, J. (1994). Monitoring background activities. In
Santa Fe Institute Studies in the Sciences of Complex-
ity Proceedings. Addison-Wesley.
Cooper, B. F., Silberstein, A., Tam, E., Ramakrishnan, R.,
and Sears, R. (2010). Benchmarking cloud serving
systems with ycsb. In Proc. of SOCC, SOCC ’10.
ACM.
Dahley, A., Wisneski, C., and Ishii, H. (1998). Water lamp
and pinwheels: ambient projection of digital infor-
mation into architectural space. In Proc. of CHI 98.
ACM.
Dinda, P. A. and O’Hallaron, D. R. (2000). Host load
prediction using linear models. Cluster Computing,
3(4):265–280.
Dzielak, J. (2014). Listen to your log files with aud.
http://dzello.com/blog/2014/01/10/listen-to-your-log-files-with-aud/ (accessed Sept 11, 2015).
Finlayson, J. L. and Mellish, C. (2005). The audioview: providing a glance at java source code. In Proc. of
ICAD.
Fitch, W. T. and Kramer, G. (1994). Sonifying the body
electric: Superiority of an auditory over a visual dis-
play in a complex, multivariate system. In Santa Fe
Institute Studies in the Sciences of Complexity Pro-
ceedings. Addison-Wesley.
Gaver, W. W., Smith, R. B., and O’Shea, T. (1991). Effec-
tive sounds in complex systems: The arkola simula-
tion. In Proc. of CHI. ACM.
Gilfix, M. and Couch, A. L. (2000). Peep (the network
auralizer): Monitoring your network with sound. In
LISA, pages 109–117.
Hermann, T., Drees, J. M., and Ritter, H. (2003). Broad-
casting auditory weather reports-a pilot project. Proc.
of ICAD.
Hermann, T. and Ritter, H. (1999). Listen to your data:
Model-based sonification for data analysis. Advances
in intelligent computing and multimedia systems.
Ishii, H., Wisneski, C., Brave, S., Dahley, A., Gorbet, M.,
Ullmer, B., and Yarin, P. (1998). ambientroom: in-
tegrating ambient media with architectural space. In
Proc. of CHI 98, pages 173–174. ACM.
Islam, S., Keung, J., Lee, K., and Liu, A. (2012). Empirical
prediction models for adaptive resource provisioning
in the cloud. Future Generation Computer Systems.
Jameson, D. H. (1994a). The run-time components of son-
nett. Proc. of ICAD 1994.
Jameson, D. H. (1994b). Sonnet: Audio-enhanced moni-
toring and debugging. In Santa Fe Institute Studies
in the Sciences of Complexity Proceedings. Addison-
Wesley.
Jenkins, J. J. (1985). Acoustic information for objects,
places, and events. Persistence and change, pages
115–138.
Kanstrén, T., Lehtonen, S., Savola, R., Kukkohovi, H., and Hätönen, K. (2015). Architecture for high confidence cloud security monitoring. In Proc. of IC2E 2014. IEEE.
Liechti, O., Sifer, M., and Ichikawa, T. (1999). A non-
obtrusive user interface for increasing social aware-
ness on the world wide web. Personal Technologies,
3(1-2):22–32.
Lucas, P. A. (1994). An evaluation of the communicative
ability of auditory icons and earcons. Proc. of ICAD
1994.
Maglio, P. P. and Campbell, C. S. (2000). Tradeoffs in
displaying peripheral information. In Proc. of CHI.
ACM.
Müller, S., Bermbach, D., Tai, S., and Pallas, F. (2014).
Benchmarking the performance impact of transport
layer security in cloud database systems. In Proc. of
IC2E 2014. IEEE.
Mynatt, E. D., Back, M., Want, R., Baer, M., and Ellis, J. B.
(1998). Designing audio aura. In Proc. of CHI 1998,
pages 566–573. ACM.
Rauterberg, M. and Styger, E. (1994). Positive effects of
sound feedback during the operation of a plant simu-
lator. In Human-Computer Interaction, pages 35–44.
Springer.
Schmandt, C. and Vallejo, G. (2003). “listening” to do-
mestic environments from remote locations. Proc. of
ICAD.
Tarbox, B. (2008). Log4jfugue. www.log4jfugue.org
(accessed Sept 11, 2015).
Tran, Q. T. and Mynatt, E. D. (2000). Music monitor: Am-
bient musical data for the home. Extended Proceed-
ings of the HOIT, pages 85–92.
Vickers, P. (2011). Sonification for process monitoring. In
Hermann, T., Hunt, A., and Neuhoff, J., editors, The
Sonification Handbook, pages 455–492. Logos Ver-
lag, Berlin.
Vickers, P. and Alty, J. L. (2003). Siren songs and swan
songs: Debugging with music. Communications of the
ACM, 46(7):86–93.
Weiser, M. and Brown, J. S. (1997). The coming age of
calm technology. In Beyond calculation, pages 75–
85. Springer.