virtual orchestra". The aim of this project is to
provide means for musicians to play across the
Internet in real time (see Figure 1) (Bouillot N., 2003; Locher H.-N. et al., 2003).
The application constraints are as follows:
- the musicians are physically separated but must
play virtually "together" in real time.
- the sound engineer must be able to adjust the audio parameters of the various sound sources in real time (e.g., to add reverberation effects).
- the public must be able to attend the concert virtually, either at home through standard audio/video streaming, or in a room with a dedicated installation.
In this paper, we are interested only in the part concerning the musicians, since it is the critical part in terms of interactivity. Our application uses jMax, a visual programming environment dedicated to interactive real-time music and multimedia applications (Déchelle F., 2000). jMax was developed at IRCAM. It is composed of two parts: FTS (for "faster than sound"), a real-time sound processing engine, and a graphical user interface that allows the user to add, remove, or connect components that exchange audio samples or discrete values. Examples of components available in jMax include the inputs/outputs of the sound device, arithmetic operations, and digital audio filters. Since jMax is often used for audio synthesis, it has an interface with the operating system's sound layer, which under Linux is ALSA (Advanced Linux Sound Architecture).
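To make this FTS/ALSA interface concrete, the sketch below opens an ALSA capture device at the CD quality used in our experiments. It is only a hedged illustration, not the actual FTS code; the device name "default", the mono channel count, and the 20 ms latency target are assumptions made for the example.

/* Minimal sketch (not the actual FTS code): opening an ALSA capture
   device for 16-bit PCM at 44100 Hz.  Compile with -lasound. */
#include <alsa/asoundlib.h>

static snd_pcm_t *open_capture(void)
{
    snd_pcm_t *pcm;
    if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_CAPTURE, 0) < 0)
        return NULL;
    /* CD-quality format; the 20000 us latency target matches the
       20 ms perception threshold discussed below. */
    if (snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                           SND_PCM_ACCESS_RW_INTERLEAVED,
                           1 /* channel */, 44100 /* Hz */,
                           1 /* soft resample */, 20000 /* us */) < 0) {
        snd_pcm_close(pcm);
        return NULL;
    }
    return pcm;
}

Samples would then be read with snd_pcm_readi() and handed to the processing components.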
Since we are close to virtual-reality conditions, sound quality, the feeling of presence, and synchronism among the musicians are crucial. For this reason, the technology developed for the distributed concert uses uncompressed sound (16-bit PCM samples at 44100 Hz, corresponding to audio CD quality). In addition, we use multicast with the RTP protocol (Schulzrinne et al., 1998) for the communication among the musicians. To provide the feeling of presence during our experiments, videoconferencing software allowed the musicians to see each other remotely.
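As an illustration of this transport choice, the sketch below sends one packet of uncompressed 16-bit PCM to an IP multicast group with a hand-built RTP header (the RFC 3550 fixed header). It is an assumed example rather than the project's code; the dynamic payload type 96, the maximum packet size, and the omission of byte-order conversion for the samples are simplifications.

/* Sketch (assumed, not the project's code): one RTP packet of
   uncompressed PCM sent to a multicast group over UDP. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#define MAX_SAMPLES 512              /* samples per packet, illustrative */

struct rtp_header {                  /* RFC 3550 fixed header */
    uint8_t  vpxcc;                  /* version, padding, ext, CSRC count */
    uint8_t  mpt;                    /* marker + payload type */
    uint16_t seq;                    /* sequence number */
    uint32_t ts;                     /* timestamp, counted in samples */
    uint32_t ssrc;                   /* source identifier */
};

int send_pcm_packet(int sock, const struct sockaddr_in *group,
                    const int16_t *samples, int nsamples,
                    uint16_t seq, uint32_t ts, uint32_t ssrc)
{
    uint8_t buf[sizeof(struct rtp_header) + MAX_SAMPLES * sizeof(int16_t)];
    struct rtp_header h = {
        .vpxcc = 2 << 6,             /* RTP version 2 */
        .mpt   = 96,                 /* dynamic payload type (assumed) */
        .seq   = htons(seq),
        .ts    = htonl(ts),          /* advanced by nsamples per packet */
        .ssrc  = htonl(ssrc),
    };
    if (nsamples > MAX_SAMPLES)
        return -1;
    memcpy(buf, &h, sizeof h);
    memcpy(buf + sizeof h, samples, nsamples * sizeof(int16_t));
    return sendto(sock, buf, sizeof h + nsamples * sizeof(int16_t), 0,
                  (const struct sockaddr *)group, sizeof *group);
}

Because every receiver joins the same multicast group, a single transmission per site is enough to reach all the other sites.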
Usually, musical interaction (all musicians in the same room) relies on a perception shared by the musicians: sound and visual events are perceived instantaneously, simultaneously, and with a quality limited only by the capabilities of human ears and eyes.
During networked performances, we can provide the best possible sound quality, but we cannot provide instantaneity. In fact, we estimate 20 ms to be the threshold above which the human ear perceives the shifts. For this reason, we ensure a global simultaneity among the musicians through a synchronization mechanism described in (Bouillot N., 2003; Bouillot N. and Gressier-Soudan E., 2004) and implemented inside nJam (for network Jam), a plugin of jMax. This synchronization ensures that the feedback of the overall music mix is identical for all the musicians. nJam handles the diffusion of the sound through multicast with RTP, the synchronization of the audio streams, and the shift between the musicians. The musicians thus specify a tempo, as well as the shift in musical units (a beat, an eighth note, a sixteenth note, etc.). This parameter gives them a shifted feedback that is synchronized and falls on the beats of the music they are playing on their instruments.
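The conversion of this musical shift into samples is simple arithmetic; the sketch below is our own illustration of it (not the nJam code), assuming a quarter-note beat and the 44100 Hz sample rate.

/* Sketch (assumed): converting a shift expressed in musical units at
   a given tempo into a number of samples of delayed feedback. */
#include <math.h>

#define SAMPLE_RATE 44100

/* unit_divisor relative to a quarter-note beat:
   1 = one beat, 2 = eighth note, 4 = sixteenth note, ... */
long shift_in_samples(double tempo_bpm, int unit_divisor)
{
    double beat_seconds = 60.0 / tempo_bpm;
    return lround((beat_seconds / unit_divisor) * SAMPLE_RATE);
}

For instance, at 120 BPM a beat lasts 0.5 s, i.e. 22050 samples, and an eighth-note shift corresponds to 11025 samples (250 ms), well above the 20 ms perception threshold.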
We can extract some constraints for the operating system and the network: each instance of nJam sends only one audio stream on the network and receives N streams (if N sites are involved). In addition, FTS manages at least one component corresponding to an input (microphone or instrument) and one component for each output of the sound device. Within the RTP protocol, the isochronism of the audio data is ensured by timestamps that count audio samples. Additionally, each site is driven by its own clock, derived from the local sound device. At each site-clock tick, one 16-bit sample is produced and one sample coming from each source is consumed. Thus, the end-to-end timing constraints on delivery are crucial, both for the network part and for the FTS/ALSA system part.
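The sketch below illustrates how such sample-counting timestamps can drive reception: each incoming packet is placed in a per-source ring buffer at the position given by its timestamp, and one sample per source is consumed at each local tick. This is an assumed illustration of the mechanism, not the FTS/nJam implementation; the ring size and the local sample counter kept aligned by the synchronization mechanism are assumptions.

/* Sketch (assumed): placing received samples by RTP timestamp and
   consuming one sample per source at each sound-card tick. */
#include <stdint.h>

#define RING_SIZE 8192               /* samples, illustrative */

struct source {
    int16_t ring[RING_SIZE];
};

/* Store an incoming packet at the offset given by its timestamp. */
void on_packet(struct source *s, uint32_t ts,
               const int16_t *samples, int nsamples)
{
    for (int i = 0; i < nsamples; i++)
        s->ring[(ts + i) % RING_SIZE] = samples[i];
}

/* Called once per site-clock tick; tick is the local sample counter,
   assumed to be kept aligned across sites by the synchronization. */
int16_t on_tick(const struct source *s, uint32_t tick)
{
    return s->ring[tick % RING_SIZE];
}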
In this paper, we focus on scheduling the various application components using a real-time scheduling technique.
3 BOSSA
The distributed virtual orchestra is an application written in C whose execution environment is the Linux system. From a system point of view, this application must share resources, in particular the processor and the peripherals (network device and sound device). However, it requires guaranteed periodic access to these resources. In this context, using a real-time Linux system (such as RT-Linux (http://www.fsmlabs.com) or RTAI) would enable us to try out scheduling policies not available on the standard Linux system. Nevertheless, to benefit from these policies, the target application must follow the structure of real-time tasks and include specific calls to the real-time Linux library. To avoid modifying the source code that implements the application logic, we chose instead to use Bossa. Indeed, to our knowledge, it is the only platform which can