data. For example, NASA continuously receives data
from space probes. Earthquake and weather sensors
produce data streams as do web sites and telephone
systems.
In this paper we investigate issues related to mem-
ory management that need to be addressed for large
scale data stream recorders (Zimmermann et al.,
2003). After introducing some of the related work in
Section 2 we present a memory management model
in Section 3. We formalize the model and compute its
complexity in Section 4. We prove that, due to the
combination of a large number of system parameters
and user service requirements, the problem is
exponentially hard. Conclusions and future work are
contained in Section 5.
2 RELATED WORK
Managing the available main memory efficiently is
a crucial aspect of any multimedia streaming sys-
tem. A number of studies have investigated buffer
and cache management. These techniques can be
classified into three groups: (1) server buffer man-
agement (Makaroff and Ng, 1995; Shi and Ghande-
harizadeh, 1997; Tsai and Lee, 1998; Tsai and Lee,
1999; Lee et al., 2001), (2) network/proxy cache man-
agement (Sen et al., 1999; Ramesh et al., 2001; Chae
et al., 2002; Cui and Nahrstedt, 2003) and (3) client
buffer management (Shahabi and Alshayeji, 2000;
Waldvogel et al., 2003). Figure 1 illustrates where
memory resources are located in a distributed envi-
ronment.
In this paper we aim to optimize the usage of server
buffers in a large scale data stream recording system.
This focus falls naturally into the first category above.
To the best of our knowledge, no prior work has
investigated this issue in the context of the design of
a large scale, unified architecture that serves both
stream retrieval and stream recording simultaneously.
3 MEMORY MANAGEMENT OVERVIEW
A streaming media system requires main memory to
temporarily hold data items while they are transferred
between the network and the permanent disk storage.
For efficiency reasons, network packets are generally
much smaller than disk blocks. The assembly of in-
coming packets into data blocks and conversely the
partitioning of blocks into outgoing packets requires
main memory buffers. A widely used solution in
servers is double buffering.

Table 1: Parameters for a current high-performance
commercial disk drive.

  Model                    ST336752LC
  Series                   Cheetah X15
  Manufacturer             Seagate Technology, LLC
  Capacity C               37 GB
  Transfer rate R_D        See Figure 2
  Spindle speed            15,000 rpm
  Avg. rotational latency  2 msec
  Worst-case seek time     ≈ 7 msec
  Number of zones Z        9

For example, one buffer
is filled with a data block that is coming from a disk
drive while the content of the second buffer is emp-
tied (i.e., streamed out) over the network. Once the
buffers are full/empty, their roles are reversed.
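The role-swapping scheme just described can be sketched as follows. This is a minimal illustration, not the paper's implementation: the block size and the read/send callbacks are assumptions, and the fill and drain steps run sequentially here, whereas a real server overlaps them.

```python
# Minimal double-buffering sketch: one buffer is filled with a block from
# disk while the other is drained to the network; then the roles swap.
# Block size and the callbacks are illustrative assumptions. In a real
# server the fill and drain run concurrently; here they are sequential.

BLOCK_SIZE = 512 * 1024  # bytes; assumed disk block size


def stream_out(read_block, send_block, num_blocks):
    """Retrieve num_blocks blocks using two buffers with swapped roles."""
    buffers = [bytearray(BLOCK_SIZE), bytearray(BLOCK_SIZE)]
    fill, drain = 0, 1
    buffers[fill] = read_block(0)          # prime the first buffer
    for i in range(1, num_blocks + 1):
        fill, drain = drain, fill          # reverse the buffer roles
        if i < num_blocks:
            buffers[fill] = read_block(i)  # fill one buffer ...
        send_block(buffers[drain])         # ... while draining the other


# Toy usage: the "disk" is a list of blocks, the "network" collects them.
disk = [bytes([b]) * BLOCK_SIZE for b in range(3)]
sent = []
stream_out(lambda i: bytearray(disk[i]), sent.append, len(disk))
```

For recording, the roles are mirrored: one buffer assembles incoming network packets into a block while the other is written to disk.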
With a stream recorder, double buffering is still the
minimum that is required. With additional buffers
available, incoming data can be held in memory
longer and the deadline by which a data block must
be written to disk can be extended. This can reduce
disk contention and hence the probability of missed
deadlines (Aref et al., 1997). However, in our in-
vestigation we are foremost interested in the minimal
amount of memory that is necessary for a given work-
load and service level. Hence, we assume a double
buffering scheme as the basis for our analysis. In a
large scale stream recorder the number of streams to
be retrieved versus the number to be recorded may
vary significantly over time. Furthermore, the write
performance of a disk is usually significantly less than
its read bandwidth (see Figure 2b). Hence, these fac-
tors need to be considered and incorporated into the
memory model.
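The read/write asymmetry can be made concrete with a back-of-the-envelope service-time calculation. The seek and rotational-latency figures below come from Table 1; the block size and the read/write transfer rates are illustrative assumptions (the drive's actual rates are zone-dependent, see Figure 2).

```python
# Worst-case service time for one block = seek + rotational latency +
# transfer time. Seek (7 ms) and rotational latency (2 ms) are from
# Table 1; block size and transfer rates are illustrative assumptions.

SEEK_MS = 7.0       # worst-case seek time (Table 1)
ROT_MS = 2.0        # average rotational latency (Table 1)
BLOCK_MB = 0.5      # 512 KB block (assumption)
READ_MBPS = 50.0    # assumed read transfer rate
WRITE_MBPS = 35.0   # assumed, lower, write transfer rate


def service_time_ms(block_mb, rate_mbps):
    """Seek + rotation + transfer time for one block, in milliseconds."""
    return SEEK_MS + ROT_MS + 1000.0 * block_mb / rate_mbps


t_read = service_time_ms(BLOCK_MB, READ_MBPS)    # 19.0 ms
t_write = service_time_ms(BLOCK_MB, WRITE_MBPS)  # about 23.3 ms
```

Under these assumed rates, writing a block takes noticeably longer than reading one, which is why the mix of recording versus retrieval streams must enter the memory model.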
When designing an efficient memory buffer man-
agement module for a data stream recorder, one can
classify the interesting problems into two categories:
(1) resource configuration and (2) performance opti-
mization.
In the resource configuration category, a
representative problem is: what is the minimum
memory or buffer size needed to satisfy given
playback and recording service requirements? These
requirements depend on the higher-level QoS
requirements imposed by the end user or the
application environment.
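As one concrete instance of this configuration question: under plain double buffering, every active stream (playback or recording) holds two block-sized buffers, so a simple lower bound on memory is 2 x B x N. The block size and stream counts below are illustrative assumptions, not values from the paper.

```python
# Minimum buffer memory under plain double buffering: each stream,
# whether playback or recording, needs two block-sized buffers.
# Block size and stream counts are illustrative assumptions.

def min_memory_bytes(block_size, n_playback, n_record):
    """Lower bound on buffer memory: two blocks per active stream."""
    return 2 * block_size * (n_playback + n_record)


# e.g. 512 KB blocks, 40 playback streams and 10 recording streams:
m = min_memory_bytes(512 * 1024, 40, 10)
print(m / (1024 * 1024), "MB")  # 50 streams * 1 MB each = 50.0 MB
```

This bound ignores the extended write deadlines and read/write asymmetry discussed above, which is precisely what makes the full configuration problem hard.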
In the performance optimization category, a
representative problem is: given a certain amount of
memory or buffer space, how can system performance
be maximized with respect to chosen performance
metrics? Two typical performance metrics are:
(i) maximize the total number of supportable streams;
(ii) maximize disk I/O parallelism, i.e., minimize the
total number of parallel disk I/Os.
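Metric (i) is the inverse of the configuration question above: with a memory budget M and a per-stream cost of two block-sized buffers, at most floor(M / 2B) streams are supportable. The budget and block size below are assumed numbers for illustration.

```python
# Metric (i): maximum number of supportable streams for a given memory
# budget, the inverse of the minimum-memory question. Under double
# buffering each stream costs two block-sized buffers. Values assumed.

def max_streams(memory_bytes, block_size):
    """Streams supportable at two block-sized buffers per stream."""
    return memory_bytes // (2 * block_size)


budget = 256 * 1024 * 1024              # 256 MB of buffer memory (assumption)
print(max_streams(budget, 512 * 1024))  # 256 MB / 1 MB per stream = 256
```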
MEMORY MANAGEMENT FOR LARGE SCALE DATA STREAM RECORDERS