In this paper we investigate the use of a special-
ized data warehousing database management system
as a data back-end for dynamic web applications and
assess the performance of this solution. We have used
the Monet
2
(Boncz, 2002) database as a drop-in re-
placement for a traditional database, and performed
benchmarks comparing its performance to the perfor-
mance of two of the most commonly used databases
for web applications: MySQL
3
and PostgreSQL
4
.
The web application benchmark used was the indus-
try standard TPC-W (TPC, 2002), representing a typ-
ical web store. Our main contribution is to show for
the first time how a read-optimized database performs
in comparison to established general purpose DBMSs
for the domain of web applications. As more systems
are built to take advantage of the 10 fold expected per-
formance increases for some domains, we believe this
type of cross domain comparison will became a use-
ful tool for the definition of utility relations among
different database managers and application domains.
This paper is structured as follows. Section 2
discusses possible matches and mismatches of read-
optimized databases to the data storage requirements
of dynamic web applications. Section 3 describes
TPC-W and argues that it is the correct choice of
workload to fairly compare Monet, MySQL and Post-
greSQL. Section 4 describes the experimental setup
used, and Section 5 is devoted to the analysis and dis-
cussion of the results. A summary of the contributions
and final comments are contained in Section 6.
2 READ-OPTIMIZED
DATABASES FOR DYNAMIC
WEB APPLICATIONS
Initially created to implement simple, static query in-
terfaces to databases, web applications are now capa-
ble of computing their output dynamically as a func-
tion of the clients needs, querying and updating an
underlying database as necessary. Today, these appli-
cations enable all types of activities and provide ser-
vices as diverse as on-line shopping, home banking,
auctions, airline reservation, etc. In most cases, web
applications retain a property of their initial incarna-
tion as database query interfaces: the majority of ac-
cesses to the database is read-only. Current estimates
put the read/update ratio at 85% in a typical web shop
and nearly 99% in a bulletin-board application with
a large user base (Amza et al., 2002). This read-
2
http://monetdb.cwi.nl/
3
http://www.mysql.com/
4
http://www.postgresql.org/
intensive behavior is more related to OLAP work-
loads than OLTP workloads. However, until very re-
cently, no one had questioned the use of convention-
ally optimized DBMSs to deploy dynamic web appli-
cations.
Data warehousing is a booming business, al-
ready representing 1/3 of the database market in
2005 (Stonebraker et al., 2007). The basic idea is to
aggregate information from the many disjoint produc-
tion systems in a single large database, to be used for
business intelligence purposes. During the data ag-
gregation process, data is copied from write-intensive
OLTP systems to computation and read-intensive
OLAP systems, while it is filtered, re-organized and
indexed for analytical processing. A database opti-
mized for this type of workload is different from a
normal database as it is read-optimized. OLTP op-
timized databases store tables clustered by rows, be-
cause this allows a simple mapping of tables to a stor-
age model optimized for the access of complete rows.
Read-optimized databases, by contrast, typically need
to access many rows of which only a few column val-
ues are used. Thus, read-optimized databases can be
organized using vertical fragmentation, keeping data
from one or more columns together (Boncz, 2002, pp.
41-42). Data organized in columns, among others
optimizations, define a read-optimized database and
is the central focus of the two more prominent new
generation OLAP databases: Monet (Boncz, 2002;
Boncz et al., 2005) and C-Store (Stonebraker et al.,
2005).
Monet has interesting features, it was designed to
extract maximum database performance from mod-
ern hardware, especially for complex queries. It opti-
mizes the use of processor registers and cache mem-
ory through data structures optimized for main mem-
ory, vertical fragmentation and vectorized query exe-
cution. Monet is being actively developed as an open
source project, has a SQL query interface and bind-
ings to many languages such as Java, Perl and Ruby.
It is a mature project that closely matches the ex-
pected functionality of a DBMS. C-Store’s main fo-
cus is efficient
ad-hoc
read-only queries and the main
optimizations used are vertical fragmentation, care-
ful coding and packaging of objects, implementation
of non-traditional transactions and use of bitmap in-
dexes. Despite all these interesting characteristics,
C-Store is currently only a prototype with restricted
availability.
To assess the applicability of a read-optimized
database to the construction of web applications, we
decided to test one of these systems as a drop-in re-
placement for a traditional DBMS in a web applica-
tion built using the Java language and the Java Enter-
WEBIST 2008 - International Conference on Web Information Systems and Technologies
74