developers can change the schema of the
transactional database with very little worry about
affecting the reporting database. This freedom would
not be possible without the introduction of the second
database.
Another benefit of developing systems with two
databases is that, because transactions require fast
data retrieval, insertion, and update for only small
amounts of data, it may not be necessary to employ
a fully featured commercial database management
system (DBMS). We have participated in successful
projects that use XML documents, MongoDB, or
other similar products.
MongoDB is an excellent example of an open
source, document-oriented NoSQL database
management system. It provides a very different
platform from the traditional RDBMS in terms of
storing and retrieving information. For developers
familiar with an RDBMS, MongoDB provides
indexes, dynamic queries, replication, and
auto-sharding, as well as retrieving, inserting,
deleting, and updating records. MongoDB is easy to
learn and use because, compared with a fully
functioning RDBMS such as Oracle 11g or SQL
Server 2012, it supports a very limited set of
features, which is not a weakness considering its
purpose.
Besides differences in terminology, one of the
main differences between a MongoDB database
(called an instance) and a database supported by an
RDBMS is that a MongoDB collection, similar to a
table in a relational database, can hold any type of
object. Every record can differ in format, in the
data types of its fields, and even in the number of
fields (columns, in relational database terms). In
addition, a record can contain arrays as a field's
data type. This is MongoDB's approach to handling
the traditional one-to-many relationship between
two entities; as a result, it does not support the
relational algebra's join operation and is certainly
not relational.
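The record structure described above can be sketched in plain Python, with a list of dictionaries standing in for a collection. This is only an illustration under stated assumptions: no MongoDB server or driver is involved, and the collection name, field names, and values are all hypothetical.

```python
# A plain-Python stand-in for a MongoDB collection: a list of dict "documents".
# All names and values are hypothetical; no MongoDB server or driver is used.
orders = [
    # One-to-many: the line items live in an array inside the order record.
    {
        "_id": 1,
        "customer": "Acme",
        "items": [
            {"sku": "A-100", "qty": 2},
            {"sku": "B-200", "qty": 1},
        ],
    },
    # A record in the same collection with different fields and data types.
    {"_id": 2, "customer": {"name": "Zenith", "region": "EU"}, "rush": True},
]


def field_names(document):
    """Return the sorted top-level field names of one document."""
    return sorted(document.keys())


def item_count(document):
    """Count entries on the "many" side; records without the array count as 0."""
    return len(document.get("items", []))


for doc in orders:
    print(doc["_id"], field_names(doc), item_count(doc))
```

Both records sit in the same collection even though they share only two field names and store "customer" with different data types, which a relational table would not allow.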
It is beyond the scope of this paper to discuss
whether MongoDB's approach to handling
one-to-many relationships is viable. Still, we would
like to mention that we do not favor the suggested
handling of one-to-many relationships because
MongoDB's approach, in our opinion, introduces
complexity in searching on the "many" side. We also
believe that MongoDB should consider adding
support for joining two collections because join
operations are used all the time in retrieving data
from databases. After all, we cannot put an entire
relational database into just one collection.
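To make the two concerns above concrete, the following sketch again uses plain Python lists of dictionaries as stand-ins for two collections. It is not MongoDB's query language; the data and the helper names `orders_containing` and `join_orders_to_customers` are hypothetical and only illustrate the work that falls to the application.

```python
# Hypothetical data: two "collections" modeled as lists of dicts.
customers = [
    {"_id": "c1", "name": "Acme"},
    {"_id": "c2", "name": "Zenith"},
]
orders = [
    {"_id": 1, "customer_id": "c1",
     "items": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}]},
    {"_id": 2, "customer_id": "c2",
     "items": [{"sku": "B-200", "qty": 5}]},
    {"_id": 3, "customer_id": "c1", "items": []},
]


def orders_containing(sku, order_docs):
    """Search on the "many" side: each record's embedded array is scanned."""
    return [
        order["_id"]
        for order in order_docs
        if any(item["sku"] == sku for item in order.get("items", []))
    ]


def join_orders_to_customers(order_docs, customer_docs):
    """Application-side equivalent of an inner join on customer_id."""
    by_id = {customer["_id"]: customer for customer in customer_docs}
    return [
        (order["_id"], by_id[order["customer_id"]]["name"])
        for order in order_docs
        if order["customer_id"] in by_id
    ]


print(orders_containing("B-200", orders))
print(join_orders_to_customers(orders, customers))
```

In a relational database the first function would be a simple predicate on a child table and the second a single join; here both are written by hand in application code.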
The benefits that come from MongoDB's small set
of features are obvious. When inserting 50,000
rows, MongoDB is 100 times faster than SQL Server
2008, a popular relational database management
system produced by Microsoft (Kennedy, 2008).
Even considering that, during the test, SQL
Server 2008 was accessed through LINQ to SQL, as
reported in (Kennedy, 2008), a 100-fold difference
is significant.
Retrieving data from MongoDB is also faster
than from SQL Server 2008. The same article
reported that SQL Server 2008 took 28 seconds to
read 50,000 records, while MongoDB took only 10.4
seconds to retrieve the same 50,000 records. For
complex queries, MongoDB completed 100,000
not-so-simple queries in 398 seconds; the same
workload took SQL Server 2008 960 seconds. All
tests described in (Kennedy, 2008) were conducted
on a 64-bit Lenovo T61 with a dual-core 2.8 GHz
processor running Windows 7, and all DBMSs were
64-bit versions. We are in the process of conducting
our own performance tests and expect to provide our
findings in the final version of the paper. After
reading the detailed experiments given in (Kennedy,
2008), we believe SQL Server 2008's performance
could be improved if proper indexes were added.
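As a quick check on the figures quoted above, the short calculation below derives the speed-up ratios and the average time per query. It uses only the timings reported from (Kennedy, 2008); the dictionary layout and function names are our own illustrative choices.

```python
# Timings in seconds, as quoted in the text from (Kennedy, 2008).
reported = {
    "read 50,000 records": {"sql_server_2008": 28.0, "mongodb": 10.4},
    "100,000 complex queries": {"sql_server_2008": 960.0, "mongodb": 398.0},
}


def speedup(slower_seconds, faster_seconds):
    """How many times faster the second timing is than the first."""
    return slower_seconds / faster_seconds


def per_query_ms(total_seconds, query_count):
    """Average time per query in milliseconds."""
    return total_seconds * 1000.0 / query_count


ratios = {
    name: round(speedup(t["sql_server_2008"], t["mongodb"]), 2)
    for name, t in reported.items()
}
print(ratios)
print(round(per_query_ms(398.0, 100_000), 2), "ms per query (MongoDB)")
print(round(per_query_ms(960.0, 100_000), 2), "ms per query (SQL Server 2008)")
```

The read and query ratios come out at roughly 2.7 and 2.4, far smaller than the 100-fold gap reported for inserts.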
Our experience shows that, for projects
following the Agile software development
methodology, schema changes to the transactional
database can be frequent during the development
phase, especially in its early stages. Once a large
number of features have been implemented, the
schema becomes stable. Once it is determined that
the reporting-related features are supported by a
second database, the actual design of the reporting
database can be postponed until the transactional
database is relatively stable. As a result of
separating the databases, the reporting database is
not only designed with a mutual understanding of its
source database, but also built on a more stable
schema.
With proper design and architecture and the
adoption of the Layers of Data Abstraction concept,
the reporting systems see the databases through
external views, which are immune to changes in the
conceptual schema such as added columns, tables,
indexes, and views. It is the conceptual model that
generally needs to reflect schema changes in the
transactional database.
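The immunity of an external view to additive schema changes can be illustrated with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the source database, and the table, view, column, and index names are hypothetical.

```python
import sqlite3


def view_rows_before_and_after_change():
    """Read an external view before and after an additive schema change."""
    conn = sqlite3.connect(":memory:")
    # Conceptual schema: a hypothetical transactional table.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.execute("INSERT INTO orders (customer, total) VALUES ('Acme', 120.0)")
    # External view: the only object the reporting side queries.
    conn.execute(
        "CREATE VIEW report_orders AS SELECT id, customer, total FROM orders"
    )
    before = conn.execute("SELECT * FROM report_orders").fetchall()

    # Additive changes to the conceptual schema: a new column and a new index.
    conn.execute("ALTER TABLE orders ADD COLUMN notes TEXT")
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

    cursor = conn.execute("SELECT * FROM report_orders")
    view_columns = [column[0] for column in cursor.description]
    after = cursor.fetchall()
    conn.close()
    return before, after, view_columns


before, after, view_columns = view_rows_before_and_after_change()
print(before == after, view_columns)
```

Because the view names its columns explicitly, the added column and index do not change what the reporting side sees. A change that renames or removes a column used by the view would still require the view definition to be updated.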
With the increasing popularity of deploying
customer-facing applications on the Cloud, separating
the transactional database from the reporting
database may become necessary for security reasons.
Transactional databases are generally deployed
with the application on the Cloud, which generally
means they are not on the Intranet. For most enterprise