case) each cache frame can keep a DB page which,
in turn, can be mapped to a disk block. In most
DBMSs, propagation of DB pages follows the
update-in-place principle applying WAL (Gray and
Reuter, 1993).
Flash storage is divided into m equal blocks, which are
typically much larger than DB pages. A flash block
normally contains b (32 – 128) fixed-size pages where a
page ranges between 512B and 2KB. Because bits on a
page can only be programmed from 1 to 0, one must erase
(reset) the block to all 1’s before a page can be
written. Thus, a written page cannot be updated
anymore, but only freshly written after the entire
block is erased again. Hence, the block is the unit of
erasure automatically done by the flash device when
allocating an empty block. The page is the smallest
and the block the largest unit of read whereas the
page is the unit of write; using chained IO, the
DBMS, however, can write 1 < i ≤ b pages into a
block at a time. Note that whenever a page is written
in place, the flash device automatically allocates a new
block and moves all pages from the old block to it,
together with the updated page (keeping a cluster
property). This wear leveling (Ban, 2004) is entirely
transparent to the client, i.e., the DBMS, such that
all references to displaced pages, e.g., index pointers
and other links, remain valid.
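
The erase-before-write rule and the transparent relocation described above can be illustrated with a minimal sketch. The classes FlashBlock and FlashDevice, their method names, and the logical-to-physical mapping table are hypothetical constructs chosen for illustration; the text does not specify how a real device implements relocation.

```python
class FlashBlock:
    """One erase unit with b pages; None marks an erased (writable) page."""

    def __init__(self, pages_per_block):
        self.pages = [None] * pages_per_block
        self.erase_count = 0

    def erase(self):
        # Erasure always affects the whole block.
        self.pages = [None] * len(self.pages)
        self.erase_count += 1

    def program(self, slot, data):
        # A written page cannot be updated until the block is erased.
        if self.pages[slot] is not None:
            raise ValueError("page already written; erase the block first")
        self.pages[slot] = data


class FlashDevice:
    """Hypothetical device that hides relocation behind logical block numbers."""

    def __init__(self, num_blocks, pages_per_block, spares=1):
        self.blocks = [FlashBlock(pages_per_block) for _ in range(num_blocks)]
        # Assumed mapping table: logical block number -> physical block number.
        self.mapping = {lb: lb for lb in range(num_blocks - spares)}
        self.free = list(range(num_blocks - spares, num_blocks))

    def read(self, logical_block, slot):
        return self.blocks[self.mapping[logical_block]].pages[slot]

    def write(self, logical_block, slot, data):
        old_pb = self.mapping[logical_block]
        old = self.blocks[old_pb]
        if old.pages[slot] is None:
            old.program(slot, data)  # erased page: write directly
            return
        # In-place update: erase a newly allocated block, then move all
        # pages of the old block to it together with the updated page.
        new_pb = self.free.pop(0)
        new = self.blocks[new_pb]
        new.erase()
        for i, page in enumerate(old.pages):
            value = data if i == slot else page
            if value is not None:
                new.program(i, value)
        # The client keeps using the same logical address.
        self.mapping[logical_block] = new_pb
        self.free.append(old_pb)
```

In this sketch, a client that overwrites page 1 of logical block 0 still reads the new value from logical block 0 afterwards, although the data now resides in a different physical block.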
Another concern, called write endurance and often
cited in the literature, is the limited number of
erase cycles, between 100,000 (older references) and
5,000,000 (most recent references). When a block
reaches this erase cycle limit, it can no longer be
used and has to be marked as corrupted. Hence,
management of flash relies on a pool of spare
blocks; due to the application of wear leveling,
overly frequent overwriting of the same block is
avoided.
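
How a spare pool and an erase-cycle limit might interact can be sketched as follows. The least-worn-first allocation policy and the names SparePool, allocate, and release are assumptions made for illustration; the text only states that spare blocks exist and that wear leveling avoids overly frequent overwriting of the same block.

```python
ERASE_LIMIT = 100_000  # lower bound of the range cited in the text


class SparePool:
    """Hypothetical pool of free blocks with per-block erase counters."""

    def __init__(self, num_blocks, erase_limit=ERASE_LIMIT):
        self.erase_limit = erase_limit
        self.erase_counts = {blk: 0 for blk in range(num_blocks)}
        self.free = set(range(num_blocks))
        self.bad = set()  # blocks marked as corrupted

    def allocate(self):
        """Erase and hand out the least-worn free block (assumed policy)."""
        if not self.free:
            raise RuntimeError("no spare blocks left")
        blk = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(blk)
        self.erase_counts[blk] += 1  # erasure happens on allocation
        return blk

    def release(self, blk):
        """Return a block to the pool, or retire it at the erase limit."""
        if self.erase_counts[blk] >= self.erase_limit:
            self.bad.add(blk)
        else:
            self.free.add(blk)
```

Choosing the block with the smallest erase counter spreads erasures evenly over the pool, so that no single block reaches the limit long before the others.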
2.2 Flash Potential
To gain a deeper and more complete comparative
picture of disk and flash, we outline the differences
and advantages of both device types with respect to
IO, power, size, and price, and indicate where drastic
improvements in processing and cost can be
anticipated. Here, we can only summarize the
evaluation of others (Gray and Fitzgerald, 2007;
Nath and Kansal, 2007) in a coarse way and give
indicative numbers or orders of magnitude of gains
or degradations. Of course, this discussion assumes
that DBMS algorithms provide adjusted mappings to
take full advantage of the flash potential. For the
performance figures, we assume what technology
currently provides for fast disks (e.g., SCSI 15k
rpm) and flash (SAMSUNG, 2008).
IO performance: We distinguish different forms
of IO processing: Sequential IO continuously
reads/writes blocks from/to the device, whereas
random IO can be directly performed to/from any
given block address.
• Sequential reads and writes on flash having a
bandwidth of ~90 MBps are comparable to
those on fast disks.
• Random reads on flash are spectacularly faster
by a factor of 10–15 (2800 IOps compared to
<200 IOps).
• Random writes, requiring block erasure first,
perform worst with ~27 IOps and are slower
by a factor of 4–8 compared to disks.
Hence, dramatic bandwidth gains are obtained
for random reads while random writes are
problematic and have to be algorithmically
addressed at the DBMS side.
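
The factors quoted in the list above follow directly from the IOps figures. A short check, assuming 200 IOps as the disk value (the text only gives "<200", so this is an upper bound and the variable names are illustrative):

```python
# Random-IO figures from the text; the disk value is an assumed upper bound.
flash_random_read_iops = 2800
flash_random_write_iops = 27
disk_random_iops = 200  # text states "<200 IOps"

# Flash random reads compared to disk (text: factor of 10-15).
read_speedup = flash_random_read_iops / disk_random_iops

# Disk compared to flash random writes (text: factor of 4-8).
write_slowdown = disk_random_iops / flash_random_write_iops

print(f"random read speedup on flash:   {read_speedup:.1f}x")
print(f"random write slowdown on flash: {write_slowdown:.1f}x")
```

Because the disk figure is an upper bound, the actual read speedup would be somewhat higher and the write slowdown somewhat lower than the values computed here.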
Energy consumption: The power needed to drive
a flash read/write is 0.9 Watt, which is lower than
for a disk by a factor of >15. Using the figures for
IOps, we can compute IOps/Watt as another
indicator for energy-saving potential: 3,100
flash-reads and 30 flash-writes can be achieved per
Watt, whereas a disk only reaches 13 operations per
Watt.
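
The IOps/Watt values can be reproduced from the figures given so far. In the sketch below, the disk power is not taken from the text; it is back-computed from the stated 13 operations per Watt under the assumption of 200 IOps, and serves only as a consistency check against the ">15" factor.

```python
flash_watts = 0.9
flash_read_iops = 2800
flash_write_iops = 27

reads_per_watt = flash_read_iops / flash_watts    # text rounds to 3,100
writes_per_watt = flash_write_iops / flash_watts  # text: 30

# Assumption: 200 IOps for the disk (the text says "<200").
disk_iops = 200
disk_ops_per_watt = 13
implied_disk_watts = disk_iops / disk_ops_per_watt  # derived, not from the text

power_factor = implied_disk_watts / flash_watts

print(f"flash reads per Watt:  {reads_per_watt:.0f}")
print(f"flash writes per Watt: {writes_per_watt:.0f}")
print(f"implied disk power:    {implied_disk_watts:.1f} W")
print(f"disk/flash power:      {power_factor:.1f}x")
```

Under these assumptions the implied disk power is consistent with the statement that flash needs less power by a factor of more than 15.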
Unit size: Starting in 1996, NAND flash chips
have doubled their densities each year and currently
provide 64 Gbit. According to Hwang (2006), this
growth will continue or accelerate such that 256
GByte per chip will be available in 2012. Because
several of these chips can be packaged as a “disk”, the
DB community should be prepared for flash drives
with terabyte capacity in the near future. Hwang
(2006) also expects the advent of 20 TByte flash
devices in 2015. Of course, disks will also reach a
comparable capacity, but flash drives will provide
further important properties, as outlined in Section 1.
Price per unit: Today, flash is quite expensive. A
GByte of flash memory costs 20$, but technology
forecasts predict a dramatic decrease to only
2$/GByte in the near future. Therefore, Gray and
Fitzgerald (2007) expect that disk and flash of
comparable capacity will have roughly the same
price (e.g., ~500$ for an SCSI and ~400$ for a flash
drive). This assumption allows us to compute IOps/$
as an additional measure of comparison. While
flash-read gets 7.0 IOps/$, flash-write gets a poor 0.07
and an SCSI operation 0.5 IOps/$. Hence, using
ICEIS 2008 - International Conference on Enterprise Information Systems