proposed method is significant. The proposed method
executes the benchmark 2.39x to 4.60x faster than the
SSD. The difference between the proposed method and
the SSD is largest when the O_DIRECT InnoDB flush
method is combined with the KVM directsync cache
mode. Since this combination provides no buffering of
data transfers in either the OS kernel or the KVM
virtualization software, the cost of writing data to
storage is the highest among the combinations used in
the experiments. The other combinations provide
buffering somewhere in the OS kernel or the KVM
virtualization software; thus, the differences are
smaller but still large, ranging from 2.39x to 2.62x.
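To illustrate why this combination is the most expensive, the following user-space C sketch issues an unbuffered, synchronous write in the style of O_DIRECT; the file path and sizes are hypothetical, and under the directsync cache mode the KVM layer likewise bypasses the host page cache, so every such write pays the full device latency.

```c
/* Minimal sketch: an unbuffered, synchronous write as issued under
 * O_DIRECT.  The path and sizes are illustrative only. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t align = 4096;        /* typical logical block size */
    const size_t len   = 16 * 4096;   /* hypothetical write size    */
    void *buf;

    /* O_DIRECT requires the buffer to be aligned to the block size. */
    if (posix_memalign(&buf, align, len) != 0)
        return 1;
    memset(buf, 0xab, len);

    /* O_DIRECT bypasses the page cache; O_SYNC makes the call return
     * only after the device has accepted the data. */
    int fd = open("/data/ibdata1", O_WRONLY | O_DIRECT | O_SYNC);
    if (fd < 0)
        return 1;

    ssize_t n = pwrite(fd, buf, len, 0);

    close(fd);
    free(buf);
    return n == (ssize_t)len ? 0 : 1;
}
```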
5 RELATED WORK
Techniques that combine one block storage device with
another for higher access performance existed before
SSDs became widely available and popular. DCD (Hu and
Yang, 1996) first stores data in cache storage so that
writes can be performed as sequential accesses, whose
performance is typically much better than that of
random accesses, thereby improving write performance.
The emergence of SSDs stimulated the research and
development of various caching techniques (Kgil and
Mudge, 2006; Koller et al., 2013; Saxena et al., 2012;
Facebook, 2014) that exploit their high performance.
Because SSDs are block storage, all of these techniques
combine block storage with other block storage and
provide the block storage interface. The proposed
method differs from them in that it combines memory
storage with block storage. Because memory storage
allows synchronous access, the proposed method
aggressively makes use of it in order to reduce the
total access cost.
The Linux kernel provides the device mapper as a
software layer that combines multiple storage devices
and constitutes a single storage device. When the
device mapper is used to combine memory storage with
block storage, the memory storage must emulate block
storage because the device mapper interacts only with
the block storage interface. This also incurs
significant software overhead because an access request
may be processed by the generic block device driver
framework multiple times (Ueda et al., 2007). The
proposed method does not use the device mapper
mechanism, thereby avoiding such overheads, and instead
implements its own function that provides a direct and
synchronous access interface to memory storage.
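The following C sketch illustrates this design under stated assumptions: every name and structure in it (handle_read, try_sync_read, forward_to_backing_device, and the toy in-memory cache) is a hypothetical placeholder rather than the actual driver code. A read that hits memory storage is served synchronously with a simple copy and completed immediately; only a miss is forwarded to the backing block device.

```c
/* Minimal sketch of a synchronous cache-hit path in front of an
 * asynchronous block device.  All names and structures here are
 * hypothetical and only illustrate the idea, not the actual driver. */
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

#define SECTOR_SIZE   512
#define CACHE_SECTORS 1024                 /* toy memory-storage size */

static unsigned char mem_storage[CACHE_SECTORS][SECTOR_SIZE];
static bool          cached[CACHE_SECTORS];

struct io_request {
    unsigned long sector;                  /* starting sector        */
    unsigned char buf[SECTOR_SIZE];        /* one-sector read buffer */
    void (*complete)(struct io_request *req, int error);
};

/* Hit path: copy from memory storage and complete synchronously,
 * without going through a request queue or a device mapper target. */
static bool try_sync_read(struct io_request *req)
{
    if (req->sector >= CACHE_SECTORS || !cached[req->sector])
        return false;
    memcpy(req->buf, mem_storage[req->sector], SECTOR_SIZE);
    req->complete(req, 0);
    return true;
}

/* Miss path: in a real driver this would submit the request to the
 * backing block device, which completes it asynchronously. */
static void forward_to_backing_device(struct io_request *req)
{
    printf("sector %lu: miss, forwarded to block storage\n", req->sector);
}

static void handle_read(struct io_request *req)
{
    if (!try_sync_read(req))
        forward_to_backing_device(req);
}

static void done(struct io_request *req, int error)
{
    printf("sector %lu: served synchronously from memory storage (%d)\n",
           req->sector, error);
}

int main(void)
{
    cached[7] = true;                      /* pretend sector 7 is cached */
    memset(mem_storage[7], 0xab, SECTOR_SIZE);

    struct io_request hit  = { .sector = 7,  .complete = done };
    struct io_request miss = { .sector = 42, .complete = done };
    handle_read(&hit);
    handle_read(&miss);
    return 0;
}
```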
6 SUMMARY
Memory storage technologies are emerging, and they
should be effectively utilized in cloud computing
environments in order to accelerate storage performance
for big data processing. This paper proposed a method
that combines block storage with memory storage and
uses memory storage as a cache of block storage in
order to overcome the limited capacity of memory
storage. The proposed method effectively utilizes the
high performance of memory storage while also providing
the large capacity of block storage. Therefore, memory
storage can be transparently used as part of mass
storage, and its low-overhead access can accelerate
storage performance. The proposed method was
implemented as a device driver of the Linux kernel. Its
performance evaluation shows that it outperforms a bare
SSD drive and provides better performance in Hadoop and
database environments.
REFERENCES
Facebook (2014). Flashcache. https://github.com/facebook/flashcache.
Hu, Y. and Yang, Q. (1996). DCD – disk caching disk: A
new approach for boosting I/O performance. In Pro-
ceedings of the 23rd Annual International Symposium
on Computer Architecture, pages 169–178.
Kgil, T. and Mudge, T. (2006). FlashCache: A NAND flash
memory file cache for low power web servers. In
Proceedings of the 2006 International Conference on
Compilers, Architecture and Synthesis for Embedded
Systems, CASES ’06, pages 103–112, New York, NY,
USA. ACM.
Koller, R., Marmol, L., Rangaswami, R., Sundararaman,
S., Talagala, N., and Zhao, M. (2013). Write poli-
cies for host-side flash caches. In Proceedings of the
11th USENIX Conference on File and Storage Tech-
nologies, FAST’13, pages 45–58, Berkeley, CA, USA.
USENIX Association.
Saxena, M., Swift, M. M., and Zhang, Y. (2012). Flashtier:
A lightweight, consistent and durable storage cache.
In Proceedings of the 7th ACM European Conference
on Computer Systems, EuroSys ’12, pages 267–280,
New York, NY, USA. ACM.
Shvachko, K., Kuang, H., Radia, S., and Chansler, R.
(2010). The Hadoop distributed file system. In Pro-
ceedings of the 2010 IEEE 26th Symposium on Mass
Storage Systems and Technologies (MSST), MSST
’10, pages 1–10, Washington, DC, USA. IEEE Com-
puter Society.
Ueda, K., Nomura, J., and Christie, M. (2007). Request-
based device-mapper multipath and dynamic load bal-
ancing. In Proceedings of the Linux Symposium, vol-
ume 2, pages 235–243.