Performance Analysis of a Hyperconverged Infrastructure
using Docker Containers and GlusterFS
Rodrigo Leite, Priscila Solis and Eduardo Alchieri
Applied Computing Masters Program, Department of Computer Science, University of Brasilia (UnB), Brasília, Brazil
Keywords:
Hyperconverged Infrastructures, GlusterFS, Containers, DFS, Cassandra, MongoDB, Docker, Cloud, Storage.
Abstract:
The adoption of hyperconverged infrastructures is a trend in datacenters because it merges different types of computing and storage resources. Hyperconverged infrastructures use distributed file systems (DFS) to store and replicate data between multiple servers while using the computing resources of the same servers to host virtual machines or containers. In this work, the distributed file system GlusterFS and the hypervisor VMware ESXi are used to build a hyperconverged system to host Docker containers, with the goal of evaluating the storage performance of this system compared to the traditional approach where data is stored directly on the server's disks. The performance of the containers' persistent storage is evaluated using the benchmark tool Yahoo Cloud Serving Benchmark (YCSB) against the NoSQL databases Cassandra and MongoDB under different workloads. The NoSQL databases' performance was compared between the hyperconverged system with multiple disk configurations and a traditional system with local storage.
1 INTRODUCTION
In recent years, the amount of data generated by different devices and users has been growing constantly and exponentially. This poses a challenge to traditional storage solutions, which need to cope with massive data storage and processing. To overcome these limitations, several large-scale Distributed File Systems (DFS) have been developed. Some benefits of DFS are scalability, parallelism, the ability to run on commodity hardware and fault-tolerance. These systems assemble many nodes across a network and provide a distributed file system with large storage capacity, where data can be transparently accessed (Roch et al., 2018). The ability to use commodity hardware makes scalability economically viable, allowing the addition of more devices to scale up the system in an incremental fashion. Because the file system runs on several commodity hardware components, which are highly prone to failure, techniques such as replication and erasure codes are used to avoid data loss and increase fault-tolerance in case of hardware failures.
Traditionally, computing and storage resources in datacenters are separated into different pods. The computing pod usually uses virtualization technologies to optimize resources, while the storage pod is usually based on monolithic hardware. Hyperconvergence is an emerging paradigm that has been gaining popularity both in academia and industry (Verma et al., 2017). This paradigm merges computing and storage components by combining hypervisors and DFS solutions. While traditional storage systems use dedicated hardware for storage and compute, a hyperconverged system uses a DFS and a hypervisor to aggregate storage capacity and computing resources, providing a single pod of compute and storage.
In a cloud computing environment, containers can benefit from a hyperconverged infrastructure. The traditional container deployment uses local storage on the node to start and run. Thus, the ability to store a container's persistent data on a hyperconverged system may improve scalability, elasticity and fault-tolerance in these environments.
Considering the above, the motivation of this work is to evaluate the storage performance of a proposed hyperconverged system, based on GlusterFS as the DFS and VMware ESXi as the hypervisor. The goal is to evaluate the performance and potential of container storage I/O on a hyperconverged system by running, inside Docker containers, two applications highly dependent on storage capabilities, the NoSQL databases Cassandra and MongoDB (Abramova and Bernardino, 2013), under several workloads of the Yahoo Cloud Serving Benchmark (YCSB) (Cooper
et al., 2010). The results of the proposed hyperconverged system are compared with a traditional configuration using local storage, with the intention of verifying the viability of its use in place of direct-attached storage.
This paper is organized as follows: Section 2 presents related work and the literature review; Section 3 presents and describes the proposed architecture; Section 4 details the experiments and results; and finally, Section 5 presents the conclusions and future work of this research.
2 THEORETICAL CONCEPTS
AND RELATED WORK
Permanent storage consists of a named set of objects that are explicitly created, are immune to temporary failures of the system, and persist until explicitly destroyed. A file system is a refinement that defines a naming structure and a set of operations on objects described by explicit characteristics. A DFS allows multiple users who are physically dispersed in a network of autonomous computers to share a common file system. Cloud storage is a system that provides data storage by assembling a large number of different types of storage devices through application software based on cluster applications, grid techniques and DFS; in essence, it is a cloud computing system with large storage capacity. A hyperconverged environment combines compute, storage and networking components into a single system or node based on commodity servers, with the aim of reducing hardware costs when compared to specialized storage arrays, servers and other equipment, which makes it an interesting option for cloud providers today.
In the DFS area, recent studies evaluated the
common problem of how to store and handle large
amounts of data. For example, in (Roch et al., 2018)
there is a comparison between DFS for storing molec-
ular calculations of quantum chemistry. In this study
the DFS GlusterFS and Ceph were tested, using 8
nodes distributed in 8 different combinations. The ex-
periments were performed with 7 different patterns of
reading and writing. The conclusion was that Glus-
terFS outperforms Ceph during large I/O workloads
and that GlusterFS administration is simplified, re-
quiring less maintenance and installation time. The
conclusions about Ceph were that it is more complex and needs separate servers for metadata (data about other data), which further complicates the solution.
Another DFS study in a similar area investi-
gates the problem of how to store and transfer large
amounts of genetic data between two institutions
(Mills et al., 2018). This study shows that the best
way to transfer data between two research institu-
tions is to use a combination of a high-speed com-
munications network and a high-performance file sys-
tem. Finding a balance between these two items en-
ables data to be read from the source at high speed,
transmitted quickly, and then recorded at high speed
at the destination. The DFS analyzed in this study
were BeeGFS, Ceph, GlusterFS and OrangeFS. Ex-
periments were performed with 4 to 8 servers at the
source transmitting data via FTP to another 4 to 8
servers at the destination, with BeeGFS achieving the best performance.
In the work of (Kaneko et al., 2016) a guideline for data allocation in DFS is proposed, recommending that data stored across multiple nodes with few disks each performs better than data stored on a few nodes with many disks. In the experiments, volumes built from disks on different servers were compared with volumes built from several disks on the same server. The comparison was made by analyzing the read and write throughput while copying files to and from the volumes. The results verified that the proposed guideline obtained better performance at larger multiplicities. Multiplicity is defined by the authors as the number of clients simultaneously accessing a number of volumes; for example, multiplicity 24 means 4 clients simultaneously accessing 6 volumes. The analysis concluded that distributing a volume across several servers improves data throughput because of the parallelism in data access. This guideline can be considered a good practice for building DFS. When a volume is built with few servers, the advantages of a parallel file system are lost and server resources (processor, disk controller, network card, etc.) can limit performance. Spreading a volume across multiple servers allows it to be accessed through different servers, distributing resource consumption among them and improving data throughput.
Studies on the performance of hyperconverged environments are still scarce. The fundamental concept behind this architecture is based on Beowulf systems, which have already been vastly studied by academia. In a Beowulf system, a cluster of commodity-grade computers is networked to run software that performs parallel computing and storage. A hyperconverged system is a recent evolution of this concept, merging virtualized servers and virtualized storage onto the same cluster with the goal of simplifying the infrastructure.
In one of the first studies specifically about hy-
perconvergence, the General Parallel File System
(GPFS) is presented as a software-defined storage solution for hyperconverged systems (Azagury et al., 2014). In this work, several concepts of hyperconvergence and software-defined storage are outlined, and the proposed solution uses GPFS within a hyperconverged architecture through virtual storage appliances (VSA) that manage the physical storage resources. The VSAs allow capacity to be increased in a scale-out model and provide data protection through "GPFS Native RAID" (GNR), which makes copies of the data on several nodes.
Recent studies on container storage performance analyze the possible configurations for storage access (Tarasov et al., 2017) (Xu et al., 2017a), how to provide encryption in image manipulation (Giannakopoulos et al., 2017) and performance using solid state disks (SSD) (Xu et al., 2017b) (Bhimani et al., 2016). In (Tarasov et al., 2017) and (Xu et al., 2017a), the mechanisms through which containers access storage are described and analyzed, but both are limited to local storage, as are (Giannakopoulos et al., 2017) and (Xu et al., 2017a). In (Xu et al., 2017b), the authors analyze the use of remote storage for containers, using non-volatile memory express over fabrics (NVMf) accessed through remote direct memory access (RDMA) over a 40 Gbps Ethernet network. This study is the closest to our proposal, since in our case the containers' data is stored in a DFS.
The amount of data generated by IoT devices is addressed in another study, in which the storage paradigms for storing and processing data from IoT devices are analyzed. The conclusion points to the hyperconverged architecture as one solution because of its scalability, fault-tolerance and use of COTS hardware (Verma et al., 2017).
3 PROPOSED ARCHITECTURE
The proposed architecture is presented in Figure 1 and
works as follows:
The server runs a Type I hypervisor with at least two disk controllers: one for the hypervisor, the VSA and the container host (1); and one for the hyperconverged system (2).
The disks connected to the disk controller managed by the hypervisor, via path (3), store the hypervisor itself as well as the VSA and the Container Host virtual machines, as shown on path (5).
Using the "PCI Passthrough" feature of the hypervisor, the second disk controller (2) is directly managed by the VSA virtual machine via path (4).
Each VSA, with its own storage resources, is configured to participate in an existing cluster or to start a new one. The nodes provide their individual resources (6) to create a distributed virtual volume across the nodes of the cluster, as presented in Figure 2. Data replication between VSA nodes is performed over a low-latency, high-bandwidth Ethernet network.
After the virtual volume is created, it can be accessed by the container engine as a scalable and fault-tolerant data storage unit, as shown on path (7).
Figure 1: Hyperconverged Server Architecture.
The use of a disk controller dedicated to the VSA is a desirable characteristic because it allows direct control of the disks, removing an abstraction layer from the hypervisor. Another desirable requirement is a high-bandwidth, low-latency network between the servers, to avoid congestion or any kind of degradation during storage operations. The use of disks of the same model to build a volume is another important aspect for obtaining better performance (Kaneko et al., 2016).
Figure 2: Virtual volume built by the VSAs.
3.1 Implementation
In the next subsections, we describe the tools that were integrated to implement the proposed architecture.
3.1.1 GlusterFS
GlusterFS is an open-source DFS that allows a large set of data to be distributed across multiple servers. It is simple to operate and is supported by most Linux distributions. It does not separate metadata from data and therefore does not need additional servers for this task (Fattahi and Azmi, 2017). It has a native redundancy mechanism based on the replication of files between a set of "bricks" (a basic unit of storage) that form a resilient logical unit (Roch et al., 2018). It allows the use of heterogeneous nodes, enabling nodes with different configurations on the same distributed volume. In these cases, it is recommended to group nodes with the same characteristics for better performance (Kaneko et al., 2016).
Bricks are the basic storage unit in GlusterFS, represented by an export directory on a server. A collection of bricks forms a "Volume", on which GlusterFS operations take place. GlusterFS allows bricks to be grouped to form five different types of volumes:
Distributed Volume: Distributes files between bricks. The final volume capacity is the sum of the bricks' capacities. It has no redundancy: the failure of a brick corrupts the volume. Recommended for cases where availability is not important or is provided by another mechanism.
Replicated Volume: Files are replicated between the volume's bricks. Recommended for cases where high availability and reliability are required.
Distributed Replicated Volume: Distributes files between replicated subvolumes. Recommended for cases where scalability, high availability and good read performance are required.
Dispersed Volume: Based on erasure codes, it provides efficient protection against disk or server failures. A coded fragment of the original file is stored in each brick, so that the original file can be retrieved from a subset of the fragments.
Distributed Dispersed Volume: Distributes files between dispersed subvolumes. It has the advantages of a distributed volume, but uses dispersed storage within each subvolume.
3.1.2 ESXi
VMware ESXi is a hypervisor that runs directly on the host hardware, i.e., a Type I (bare-metal) virtual machine monitor (VMM). It takes minimal mem-
ory from the physical server and provides a robust,
high-performance virtualization layer that abstracts
the server’s hardware resources and enables the shar-
ing of these resources across multiple virtual servers.
3.1.3 Docker
Docker is an application that performs operating-system-level virtualization, also known as containerization. It uses the resource isolation features of the Linux kernel to allow independent containers to run within a single host, avoiding the overhead of starting and maintaining virtual machines (VMs) (Xu et al., 2017a). Containers and virtual machines have similar resource isolation and allocation benefits but different architectural approaches, which allows containers to be more portable and efficient than virtual machines (Bhimani et al., 2016).
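As a concrete illustration of this deployment model, the sketch below starts the two database containers used in the evaluation with their data directories bind-mounted on a path where the GlusterFS volume is assumed to be mounted on the container host. The mount point /mnt/hci and the container names are hypothetical; the image tags correspond to the database versions listed in Section 4.1.

```python
import subprocess

# /mnt/hci is assumed to be the GlusterFS virtual volume mounted on the
# container host (see the mount sketch in Section 4.1).
CONTAINERS = [
    # Cassandra keeps its data under /var/lib/cassandra inside the container.
    ("cassandra", "cassandra:3.11.3", "/mnt/hci/cassandra:/var/lib/cassandra"),
    # MongoDB keeps its data under /data/db inside the container.
    ("mongodb", "mongo:4.0.3", "/mnt/hci/mongodb:/data/db"),
]

for name, image, volume in CONTAINERS:
    subprocess.run(
        ["docker", "run", "-d", "--name", name, "-v", volume, image],
        check=True,
    )
```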
3.1.4 Virtual Storage Appliance
The core component of our proposed architecture is the Virtual Storage Appliance (VSA), a virtual machine that runs a minimal version of the Linux operating system and the GlusterFS software. GlusterFS is configured to manage the physical storage resources presented to the VSA and to use them to build a DFS. GlusterFS was chosen as the DFS because data and metadata are stored together in the nodes, without a specific node to handle metadata (Fattahi and Azmi, 2017). This feature makes GlusterFS well suited for a hyperconverged architecture, since the nodes are independent and can run without external dependencies, simplifying the architecture and allowing the deployment of just one VSA per hypervisor and nothing else. The VSA nodes, each with its own storage resources, allow the creation of DFS volumes that span many servers and can be used by the hypervisor and the container engine to store data with scalability and fault-tolerance.
4 EXPERIMENTAL ANALYSIS
4.1 Testbed
In our experiments we built a hyperconverged system using three IBM HS23 servers, each with two Intel Xeon E5-2670 2.60 GHz CPUs, 96 GB of RAM, one 16 GB SSD (to host ESXi, the VSA and the Docker Host), two 1 TB HDDs (for the hyperconverged system), hypervisor ESXi version 6.5 and network connectivity via two gigabit Ethernet ports. The VSA on each hypervisor has 4 vCPUs and 8 GB of RAM, runs CentOS Linux release 7.5.1804 and GlusterFS version 3.12.15, and uses XFS as the file system for the disks managed by Gluster. The Container Host has 12 vCPUs and 80 GB of RAM and runs CentOS Linux release 7.5.1804 and Docker 18.09. The connection between the Docker Host and the virtual volume built by the VSAs is made through the Gluster Native Client. To evaluate the storage I/O performance of the hyperconverged system we chose the NoSQL databases Cassandra (version 3.11.3) and MongoDB (version 4.0.3), running in Docker containers and storing persistent data on the hyperconverged virtual volume.
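The sketch below illustrates how such a mount could be performed with the Gluster Native Client (FUSE). The VSA host names, volume name and mount point are hypothetical and not taken from the testbed above; the backup-volfile-servers option simply lets the client fall back to the other VSAs when fetching the volume description.

```python
import subprocess

# Hypothetical names: vsa1..vsa3 are the VSA nodes, "hcivol" the GlusterFS
# volume and /mnt/hci the mount point used by the Docker Host.
subprocess.run(
    ["mount", "-t", "glusterfs",
     "-o", "backup-volfile-servers=vsa2:vsa3",
     "vsa1:/hcivol", "/mnt/hci"],
    check=True,
)
```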
To compare results with the hyperconverged system, we built a traditional system where storage resources are provided by the local disks of the server. For this system we used another IBM HS23 server with the same configuration as the hyperconverged servers, but the persistent storage for the containerized NoSQL databases was provided by the hypervisor using the local server's disks.
Since the objective is to analyze storage performance, the memory cache feature of both NoSQL databases was disabled, meaning that all database operations had to hit the storage volume and, consequently, the underlying disks.
4.2 Workload
To evaluate our proposal we used the Yahoo Cloud Serving Benchmark (YCSB) framework (Cooper et al., 2010), a tool built to load, run and measure the performance of different NoSQL databases. After loading data into the desired database, six core workloads can be run to evaluate the database's performance. These core workloads are named A, B, C, D, E and F, each with a different I/O pattern. Besides those, YCSB also allows custom workloads, which we used to test three additional workloads: two from Microsoft's datacenters (Huang et al., 2017) and one developed by us, called "Heavy Update", with a heavy write load. The nine workloads used to evaluate our proposal are summarized in Table 1.
Table 1: Application workloads used for evaluation.
Workload                | I/O Pattern
YCSB-A                  | 50% read, 50% update
YCSB-B                  | 95% read, 5% update
YCSB-C                  | 100% read
YCSB-D                  | 95% read, 5% insert
YCSB-E                  | 95% scan, 5% insert
YCSB-F                  | 50% read, 50% read-modify-write
Microsoft Cloud Storage | 26.2% read, 73.8% update
Microsoft Web Search    | 83% read, 17% update
Heavy Update            | 100% update
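For reference, the sketch below shows how the custom "Heavy Update" workload could be expressed as a YCSB properties file and launched from the command line. The property names follow YCSB's standard CoreWorkload options for the releases contemporary with this work; the record and operation counts match Section 4.4, while the request distribution, host address and file paths are assumptions rather than values reported here.

```python
# Hypothetical properties file for the custom "Heavy Update" workload
# (100% updates). YCSB's field defaults (10 fields x 100 bytes) give ~1 KB records.
HEAVY_UPDATE = """\
workload=com.yahoo.ycsb.workloads.CoreWorkload
recordcount=1000000
operationcount=100000
readproportion=0
updateproportion=1.0
scanproportion=0
insertproportion=0
requestdistribution=zipfian
"""

with open("workloads/heavy_update", "w") as f:
    f.write(HEAVY_UPDATE)

# Typical invocations (Cassandra binding shown; the MongoDB binding is analogous):
#   bin/ycsb load cassandra-cql -P workloads/heavy_update -p hosts=127.0.0.1
#   bin/ycsb run  cassandra-cql -P workloads/heavy_update -p hosts=127.0.0.1 -threads 10
```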
4.3 Metrics, Validation and Factors
The metric evaluated during the execution of the experiments is operations per second (ops/sec), i.e., how many read, update or insert operations can be submitted to the NoSQL database every second.
The results are validated by comparing the results obtained on the proposed hyperconverged system with those obtained on a traditional system where data is stored on the local server's disks. For our experiments, the factors we chose to analyze were: the type of GlusterFS volume, the type of NoSQL database and the number of threads.
4.4 Evaluation Scenarios
In the hyperconverged system built for this study, GlusterFS can be configured to create five types of volumes: Distributed, Replicated, Distributed Replicated, Dispersed and Distributed Dispersed. Since the Distributed volume does not offer fault-tolerance and the Replicated volume does not scale, we did not use these two disk configurations. The other three volume types offer equal fault-tolerance protection, but different performance characteristics, so we tested all three of them with all workloads.
To evaluate the hyperconverged storage I/O performance we used the NoSQL databases Cassandra and MongoDB, each one submitted to the same workloads and storing persistent data on the same types of volumes. These two were chosen because databases are applications that depend heavily on the performance of the storage where data is stored and retrieved, and because Cassandra and MongoDB are popular in recent studies in both academia and industry.
Both NoSQL databases were loaded with 1,000,000 records of 1 KB using the YCSB tool. Then the databases were submitted to the workloads with different numbers of threads (1 thread, and from 10 to 100 threads in steps of 10). In each thread test, 100,000 transactions were made. The mean number of transactions performed per second was calculated with a 95% confidence interval.
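A minimal sketch of this measurement loop is shown below. It assumes a YCSB installation under ./ycsb, a Cassandra instance reachable on 127.0.0.1 and an arbitrary number of repetitions per point; none of these details, nor the exact parsing of YCSB's output, are taken from the evaluation described above.

```python
import re
import statistics
import subprocess

# Thread counts as described above: 1, then 10 to 100 in steps of 10.
THREADS = [1] + list(range(10, 101, 10))
REPETITIONS = 5  # assumed number of repeated runs per point

def run_ycsb(threads: int, workload: str = "workloads/workloada") -> float:
    """Run one YCSB workload and return the reported throughput in ops/sec."""
    out = subprocess.run(
        ["./ycsb/bin/ycsb", "run", "cassandra-cql",
         "-P", workload, "-p", "hosts=127.0.0.1",
         "-p", "operationcount=100000", "-threads", str(threads)],
        capture_output=True, text=True, check=True,
    ).stdout
    # YCSB prints a summary line such as "[OVERALL], Throughput(ops/sec), 1234.5"
    return float(re.search(r"Throughput\(ops/sec\), ([\d.]+)", out).group(1))

for t in THREADS:
    samples = [run_ycsb(t) for _ in range(REPETITIONS)]
    mean = statistics.mean(samples)
    # Half-width of a 95% confidence interval (normal approximation).
    half_width = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
    print(f"{t:3d} threads: {mean:.1f} ± {half_width:.1f} ops/sec")
```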
4.5 Experiments
Experiment 1 - YCSB Core Workloads. This experiment was conducted using the six core workloads available in the YCSB framework against the containerized Cassandra and MongoDB instances. The results can be seen in Figure 3.
Figure 3: YCSB Workloads Results.
Experiment 2 - Microsoft Workloads. This experiment was conducted using the two workloads from Microsoft's datacenters. The results can be seen in Figure 4.
Figure 4: Microsoft Workloads Results.
Experiment 3 - Heavy Update Workload. To analyze the performance of write I/O storage operations, we ran this experiment with only update operations on the databases. The results can be seen in Figure 5.
Figure 5: Heavy Update Workload Results.
4.6 Analysis of Results
The experimental results shown in Figures 3, 4 and 5 indicate that MongoDB performed better than Cassandra in all experiments. MongoDB running on the hyperconverged system presented performance equivalent to that of the traditional system, independent
of the number of threads. In some specific cases we achieved 7% better performance on the hyperconverged system than on local storage. We consider this a good result because, even with the abstractions in the storage layer imposed by the hyperconverged system, the performance was similar to that of the traditional system, where data is saved on direct-attached storage.
Cassandra's performance on the hyperconverged system was below expectations on almost all workloads. The only workload where our proposal performed better than direct-attached storage was the Heavy Update workload above 70 concurrent threads, where the hyperconverged system showed 56% better performance than local storage.
The difference in performance between Cassandra and MongoDB can be explained by the way these NoSQL databases store their data in the file system. MongoDB stores data in large files, while Cassandra stores it in several small files. The performance difference between storing large and small files is already known in the literature and is due to the overhead imposed by the DFS (Roch et al., 2018).
We also note that the performance of Cassandra changes with the disk configuration used in GlusterFS. Significant differences in throughput for the same workload and number of threads can be observed in the results obtained with the YCSB-B, C and D workloads in Figure 3. This behavior indicates that workloads operating on small files in GlusterFS perform differently depending on the volume disk configuration used.
It should be mentioned that no network congestion, high CPU usage or high memory usage was observed during the experiments, meaning that these parameters did not interfere with the experiments' results.
5 CONCLUSIONS AND FUTURE
WORK
This paper presented a storage performance analysis of a hyperconverged infrastructure that uses an open-source DFS to store and replicate data between multiple servers while using the computing resources of the same servers to host containers and store their persistent data. The proposal was evaluated under several different scenarios, using 2 NoSQL databases, 9 workloads, 3 disk configurations and 11 different numbers of concurrent sessions (threads), and compared to a traditional direct-attached storage infrastructure.
The results show that applications working with large files on the hyperconverged system perform similarly to traditional local storage. Moreover, a hyperconverged system adds features that are desirable for cloud computing systems when compared to conventional storage, such as scalability, elasticity and fault-tolerance. A hyperconverged system can grow easily by adding more servers, which may have different computing
and storage configurations. Meanwhile, in a traditional architecture, new stand-alone servers can be added to the infrastructure, but computing and storage resources remain confined to each server. Besides that, even a small hyperconverged system can easily surpass the resources of a traditional system based on local storage.
We believe that our proposed hyperconverged infrastructure is feasible for cloud service providers in the provision of IaaS services (e.g., virtual machines and containers) and PaaS services (e.g., containerized databases). A user application that requires more storage resources than a single server can provide faces scaling challenges in a non-converged system, but would have its resources allocated easily in a hyperconverged architecture. This type of feature is desired by the datacenters of cloud providers because it optimizes the infrastructure by allowing two resource pods (computing and storage) to collapse into one, increasing efficiency.
The network of a hyperconverged infrastructure must be properly sized to minimize the possibility of congestion: the throughput between a server and its disks should not be limited by the throughput of the network. As the scale of the hyperconverged system built for the experiments was small, the network infrastructure did not interfere with the results. However, if the experimental testbed were larger, the network resources would need to be increased accordingly.
In further research, we intend to evaluate our proposal with bigger clusters, other hypervisor solutions, different types of disks (e.g., SSDs), and other server configurations and workloads.
REFERENCES
Abramova, V. and Bernardino, J. (2013). NoSQL databases: MongoDB vs Cassandra. In Proceedings of the International C* Conference on Computer Science and Software Engineering, pages 14–22. ACM.
Azagury, A. C., Haas, R., Hildebrand, D., Hunter, S. W., Neville, T., Oehme, S., and Shaikh, A. (2014). GPFS-based implementation of a hyperconverged system for software defined infrastructure. IBM Journal of Research and Development, 58(2/3):6–1.
Bhimani, J., Yang, J., Yang, Z., Mi, N., Xu, Q., Awasthi, M., Pandurangan, R., and Balakrishnan, V. (2016). Understanding performance of I/O intensive containerized applications for NVMe SSDs. In 2016 IEEE 35th International Performance Computing and Communications Conference (IPCCC), pages 1–8. IEEE.
Cooper, B. F., Silberstein, A., Tam, E., Ramakrishnan, R., and Sears, R. (2010). Benchmarking cloud serving systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing, pages 143–154. ACM.
Fattahi, T. and Azmi, R. (2017). A new approach for directory management in GlusterFS. In 2017 9th International Conference on Information and Knowledge Technology (IKT), pages 166–174. IEEE.
Giannakopoulos, I., Papazafeiropoulos, K., Doka, K., and Koziris, N. (2017). Isolation in Docker through layer encryption. In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pages 2529–2532. IEEE.
Huang, J., Badam, A., Caulfield, L., Nath, S., Sengupta, S., Sharma, B., and Qureshi, M. K. (2017). FlashBlox: Achieving both performance isolation and uniform lifetime for virtualized SSDs. In FAST, pages 375–390.
Kaneko, S., Nakamura, T., Kamei, H., and Muraoka, H. (2016). A guideline for data placement in heterogeneous distributed storage systems. In 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), pages 942–945. IEEE.
Mills, N., Feltus, F. A., and Ligon III, W. B. (2018). Maximizing the performance of scientific data transfer by optimizing the interface between parallel file systems and advanced research networks. Future Generation Computer Systems, 79:190–198.
Roch, L. M., Aleksiev, T., Murri, R., and Baldridge, K. K. (2018). Performance analysis of open-source distributed file systems for practical large-scale molecular ab initio, density functional theory, and GW+BSE calculations. International Journal of Quantum Chemistry, 118(1).
Tarasov, V., Rupprecht, L., Skourtis, D., Warke, A., Hildebrand, D., Mohamed, M., Mandagere, N., Li, W., Rangaswami, R., and Zhao, M. (2017). In search of the ideal storage configuration for Docker containers. In 2017 IEEE 2nd International Workshops on Foundations and Applications of Self* Systems (FAS*W), pages 199–206. IEEE.
Verma, S., Kawamoto, Y., Fadlullah, Z. M., Nishiyama, H., and Kato, N. (2017). A survey on network methodologies for real-time analytics of massive IoT data and open research issues. IEEE Communications Surveys & Tutorials, 19(3):1457–1477.
Xu, Q., Awasthi, M., Malladi, K. T., Bhimani, J., Yang, J., and Annavaram, M. (2017a). Docker characterization on high performance SSDs. In 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 133–134. IEEE.
Xu, Q., Awasthi, M., Malladi, K. T., Bhimani, J., Yang, J., and Annavaram, M. (2017b). Performance analysis of containerized applications on local and remote storage. In Proc. of MSST.