Performance and Cost Analysis Between On-Demand and Preemptive Virtual Machines

Breno G. S. Costa, Marco Antonio Sousa Reis, Aletéia P. F. Araújo and Priscila Solis

Department of Computer Science, University of Brasilia, Brasília/DF, Brazil

Keywords:

Cloud Computing, Transient Server, Performance Analysis, Preemptive Machines.

Abstract:

A few years ago, Amazon Web Services introduced spot instances, transient servers that can be contracted at a significant discount over the regular price, but whose availability depends on cloud provider criteria and which can be revoked at any time. Google Cloud Platform offers preemptive instances, transient servers with similar behavior and discount level to spot instances. Both providers advertise that their transient servers have the same performance level as servers contracted on-demand. Even with the possibility of revocation at the provider's discretion, some applications can benefit from the low prices charged for these servers. For that to hold, the measured performance of both models, transient and on-demand, must be similar, and the applications must survive occasional or mass server revocations. This work compares the performance and costs of transient and on-demand servers from both providers. Results show that there is no significant difference in measured performance, but there is a real cost advantage in using transient servers. On Amazon Web Services, a MapReduce cluster composed of transient servers achieved a 68% discount when compared to the same cluster based on on-demand servers. On Google Cloud Platform, the discount achieved was 26%, but it can be larger for larger clusters.

1 INTRODUCTION

Cloud providers have introduced a new class of

servers, called transient servers, which they can uni-

laterally revoke at any time (Singh et al., 2014).

Transient servers increase the utilization of a cloud

provider’s infrastructure while enabling it to retrieve

resources at any time to lease them to higher priority

users.

Due to their preemptive nature, transient servers

are not suitable for running interactive systems such

as web services, or any system that does not toler-

ate downtime caused by server revocations. Cloud

providers typically provide a brief early warning be-

fore revoking a transient server to allow the customer

to shut it down properly. Batch-oriented interrupt-

tolerant applications are particularly suitable for tran-

sient servers, as they can tolerate longer completion

times caused by occasional inactivity. A common sce-

nario is to use tens or hundreds of transient servers to

run highly CPU-intensive or data-intensive systems at

lower costs (compared to regular server prices con-

tracted on-demand).

Different cloud providers have different pricing models for transient servers. The Google Cloud Platform (GCP) transient servers, named preemptive instances (Google, 2017), have a fixed discount of about 80%, a maximum lifetime of 24 hours (with the possibility of preemption within that lifetime) and a revocation notice of only 30 seconds.

Microsoft Azure recently announced that its tran-

sient server offering, called Low-priority virtual ma-

chine (VM), changed status from public preview to

general availability. These VMs can only be used

within a speciﬁc service called Batch (Microsoft,

2017). Their pricing structure is similar to GCP, with

a ﬁxed price and a discount of up to 80%. The Batch

service re-queues a task when a low-priority VM that

is executing the task is revoked. According to the an-

nouncement, preview pricing will be in effect for a

few months, but it will move to regional pricing —

the same as on-demand VMs — and this might cause

a slight increase in pricing, depending on the region.

In contrast, Amazon Web Services (AWS) spot in-

stances (SI) (AWS, 2017c) offer a variable discount.

The price of SIs varies continuously based on mar-

ket supply and demand for each type of server. The

customer speciﬁes a maximum price (a bid) that he

is willing to pay when ordering SIs. AWS, based on

the proposals submitted and following market criteria,

G. S. Costa, B., Reis, M., P. F. Araújo, A. and Solis, P.

Performance and Cost Analysis Between On-Demand and Preemptive Virtual Machines.

DOI: 10.5220/0006709001690178

In Proceedings of the 8th International Conference on Cloud Computing and Services Science (CLOSER 2018), pages 169-178

ISBN: 978-989-758-295-0


determines a market price for the server (Agmon Ben-

Yehuda et al., 2013). If the market price increases and

stays above the bid, the server will be revoked, but

only after the two-minute notice period. While the SI

price remains below the bid, the SI remains available

and the customer will pay only the market price, even

if his bid is higher.
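
The bidding rule described above can be sketched in a few lines of Python. This is a simplified illustration and not AWS's actual market mechanism: the function name `spot_hourly_charge` and the sample prices are hypothetical, the two-minute notice is not modeled, and a market price exactly equal to the bid is assumed to keep the instance running.

```python
from typing import Optional


def spot_hourly_charge(bid: float, market_price: float) -> Optional[float]:
    """Return the hourly price charged for a spot instance, or None if revoked.

    Simplified rule from the text: the instance stays available while the
    market price does not exceed the bid, and the customer pays the market
    price, never the bid itself. Treating equality as "available" is an
    assumption of this sketch.
    """
    if market_price > bid:
        return None  # market price rose above the bid: instance is revoked
    return market_price


# Hypothetical prices in US$/hour, for illustration only.
print(spot_hourly_charge(bid=0.10, market_price=0.04))  # charged the market price
print(spot_hourly_charge(bid=0.03, market_price=0.04))  # revoked
```

A higher bid therefore does not raise the amount paid; it only lowers the chance that the market price crosses it.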

Figure 1 shows the market price variation, in a

one month period, of an M4.2xlarge type SI on the

us-east-1e zone. Each availability zone is a different

market and prices can be different. In the time frame,

the SI price reached the on-demand price (US$ 0.40)

less than 10 times and for short periods.

Due to the high probability of changing prices and

even the behavior of their transient servers in the short

term, Microsoft Azure Low-priority VMs will be ex-

cluded from this analysis, but they will be included as

future work as soon as the offer becomes stabilized.

1.1 Availability of Transient Servers

The availability of transient servers (in terms of aver-

age revocation time) can also vary signiﬁcantly across

server conﬁgurations and on the basis of changing

market conditions. Unfortunately, cloud platforms do

not directly expose the availability statistics of tran-

sient servers, requiring users to infer them indirectly,

for example, through price history. Thus, it is chal-

lenging for a cloud system to select the most appro-

priate server conﬁguration based on historical price

or availability data to meet its needs. Recent research

suggests that mitigating the risk of revocation requires

a parallelized system to diversify its resource needs

across various types of transient servers, further com-

plicating decision making (Sharma et al., 2016).

The problem is exacerbated by the large number of transient server choices available from providers: there are over 2500 SI options in AWS Elastic Compute Cloud (EC2) and more than 300 GCP preemptive instances. This is because each availability zone has its own market value calculation for each available virtual machine configuration.

According to (Sharma et al., 2017), choosing a

server conﬁguration based only on price can pro-

duce sub-optimal results. The authors cite an exam-

ple where server conﬁgurations with very low prices

can also see greater market demand and consequently

higher price volatility and more revocations. Frequent

revocations generate additional veriﬁcation, check-

pointing, and system recovery efforts. Instead, they

suggest that choosing a slightly more expensive server

conﬁguration and having a lower revocation rate can

produce lower overall costs.

Due to the challenges listed, cloud providers such

as AWS have begun offering server selection tools.

Amazon SpotFleet (AWS, 2015a) automatically re-

places revoked servers. However, SpotFleet has a

limited choice in terms of the combinations of server

conﬁgurations that it offers and does not solve some

of the challenges presented. Another tool, Amazon

Spot Bid Advisor (AWS, 2015b), can help users se-

lect servers based on price, but exposes only super-

ﬁcial volatility information such as low, medium, or

high categorization.

An important consideration is that if transient server performance were lower than on-demand server performance, then, given the additional complexity of dealing with revocations, the large discounts offered by the providers would not be worthwhile. Therefore, this study measures the performance of transient servers using benchmarking software and compares it to that of on-demand servers, to verify whether the cost decrease advertised by cloud providers is also accompanied by a performance decrease. In addition, a scenario in which the use of transient servers is viable is implemented, and the execution costs in the two server classes are compared on both providers that offer them.

The remainder of this article is divided into six

sections. Section 2 presents some related work. Sec-

tion 3 describes the experiments environment. Sec-

tion 4 describes the planning and the results of the ex-

periments. Section 5 performs an analysis of the per-

formance experiments. Section 6 presents a cost com-

parison of a MapReduce workload running on both

classes of servers, and in Section 7 conclusion and fu-

ture work are presented.

2 RELATED WORK

There are several studies in the literature regarding spot instances. Many of them try to predict SI prices and find an optimal bid on the spot market. The strategies adopted by these studies are diverse. Time series forecasting is used by (Chhetri et al., 2017), whose results, using three specific metrics, show that successful estimation of bid prices in AWS spot markets is an implicit function of seasonal components and extreme spikes in the spot price history. Another study (Khandelwal et al., 2017) uses Regression Random Forests (RRFs) to predict spot prices. The authors use a one-year trace of spot market prices and compare the results achieved by RRFs with existing non-parametric machine learning models. The paper reveals that RRF-based forecast accuracy outperforms other models.

In (Wolski and Brevik, 2016) a method is proposed to determine the probabilistic availability assurance for SIs: DrAFTS, an acronym for Durability Agreements From Time Series. A prediction algorithm was created for the bid prices that will obtain a certain SI in the AWS market. The prototype also determines the likelihood of how long these prices remain the same. The prediction, in this case, is a combination of a maximum bid price for an SI and the time for which this value guards the VM against termination, considering that termination is caused by an increase in the market price.

Figure 1: AWS SI price history for a one-month period in 2017.

To ensure predictability, DrAFTS (Wolski and

Brevik, 2016) runs regression tests with the price his-

tory of each instance type and stores the predictions.

Each time a new prediction is generated, the method

selects a random sample of prices in the history and

re-runs the DrAFTS algorithm. The fraction of cor-

rect predictions is reported with the probability of

success.

Other studies also consider the workload that is running on the cloud in order to suggest a proper amount of resources to guarantee its completion. (Huang et al., 2013) propose a tool that automatically profiles the application, builds a model to predict its performance, and infers a proper cluster size that can finish the job within its deadline while minimizing the total cost. Based on these parameters, the tool also chooses between on-demand and spot instances. (Sabyasachi et al., 2017) propose a framework that allows users to bid different prices depending on their perceived urgency and the nature of the running job. It allows them to negotiate the current bid price in a way that guarantees the timely completion of their jobs.

The work of (Chohan et al., 2010) uses SIs as

the acceleration mechanism for MapReduce applica-

tions executed in benchmarks presented in the arti-

cle. However, the unexpected termination of SIs can

have adverse effects on application execution time,

and could increase the ﬁnal cost. The research then

shows techniques that help mitigate these effects.

According to the authors of (Chohan et al., 2010),

an SI is suitable for batch processing of MapReduce

because of its fault tolerance characteristic. When a

VM becomes unavailable, the internal mechanism of

MapReduce automatically looks for another VM to

complete the task. With this, the research concluded

that the use of SIs accelerates processing and causes

the consequent cost decrease.

In the experiments of (Chohan et al., 2010) four

on-demand machines and one SI were used for ac-

celeration. At some loads this acceleration reaches

200%, while the cost is only increased by 42%. How-

ever, the experiments conducted showed that in cer-

tain cases, failures negatively impacted processing by

up to 27%, since SIs are less reliable than on-demand

VMs.

This work differs from the others by comparing, in two cloud providers, the performance of on-demand VMs and transient ones. CPU, Input and Output (I/O), and network performance metrics will be used to identify whether performance is compatible, or whether the price decrease of the transient VM also implies a decrease in performance. The chosen providers were AWS and GCP, since these providers offer both VM options and they are among the top three public cloud providers in the Gartner Magic Quadrant (Gartner, 2016). In addition to performance benchmarking, the cost of an execution scenario will be measured and compared for both VM classes from both providers.

3 TESTBEDS

The purpose of the study is to compare the perfor-

mance of on-demand VMs with the transient versions

of these same VMs from a public cloud provider. The

on-demand version meets one of the cloud computing

features deﬁned by NIST (Mell et al., 2011), which

is the perception that there is an inﬁnite supply of re-

sources. That is, it will always be possible to get ad-

ditional on-demand resources at any time, from the

point of view of a single customer.

The transient VMs, on the other hand, have di-

verse availability. If there are no idle resources in a

given provider’s availability zone, one may not get

transient VMs while this scenario remains. And even

when resources are available and they are in use by a

customer, the provider can revoke them at their dis-

cretion. The counterpart to this decrease in availabil-

ity is the cost, which is much lower in the model with

possibility of preemption, and can reach up to 90%

discount in relation to the on-demand price (AWS,

2017c).

We analyze the performance relationship between these two models of infrastructure as a service and verify whether the lower price means, in addition to lower availability, lower performance according to three metrics: billions of floating-point operations per second (GFLOPS) delivered by the VM's vCPUs, I/O throughput and network throughput. In addition, the costs of one transient-server-friendly workload will be compared across two scenarios: one using on-demand VMs and the other using transient VMs.

In public cloud providers, different availability

zones represent data centers located in different ge-

ographic locations and possibly with a different in-

frastructure as well. For this study, before running

the benchmarking software, both classes of VMs were

initiated on the same availability zone and with the

same conﬁguration (cpu model, disk type).

Providers also offer different families of VMs and some of them are specialized in a computational function: processing, Input/Output (I/O), and networking, for instance. For the sake of performance comparison, the choice was, on each provider, the general-purpose family. This means all VMs used in the performance experiments have a balance between the computational functions mentioned. On AWS, the M family was chosen and on GCP, the n1-standard family. In the cost experiments, the VM families that are appropriate for the specific workload being tested were used.

Table 1: AWS VM Configuration.

Name         Qty vCPUs   VM Type     Storage (GB)   Zone
On-Dem SML   1           on-demand   50             sa-east-1a
On-Dem MED   2           on-demand   50             sa-east-1a
On-Dem BIG   4           on-demand   50             sa-east-1a
SPOT SML     1           transient   50             sa-east-1a
SPOT MED     2           transient   50             sa-east-1a
SPOT BIG     4           transient   50             sa-east-1a

Table 2: GCP VM Configuration.

Name         Qty vCPUs   VM Type     Storage (GB)   Zone
On-Dem SML   1           on-demand   50             us-central1-a
On-Dem MED   2           on-demand   50             us-central1-a
On-Dem BIG   4           on-demand   50             us-central1-a
PREEM SML    1           transient   50             us-central1-a
PREEM MED    2           transient   50             us-central1-a
PREEM BIG    4           transient   50             us-central1-a

For comparison purposes, the experiment ran on three VMs with different numbers of vCPUs, in order to increase the evaluation sample and, therefore, to achieve more accurate results. The most powerful VM used in the experiment was the one with 4 vCPUs, referenced in the rest of this work as BIG. The other VMs have fewer vCPUs and will be referenced as MED (for medium) and SML (for small). For each size, an on-demand VM (referenced as On-Dem on both providers) and a transient one (referenced as SPOT on AWS and PREEM on GCP) were created, as listed in Tables 1 and 2. The AWS zone where the VMs were created was sa-east-1a and the GCP availability zone used was us-central1-a.

For storage, Solid State Drive (SSD) options were used. The providers allow the user to choose between SSD and Hard Disk Drive (HDD) based storage. SSD has higher throughput in terms of I/O operations per second (IOPS) and data transfers, and hence a higher cost per GB of allocated space.

4 PLANNING

The study of (Coutinho et al., 2012) compared the results of CPU performance measurements among some available benchmark software and selected Linpack (Intel, 2017). They found compatible results between the benchmarks when evaluating the same computing environment. Besides, Linpack takes less execution time and has a simpler configuration. Linpack was also the benchmark selected for the study presented here. Regarding I/O, the benchmark chosen was IOZone (Iozone, 2017), which is also widely used in such measurements in the literature. To evaluate network performance, iPerf (Iperf, 2017) was the benchmark software chosen.

Table 3: Experiment design.

SYSTEM                    Virtual Machine
METRICS                   CPU (GFLOPS), I/O (throughput, in MB/s), NETWORK (throughput, in Mbps)
PARAMETERS                For CPU, number of Linpack equations, array size, number of runs, and total size of data. For I/O, the size of files and records. For NETWORK, the load time and the number of runs.
FACTORS                   Size of VMs in number of vCPUs (BIG, MED and SML)
TECHNIQUE                 Measurement
WORKLOAD                  Synthetic. Workload of benchmarks LINPACK (CPU), IOZONE (I/O) and IPERF (NETWORK)
DESIGN                    For each VM, the benchmarks will be executed independently
DATA ANALYSIS             Interpretation of the results described in tables and graphs
PRESENTATION OF RESULTS   Bar charts and tables

The parameters used for the CPU benchmark software were those defined by (Coutinho et al., 2012): 10000 equations (problem size), 10000 as matrix size, 30 repetitions and a data size of 10KB. For IOZone, which measures throughput in various I/O operations, the default parameters were used. IOZone varies the size of the file to be manipulated, as well as the size of the data records that compose these files, in 8 different operations. For the purpose of this comparison, we will show the results of write and read operations only. For iPerf, which measures the network throughput, standard TCP protocol parameters were used.
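
To make the GFLOPS metric concrete, the sketch below converts a Linpack problem size and a run time into GFLOPS using the conventional LINPACK operation count of 2/3 n^3 + 2 n^2 for solving a dense n x n system. That the benchmark build used in the paper reports exactly this count is an assumption, and the function names and the 4.6-second run time are hypothetical.

```python
def linpack_flop_count(n: int) -> float:
    """Conventional LINPACK operation count for a dense n x n linear system."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2


def gflops(n: int, seconds: float) -> float:
    """Billions of floating-point operations per second for one solve."""
    return linpack_flop_count(n) / seconds / 1e9


# Problem size 10000 as in the text; the run time is a made-up value.
print(f"{gflops(10000, 4.6):.2f} GFLOPS")
```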

The methodology for performing performance

analysis is described in (Jain, 1991). It is possible

to detail characteristics related to the design of the

experiments, such as metrics and workloads, besides

the factors and parameters. The details are described

in Table 3.

4.1 Experiment 1 - vCPU

The goal of this experiment is to compare VM performance results between both classes (on-demand and transient) for each size. Only VMs of the same provider will be compared to each other.

Table 4: AWS VMs performance measurements (GFLOPS).

GFLOPS      SPOT BIG   SPOT MED   SPOT SML   On-Dem BIG   On-Dem MED   On-Dem SML
Mean        145.43     73.53      35.709     145.87       72.51        35.99
Deviation   0.21       0.14       0.15       0.52         0.52         0.16
Minimum     144.45     72.99      35.10      143.69       69.97        35.42
Maximum     145.63     73.62      35.80      146.48       72.81        36.16

Table 5: GCP VMs performance measurements (GFLOPS).

GFLOPS      PREEM BIG   PREEM MED   PREEM SML   On-Dem BIG   On-Dem MED   On-Dem SML
Mean        125.70      60.13       31.82       125.95       65.14        32.39
Deviation   7.26        1.01        2.08        1.96         0.76         0.47
Minimum     111.68      57.63       22.16       117.84       62.39        30.45
Maximum     131.34      62.40       33.17       127.86       65.97        32.80

Tables 4 and 5 show the values measured for each

VM, as well as the standard deviation of the sam-

ple and the maximum and minimum values. It can

be seen that in all of them the standard deviation was

very small, which demonstrates stability in the deliv-

ery of the contracted vCPU resource in both contract-

ing models.

Figure 2 compares each AWS on-demand VM size

with the similar transient VM and it is seen that the

measurement is practically the same in all cases. The

results found for GCP VMs follow the same pattern,

as seen in Table 5.

Figure 2: AWS vCPU comparative performance.

4.2 Experiment 2 - I/O

The purpose of this experiment is to measure the I/O

throughput of all VMs and compare them.

For this experiment IOZone was used. IOZone is a file system benchmark. It generates and measures a variety of read and write operations on files. The tests used in this experiment were read, indicating the performance of reading a file that already exists in the file system, and write, indicating the performance of writing a new file in the file system. These tests create temporary test files of sizes ranging from 64KB to 512MB. The size of the records varies from 4KB to 16MB. All results are in MB/s.

Table 6: AWS I/O throughput, in MB/s, for write operations.

          SPOT BIG   SPOT MED   SPOT SML   On-Dem BIG   On-Dem MED   On-Dem SML
Mean      2504       2867       2211       2751         2456         2177
Minimum   497        420        506        552          458          475
Maximum   3407       3947       3070       3868         3425         2993

Table 7: AWS I/O throughput, in MB/s, for read operations.

          SPOT BIG   SPOT MED   SPOT SML   On-Dem BIG   On-Dem MED   On-Dem SML
Mean      5042       5515       4530       5265         4519         4248
Minimum   2031       2185       1339       2128         1882         1743
Maximum   11763      11956      11403      11185        10220        10119

Tables 6 and 7 show the AWS throughput values measured by IOZone for each VM for write and read operations, respectively. Figures 3 and 4 show that the measurements are quite similar: On-Dem BIG has a slightly higher value than SPOT BIG, whereas On-Dem MED and On-Dem SML have slightly lower values than their SPOT equivalents. The result seems to indicate that the variation was due to the I/O load of the infrastructure of the sa-east-1a zone at the time of measurement rather than to an actual difference between the analyzed VMs.

Figure 3: AWS I/O throughput, in MB/s, for write operations.

As can be seen in Tables 8 and 9, which show

the results measured for the GCP provider, there was

less than a 5% average I/O performance difference be-

tween the two types of VMs. The results also seem to

indicate an expected variation on I/O load of the us-

central1-a zone at the time of measurement.

4.3 Experiment 3 - Network

The purpose of this experiment is to compare the network throughput of on-demand VMs and transient ones on both providers. For this experiment the benchmark iPerf was used.

Figure 4: AWS I/O throughput, in MB/s, for read operations.

Table 8: GCP I/O throughput, in MB/s, for write operations.

          PREEM BIG   PREEM MED   PREEM SML   On-Dem BIG   On-Dem MED   On-Dem SML
Mean      1892        1886        2180        1858         1916         2071
Minimum   373         303         373         84           406          422
Maximum   2797        3145        2966        2917         3256         2947

The SML VM was used as the server, while the other two VMs were used as client machines within the same zone.

The experiment consisted of generating a stream of TCP data on each client machine for one minute and measuring the throughput second by second. This measurement was repeated 40 times at different hours of the day, in order to record the throughput variation.

As there was a close similarity between the measured values of the two VMs that served as iPerf clients, the values were consolidated and are presented in Tables 10 and 11 with reference only to BIG VMs. As expected, there was a relevant variability in the measurements, which is reflected in the ratio of the standard deviation to the mean, as well as in the interval between the maximum and minimum values.
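
The variability mentioned above can be quantified as the ratio of the deviation to the mean (the coefficient of variation). The sketch below applies it to the mean and deviation values reported in Tables 10 and 11; the function and dictionary names are illustrative, and it is assumed that the reported deviations are standard deviations.

```python
def coefficient_of_variation(mean: float, deviation: float) -> float:
    """Deviation expressed as a fraction of the mean."""
    return deviation / mean


# (mean, deviation) in Mbps, copied from Tables 10 and 11.
network_mbps = {
    "AWS SPOT BIG": (303, 66),
    "AWS On-Dem BIG": (313, 94),
    "GCP PREEM BIG": (3614, 545),
    "GCP On-Dem BIG": (3592, 543),
}

for name, (mean, deviation) in network_mbps.items():
    print(f"{name}: {coefficient_of_variation(mean, deviation):.1%}")
```

Comparing these ratios across VM classes, rather than the raw deviations, makes measurements on different throughput scales comparable.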

The difference between the measurements was

quite small, as can be seen in Figure 5 for GCP. AWS

has similar results as can be seen in Table 10. In

the network measurement, the GCP VMs presented

a difference of less than 1% in the throughput, which

seems to show no actual difference between the VMs,

but only a regular variability found in this type of

measurement.

Table 9: GCP I/O throughput, in MB/s, for read operations.

          PREEM BIG   PREEM MED   PREEM SML   On-Dem BIG   On-Dem MED   On-Dem SML
Mean      3337        3006        3545        3319         3353         3802
Minimum   1443        1123        1539        1183         825          1592
Maximum   12075       12427       12075       10159        11852        10894


Table 10: AWS Network throughput (in Mbps).

            SPOT BIG   On-Dem BIG
Mean        303        313
Deviation   66         94
Minimum     276        296
Maximum     635        1001

Table 11: GCP Network throughput (in Mbps).

            PREEM BIG   On-Dem BIG
Mean        3614        3592
Deviation   545         543
Minimum     1410        430
Maximum     4960        5100

Figure 5: GCP Network throughput (in Mbps).

5 ANALYSIS

The experiments carried out showed that, in relation to vCPU performance, the greatest measurement difference found on AWS was small, only 1.4% in favor of VM SPOT MED. On the other hand, the On-Dem BIG and On-Dem SML VMs performed better than their transient equivalents by only 0.3% and 0.8%, respectively. On GCP, all on-demand VMs performed better than the preemptive ones, but the differences between the means were small: 0.2%, 8.3% and 1.8% for BIG, MED and SML VMs, respectively.
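
The vCPU percentages above can be recomputed from the mean GFLOPS in Tables 4 and 5. The sketch below assumes that each difference is expressed relative to the smaller of the two means, which is the convention that matches the figures quoted for the vCPU experiment; the function and variable names are illustrative.

```python
def relative_difference_pct(a: float, b: float) -> float:
    """Percentage by which the larger value exceeds the smaller one."""
    low, high = sorted((a, b))
    return (high - low) / low * 100.0


# Mean GFLOPS as (transient, on-demand), copied from Tables 4 and 5.
mean_gflops = {
    "AWS": {"BIG": (145.43, 145.87), "MED": (73.53, 72.51), "SML": (35.709, 35.99)},
    "GCP": {"BIG": (125.70, 125.95), "MED": (60.13, 65.14), "SML": (31.82, 32.39)},
}

for provider, sizes in mean_gflops.items():
    for size, (transient, on_demand) in sizes.items():
        diff = relative_difference_pct(transient, on_demand)
        winner = "transient" if transient > on_demand else "on-demand"
        print(f"{provider} {size}: {diff:.1f}% in favor of {winner}")
```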

In relation to the I/O throughput, the greatest measurement difference found on AWS was 18%, in the read operations, in favor of VM SPOT MED. VM On-Dem BIG had a result 4.4% higher and VM SPOT SML performed 6.22% better than its equivalent. On the GCP side, the greatest difference was also in the read operations: 11.5% in favor of VM On-Dem MED. VM PREEM BIG had a 0.55% better throughput and VM On-Dem SML did 7.24% better than its preemptive equivalent.

Regarding the network throughput, On-Dem BIG performed 3.3% better than its equivalent SPOT BIG on AWS. On GCP, PREEM BIG performed 0.61% better than On-Dem BIG.

The small differences found in the measurements, as well as the fact that the VM class that presented the best performance varied according to the experiment and, in some cases, varied within the same experiment, allow us to state that there is no significant difference between on-demand and transient VMs from both providers.

6 COSTS

In order to proceed to cost analysis, additional exper-

iments were performed to compare on-demand and

transient VMs from each provider.

6.1 Costs in AWS

The VMs available in AWS for use in cost experi-

ments are listed in Table 12. The AWS M family of

VMs is used for general purpose applications (AWS,

2017b), while the R family is used for memory inten-

sive applications.

Some VMs are available with Elastic Block Store (EBS), an AWS service that adds a virtual disk to the VMs for persistent data storage. Data on local SSD disks, however, is lost when the VM is turned off. EBS volumes preserve the data and can be copied as a backup. They can be attached later to other VMs (AWS, 2017a). VMs that use EBS are more robust and suitable for intensive processing, especially Hadoop applications, which are the focus of the additional experiments (AWS, 2017b).

Table 12: AWS instance types.

Type         vCPUs   Memory (GiB)   Storage (GB)
m3.large     2       7.5            1 x 32
m3.xlarge    4       15             2 x 40
m3.2xlarge   8       30             2 x 80
m4.large     2       8              EBS-only
m4.xlarge    4       16             EBS-only
m4.2xlarge   8       32             EBS-only
r4.large     2       15.25          EBS-only
r4.xlarge    4       30.5           EBS-only
r4.2xlarge   8       61             EBS-only

The AWS prices are shown in Table 13. For the additional experiments, EC2 instances running the Elastic MapReduce (EMR) service were used. To calculate the total amount to be paid per hour of use, the columns for the hourly price of EC2 and EMR must be added together. The values are in US dollars. The AWS us-east-1 region was used since it has the lowest price for on-demand VMs and thus provides a fair comparison to SI prices.

Table 13: AWS prices for on-demand VMs (us-east-1).

Type         EC2 (US$/hour)   EMR (US$/hour)
m3.xlarge    0.266            0.070
m3.2xlarge   0.532            0.140
m4.large     0.100            0.030
m4.xlarge    0.200            0.060
m4.2xlarge   0.400            0.120
m4.4xlarge   0.800            0.240
r4.xlarge    0.266            0.067
r4.2xlarge   0.532            0.133
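
As a sketch of the hourly calculation, the snippet below adds the EC2 and EMR columns of Table 13 for a cluster of identical nodes. The function name, the three-node example, and the US$ 0.04 spot price are hypothetical, and it is assumed that the EMR fee stays the same when the EC2 part is billed at a spot price. Run time is not modeled, so the output is an hourly rate, not a total job cost.

```python
from typing import Optional

# (EC2, EMR) on-demand prices in US$/hour for us-east-1, copied from Table 13.
ON_DEMAND_PRICES = {
    "m4.xlarge": (0.200, 0.060),
    "r4.xlarge": (0.266, 0.067),
    "r4.2xlarge": (0.532, 0.133),
}


def cluster_hourly_cost(vm_type: str, nodes: int,
                        spot_price: Optional[float] = None) -> float:
    """Hourly cost of an EMR cluster: (EC2 price + EMR fee) per node.

    If spot_price is given, it replaces the on-demand EC2 price; the EMR
    fee is assumed to be unchanged (an assumption of this sketch).
    """
    ec2_on_demand, emr_fee = ON_DEMAND_PRICES[vm_type]
    ec2 = ec2_on_demand if spot_price is None else spot_price
    return nodes * (ec2 + emr_fee)


on_demand = cluster_hourly_cost("r4.xlarge", nodes=3)
spot = cluster_hourly_cost("r4.xlarge", nodes=3, spot_price=0.04)  # hypothetical price
print(f"on-demand: US$ {on_demand:.3f}/hour, spot: US$ {spot:.3f}/hour")
```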

This experiment consists of indexing a ﬁle of

26 GB of data in BZ2 (compressed) format using

MapReduce on a cluster composed of a master ma-

chine and two workers of VM type r4.xlarge, whose

family of VMs is more appropriate to execute EMR

clusters. This experiment was executed several times

and an average execution time was calculated for both

SI and on-demand EMR clusters.

In the SI mode, 3 clusters of the same conﬁg-

uration (3 VMs r4.xlarge) were created and named

HDP3, HDP4 and HDP5, according to the values used

as the SI bid: US$ 0.03, US$ 0.04 and US$ 0.05, re-

spectively. The SI market value for each zone of re-

gion us-east-1 for this VM type is shown in Table 14.

Table 14: Market value of r4.xlarge SPOT on region us-east-1.

Zone         Value (US$)
us-east-1a   0.047
us-east-1b   0.033
us-east-1c   0.039
us-east-1d   0.035
us-east-1e   0.032
us-east-1f   0.027

Clusters HDP4 and HDP5 were provisioned in

300 seconds, which is the average time veriﬁed in

AWS for this type of VM. According to Table 14,

the bid values were all above the market value for

r4.xlarge in zone us-east-1f. However, HDP3 was not

provisioned.
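
The relationship between the three bids and the zone prices in Table 14 can be checked with a short sketch. The helper name `eligible_zones` is illustrative, and the rule "bid at or above the market value" is a simplification: as the HDP3 case shows, exceeding the market value in one zone was not sufficient for provisioning.

```python
# Market value in US$/hour for r4.xlarge per zone, copied from Table 14.
MARKET_PRICE = {
    "us-east-1a": 0.047,
    "us-east-1b": 0.033,
    "us-east-1c": 0.039,
    "us-east-1d": 0.035,
    "us-east-1e": 0.032,
    "us-east-1f": 0.027,
}

# Bids used for each cluster, as stated in the text.
BIDS = {"HDP3": 0.03, "HDP4": 0.04, "HDP5": 0.05}


def eligible_zones(bid: float, market: dict) -> list:
    """Zones whose market value does not exceed the bid, sorted by name."""
    return sorted(zone for zone, price in market.items() if price <= bid)


for cluster, bid in BIDS.items():
    print(cluster, eligible_zones(bid, MARKET_PRICE))
```

Under this rule HDP3 qualifies in a single zone only, which is consistent with the observation that bids close to the market minimum carry no provisioning guarantee.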

There is a large difference between the values

of on-demand VMs and SIs. While an on-demand

r4.xlarge costs US$ 0.266 per hour, an SI r4.xlarge

costs between US$ 0.027 and US$ 0.047 at that time

in the us-east-1 region. However, as veriﬁed in the

experiment, there is no guarantee of provisioning, es-

pecially in cases where the bid value is close to the

minimum value in the SI market.

Regarding performance, the on-demand and SI

machines are similar, as seen in Section 5. The ex-

ecution time of an indexing MapReduce application

with the same dataset was, on average, between 4 and

5 hours in all EMR clusters created. The result can

be veriﬁed in Table 15, with the conﬁrmation that the

performance does not suffer degradation during exe-

cution in SI, unless one or more instances are revoked

during the execution.

HDP4 execution time was slightly lower than the

on-demand EMR cluster execution time. HDP5 exe-

cution time was higher, in the opposite direction. This

performance loss for HDP5 probably came from SI

revocations and replenishment.

Table 15: AWS EMR cluster average execution time.

EMR cluster   Time
On-demand     4 hours and 16 minutes
HDP4          4 hours and 15 minutes
HDP5          4 hours and 52 minutes

6.2 Costs in GCP

The experiments in GCP were conducted using VMs

of type ‘n1-highmem-4’, each of them with 200 GB

disk and 26 GB of RAM. The datasets had 6 GB of

data in BZ2 (compressed) format, and were processed

using MapReduce.

GCP has a service called Dataproc, with which one can create Hadoop clusters. On GCP, a Hadoop cluster must have at least a master node and 2 workers running on on-demand VMs. Because of this, the cost experiment on this provider was slightly different: on AWS it is possible to create a cluster composed entirely of transient servers, whereas on GCP only a partially transient cluster can be created.

The GCP prices are shown in Table 16. On GCP there is no price variation for preemptive VMs. As on AWS, there is a fixed price for the Dataproc service that must be added to the VM execution price to obtain the total cost of cluster execution. The values are in US dollars and refer to region us-east1.

Table 16: GCP hourly prices (US$) for n1-highmem-4 VM.

Type            On-demand    Preemptive    Dataproc
n1-highmem-4    0.2696       0.0535        0.04


Two clusters were created on GCP. The ﬁrst one

was composed of 1 master node and 4 workers, all of

them on-demand. The second cluster was composed

of 1 master node, 2 on-demand workers and 2 more

preemptive workers. This cluster is therefore partially

preemptive.

As seen in Table 17, the performance of the hybrid cluster (preemptive and on-demand VMs) is similar to that of the cluster with only on-demand VMs.

The partially preemptive cluster ran 7% slower on average, at a cost 26% lower than that of the fully on-demand cluster. Because each preemptive VM has an 80% discount, the Dataproc service has a fixed cost, and a cluster must contain at least 3 on-demand VMs, a bigger discount can be reached with larger clusters that keep only the 3 mandatory on-demand VMs and run all remaining nodes as preemptive VMs. For example, a cluster made of 3 on-demand VMs plus 7 preemptive VMs, executing the same workload described in this subsection (6.2), would reach a discount of 52%.

Table 17: GCP cluster average execution time (seconds).

             Partially preemptive    On-demand
Mean         1330.00                 1241.75
Deviation    32.11                   32.01
Minimum      1289.00                 1217.00
Maximum      1358.00                 1285.00
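The 26% and 52% discounts quoted in this subsection can be reproduced from Tables 16 and 17 with a simple cost model. The model is a sketch under two assumptions that the text does not state explicitly: the Dataproc price is charged once per cluster-hour, and the larger cluster keeps the same mean execution times as the measured 5-node clusters. The function names `run_cost` and `hybrid_discount` are hypothetical.

```python
# Hourly prices from Table 16 (US$).
ON_DEMAND_VM = 0.2696
PREEMPTIVE_VM = 0.0535
DATAPROC_FEE = 0.04   # assumption: charged once per cluster-hour

# Mean execution times from Table 17 (seconds).
MEAN_TIME_ON_DEMAND = 1241.75
MEAN_TIME_HYBRID = 1330.00


def run_cost(on_demand_vms, preemptive_vms, seconds):
    """Cost of one run: hourly cluster price times execution time in hours."""
    hourly = (on_demand_vms * ON_DEMAND_VM
              + preemptive_vms * PREEMPTIVE_VM
              + DATAPROC_FEE)
    return hourly * seconds / 3600.0


def hybrid_discount(total_vms, mandatory_on_demand=3):
    """Discount of a hybrid cluster against an all-on-demand cluster of equal size.

    Assumes the measured mean times also hold for other cluster sizes.
    """
    baseline = run_cost(total_vms, 0, MEAN_TIME_ON_DEMAND)
    hybrid = run_cost(mandatory_on_demand,
                      total_vms - mandatory_on_demand,
                      MEAN_TIME_HYBRID)
    return 1.0 - hybrid / baseline


print(f"Slowdown of hybrid cluster: {MEAN_TIME_HYBRID / MEAN_TIME_ON_DEMAND - 1:.0%}")
print(f"Per-VM preemptive discount: {1 - PREEMPTIVE_VM / ON_DEMAND_VM:.0%}")
print(f"5 VMs (3 on-demand + 2 preemptive): {hybrid_discount(5):.0%}")
print(f"10 VMs (3 on-demand + 7 preemptive): {hybrid_discount(10):.0%}")
```

Under these assumptions the model yields a 7% slowdown, an 80% per-VM discount, and run-cost discounts of 26% and 52% for the 5-node and 10-node clusters, matching the values reported above.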

7 CONCLUSION

This work compared the performance of on-demand VMs with equivalent preemptive VMs (same type, same configuration, running in the same zone) using metrics related to CPU, I/O and network. It also compared execution costs in specific scenarios on each provider.

The results indicated that there were no significant performance differences in the metrics used for CPU, I/O and network. These results indicate that both VM types use the same set of computational resources, which was additionally verified by a series of MapReduce application executions. The differences between the two modes are availability and cost. The availability of on-demand VMs seems unlimited, since in the vast majority of cases the VM can be started at any time and will be interrupted only at the request of the customer or in case of technical failure.

Preemptive VMs, however, can be revoked at any time according to criteria outside the customer's control, such as market price, lack of idle resources and others defined by the provider. The discount on preemptive VMs can reach 90% on AWS and 80% on GCP. In the scenario executed in this study, a discount of 68% relative to on-demand execution was obtained on AWS, and a discount of 26% on GCP.

The AWS discount was greater than the one achieved on GCP because all AWS cluster nodes were preemptive VMs. That configuration, however, increased the risk of not finishing the work or of a longer execution time caused by node revocations.

The performance and cost measurements performed in this study are useful for applications that can benefit from transient servers (AWS spot instances and GCP preemptive instances). Such applications can achieve a cost reduction with little additional effort, namely automating the replenishment of instances when they are reclaimed by the provider.

Future work is intended to analyze other transient server offers, such as Microsoft Azure Low-priority VMs, and to evaluate other execution scenarios whose applications can deal with VM revocations and could benefit from the cost reductions offered by transient servers.

REFERENCES

Agmon Ben-Yehuda, O., Ben-Yehuda, M., Schuster, A., and Tsafrir, D. (2013). Deconstructing Amazon EC2 spot instance pricing. ACM Transactions on Economics and Computation, 1(3):16.

AWS (2015a). EC2 spot bid advisor. [Online]. Available from: https://aws.amazon.com/ec2/spot/bid-advisor/. 2017-05-17.

AWS (2015b). EC2 spot-fleet. [Online]. Available from: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-fleet.html. 2017-05-17.

AWS (2017a). Amazon Elastic Block Store. [Online]. Available from: https://aws.amazon.com/ebs/. 2017-06-01.

AWS (2017b). AWS EC2 Instance Types. [Online]. Available from: https://aws.amazon.com/ec2/instance-types/. 2017-10-12.

AWS (2017c). AWS Free Tier. [Online]. Available from: https://aws.amazon.com/free/. 2017-05-17.

Chhetri, M. B., Lumpe, M., Vo, Q. B., and Kowalczyk, R. (2017). On estimating bids for Amazon EC2 spot instances using time series forecasting. In Services Computing (SCC), 2017 IEEE International Conference on, pages 44–51. IEEE.

Chohan, N., Castillo, C., Spreitzer, M., Steinder, M., Tantawi, A. N., and Krintz, C. (2010). See spot run: Using spot instances for MapReduce workflows. HotCloud, 10:7–7.

Coutinho, E. F., Rego, P., Gomes, D. G., and de Souza, J. N. (2012). Análise de Desempenho com Benchmarks em um Ambiente Público de Computação em Nuvem. In Proc. of the X Workshop em Clouds e Aplicações. Belo Horizonte, pages 96–109.

Gartner (2016). Magic Quadrant for Cloud Infrastructure as a Service, Worldwide. [Online]. Available from: https://www.gartner.com/doc/reprints?id=1-2G2O5FC&ct=150519. 2017-05-17.

Google (2017). Google preemptible instances. [Online]. Available from: https://cloud.google.com/compute/docs/instances/preemptible. 2017-01-07.

Huang, H., Wang, L., Tak, B. C., Wang, L., and Tang, C. (2013). Cap3: A cloud auto-provisioning framework for parallel processing using on-demand and spot instances. In Cloud Computing (CLOUD), 2013 IEEE Sixth International Conference on, pages 228–235. IEEE.

Intel (2017). Linpack - Intel math kernel. [Online]. Available from: https://software.intel.com/en-us/mkl. 2017-05-17.

Iozone (2017). Iozone - filesystem benchmark. [Online]. Available from: http://www.iozone.org/. 2017-05-17.

Iperf (2017). Iperf. Network performance benchmark. [Online]. Available from: http://iperf.sourceforge.net/. 2017-05-17.

Jain, R. (1991). The Art of Computer Systems Performance Analysis: techniques for experimental design, measurement, simulation, and modeling.

Khandelwal, V., Chaturvedi, A., and Gupta, C. P. (2017). Amazon EC2 spot price prediction using regression random forests. IEEE Transactions on Cloud Computing.

Mell, P., Grance, T., and others (2011). The NIST definition of cloud computing.

Microsoft (2017). Azure batch low-priority VMs. [Online]. Available from: https://docs.microsoft.com/azure/batch/batch-low-pri-vms. 2017-12-20.

Sabyasachi, A. S., Kabir, H. M. D., Abdelmoniem, A. M., and Mondal, S. K. (2017). A resilient auction framework for deadline-aware jobs in cloud spot market. In Reliable Distributed Systems (SRDS), 2017 IEEE 36th Symposium on, pages 247–249. IEEE.

Sharma, P., Guo, T., He, X., Irwin, D., and Shenoy, P. (2016). Flint: batch-interactive data-intensive processing on transient servers. In Proceedings of the Eleventh European Conference on Computer Systems, page 6. ACM.

Sharma, P., Irwin, D., and Shenoy, P. (2017). Portfolio-driven Resource Management for Transient Cloud Servers. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 1(1):5.

Singh, R., Sharma, P., Irwin, D., Shenoy, P., and Ramakrishnan, K. (2014). Here today, gone tomorrow: Exploiting transient servers in datacenters. IEEE Internet Computing, 18(4):22–29.

Wolski, R. and Brevik, J. (2016). Providing statistical reliability guarantees in the AWS spot tier. In Proceedings of the 24th High Performance Computing Symposium, page 13. Society for Computer Simulation International.
