CHOOSING THE RIGHT CLOUD ARCHITECTURE

A Cost Perspective

Uwe Hohenstein, Reto Krummenacher, Ludwig Mittermeier and Sebastian Dippl

Siemens AG, Corporate Technology, Otto-Hahn-Ring 6, D-81730 Muenchen, Germany

Keywords: Cloud Computing, Cost-driven Architecture, Cloud Application Design, Windows Azure Platform.

Abstract: Cloud computing offers IT resources and services as a utility, and enables a much quicker move to market at

much lower cost, arguably. The initial expenses for effort and hardware are indeed lower, and potential

growth is much easier handled due to the inherited elasticity. However, applications in the cloud can cause

significant operational costs - different from on-premises operational costs - and hence unpleasant surprises

if not architected right. Cost factors should thus become much more of a core consideration when

architecting for the cloud. Different scenarios that are discussed in this paper will show how different

architectural decisions result in significantly different operational costs.

1 INTRODUCTION

Cloud computing has emerged to be the current

highlight in terms of IT as a service. A smart idea is

in principle enough to start a new business

(Armbrust, 2010): no more need for large cost

expenditure, no need for over-provisioning and

wasting expensive resources, for not missing

potential new clients. The main benefits of cloud

computing, without going into technical details yet,

are the elasticity and high availability of (at least

theoretically infinite) hardware and software

resources, the pay-as-you-go pricing model, and the

self-service administration of the resources. In more

economical terms, cloud computing has a very

attractive benefit of turning CAPEX (capital

expenses) into OPEX (operational expenses).

Still, none of these features, functional or non-

functional, comes for free. A scalable architecture is

essential for leveraging scalable cloud infra-

structures (Hamdaqa, 2011), or in other words,

simply deploying existing enterprise software into

the cloud does not make the software any more

scalable or cloud-enabled. Cloud architecture best

practices are offered by most cloud utility providers

(e.g., Amazon AWS (Varia, 2010)) or Microsoft

Azure (Pace, 2010) with illustrations of how to

design for failure, how to leverage elasticity, how to

decouple components and parallelize etc. These

important guidelines of how to bring existing and

new applications to the cloud are common and valid

for all cloud infrastructure offerings, although

optimal software engineering decisions might

certainly depend on the particular cloud utility for

which one implements the cloud-enabled

application.

There is, however, one important aspect, as we

will argue throughout this paper, which is (too often)

forgotten, when specifying solution architectures for

the cloud: the operational costs of running an

application in the cloud. In particular from an

enterprise perspective, the maintenance and

operations costs are highly relevant, and they should

thus have a significant impact on design decisions,

as we exemplify and discuss in this paper. The total

costs of running an application are comprised of

various individual sources such as the charges for

compute instances, storage, bandwidth or different

additional services. Depending on the cost model,

one or the other individual cost source will dominate

the overall bill, and reducing the total cost can only

be done when minimizing the use of these

dominating resources, already when defining the

architecture. Consequently, when architecting for the

cloud, cost factors need to be taken into account, and

one might consider extending the “4+1 Architectural

View Model” by (Kruchten, 1995) with an

operational cost view. While a modular design helps

to reduce maintenance costs and easy evolution, the

operational cost view would enable an architect to

illustrate the impact of the architectural decision on

the overall expenses (Käfer, 2010a).

334

Hohenstein U., Krummenacher R., Mittermeier L. and Dippl S..

CHOOSING THE RIGHT CLOUD ARCHITECTURE - A Cost Perspective.

DOI: 10.5220/0003918803340344

In Proceedings of the 2nd International Conference on Cloud Computing and Services Science (CLOSER-2012), pages 334-344

ISBN: 978-989-8565-05-1

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

The similar line of arguments was expressed by

Todd Hoff on HighScalability.com: “Instead of

asking for the Big O complexity of an algorithm

we'll also have to ask for the Big $ (or Big Euro)

notation so we can judge an algorithm by its cost

against a particular cloud profile.” (Hoff, 2009). It

shows that while turning to cloud deployments cost-

centric architectures becomes even more important,

as the costs are more obviously accountable.

Although, we are certainly by no means arguing that

architectures should be determined by cost

measures, we emphasize that discussions about

architectural alternatives, about pros and cons with

regard to costs have to be taken into account much

more prominently when choosing the appropriate

architecture for the cloud.

Unfortunately, existing work on cost-centric

architectures is very few (cf. Section 5), and most

publications and white papers rather relate to a Total

Cost of Ownership (TCO) comparison between on

premise and cloud deployments, not taking into

account the actual architecture; at business-cloud.de,

the Experton Group has published a TCO Calculator

that helps in assessing the cost advantages of

deploying in the cloud.

With this paper we want to counterwork this

trend, and showcase with concrete examples how

architectures impact the operational costs, once the

decision to work in the cloud has been taken. As a

technical basis for our work we are using the

Windows Azure platform and the corresponding

pricing models. The main reason for working with

Windows Azure in the context of this paper is the

comprehensive PaaS offering that ships with a

complete development and deployment environment

and various relevant by-products such as persistent

storage, access control, or distributed cache. This

has also the advantage that there are no problems

with the licensing of such products, as these are part

of the platform and the cost model. The latter,

moreover, makes the calculation of the architecture-

dependent costs much easier.

In order to clarify the baseline, we continue the

paper with a short introduction to the core concepts

of Windows Azure and its pricing model in Section

2. Then, we present two scenarios that are derived

from real-world business cases, and based on that we

will discuss and analyze the different architectural

alternatives in Sections 3 and 4, respectively. In

Section 5, we outline some related work with cost-

centric aspects. During the practical part of our

investigation, we detected some recommendations

that are worth being reported on in Section 6, before

the paper is concluded with Section 7.

2 WINDOWS AZURE AND ITS

PRICING MODEL

In this section, we give a short introduction and

overview of the core concepts of Windows Azure

including the pay-per-use model. Pricing details

reflect the status quo when writing this paper and

will certainly change again in the future. However,

regardless of the pricing details of a specific cloud

computing platform, the baseline argumentation of

this paper remains the same.

2.1 Core Concepts

Windows Azure provides virtual machines, so-called

compute instances that run Windows Server 2008

and are available in two forms: a Web Role hosts an

IIS (Internet Information Server) and is foreseen to

provide the front-ends for web applications such as

ASP.NET. In contrast, a Worker Role does not

possess an IIS and serves mainly as a host for

backend processes. The Web Roles offer different

thread modes that can be configured, e.g., to have a

thread pool with delegating each request to the next

thread. In contrast to AWS, the compute instances

are redundant with a built-in failover mechanism.

Both types of compute instances can initiate

Internet connections, however, instances of Web and

Worker Roles are not directly accessible via the

Internet. All network traffic coming from outside to

Web and Worker Role instances goes through a load

balancer; each role can specify an endpoint con-

figuration by which protocol (e.g., HTTP(S)) and by

which port it should be accessible. Incoming traffic

is routed to role instances in a round robin fashion.

As a consequence, if there is more than one instance

of a Web Role, subsequent requests will be routed

by the load balancer to different instances. Therefore

it is not an option to use the local file system of a

Web Role for storing HTTP session data. Rather, the

Azure storage mechanisms, which are table storage,

queue storage and blob storage, need to be used for

such kind of data that needs to be processed in

subsequent requests. Similarly, SQL Azure, a

managed SQL service in the cloud, can be used.

Azure table storage allows for storing data in a

manner that is similar to tables, however, it does not

enforce a fixed scheme; a row consists of a couple of

properties and values, which are stored, without any

predefined structure. Azure queue storage allows for

FIFO-style message passing between role instances.

Each message can be up to 64 KB in size. Finally,

Azure blob storage allows for storing binary data

such as images or videos, which can be annotated

CHOOSINGTHERIGHTCLOUDARCHITECTURE-ACostPerspective

335

with metadata. All the Azure storage services can be

accessed via a RESTful interface; i.e., an HTTP

protocol-based web API. This way, all programming

languages with support for HTTP can use of the

Azure storage capabilities, from inside the cloud or

outside. Apart from that, the Windows Azure storage

client library provides a more comfortable way for

accessing the Azure storages.

An application built for Windows Azure runs in

the context of a so-called hosted service, which

defines for instance a public URL prefix as well as

the geographical region. Windows Azure applicat-

ions are uploaded (deployed) to the public cloud

environment via the Azure web-based self-manage-

ment portal to a specific hosted service, either to a

production deployment or a staging deployment. The

production deployment is accessible via the public

URL of the hosted service whereas a deployment

that is uploaded to the staging area is for testing

purposes and thus only accessible via a URL gener-

ated by Azure. Staging and production deployments

can be swapped without service downtime.

2.2 Standard Rates

The standard rates for Windows Azure can be found

in http://www.microsoft.com/windowsazure/offers

/MS-AZR-0003P as of January 2012.

Compute instances, i.e., Web and Worker Roles,

are charged for the number of hours they are

deployed. Even if a compute instance is used for 5

seconds, a full hour has to be paid. There are several

instance categories, small (S), medium (M) etc. As

Table 1 shows, the instance categories scale in a

linear manner with regard to equipments and prices.

That is, a medium instance has double of CPU, disk

etc. than a small instance resulting in a double price.

The exception is an XS instance category.

Table 1: Prices for compute instances.

CPU RAM HDD

(GB)

MBps $ / h I/O

performance

Shared 768MB 20 5

0.04 Low

1,6GHz 1,7 GB 225 100

0.12 Moderate

2 x 3,5 GB 490 200

0.24 High

4 x 7 GB 1000 400

0.48 High

8 x 14 GB 2040 800

0.96 High

For Azure table, blob and queue storages, the

costs depend on bandwidth, transaction, and storage

consumption. Storage is billed based upon the

average usage during a billing period of blob, table,

and queue storage. For example, if 10 GB of storage

are utilized for the first half of the month and none

for the second half of the month, 5 GB of storage are

billed for average usage. Each GB of storage is

charged with $0.14 per GB. Storage consumption is

measured at least once a day by Azure. Please note

that the storage consumption takes into account the

physical storage, which consists not only of raw

data, but also the length of the property names, the

data types, and the size of the actual data.

Moreover, any access to storage, i.e., any trans-

action, has to be paid: 10000 storage transactions

cost $0.01. Bulk operations, e.g., bundling several

inserts in one operation, count as one transaction.

All inbound data transfers to the Azure cloud are

at no charge since Spring 2011. The outbound

transfer to the North America and Europe regions is

charged with $0.12 per outgoing GB, the Asia

Pacific Region is more expensive. It is important to

note that the transferred data has some typical XML

overhead according to the protocol.

Data transfer is for free within the same affinity

group, i.e., for compute instances that run in the

same data center. The affinity group can be specified

in the Azure self-service portal.

The costs for SQL Azure are also based upon

monthly consumption. The Web Edition costs $9.99

per month for up to 1GB, and $49.95 for up to 5GB.

The Business Edition allows for larger databases

with similar prices: $99.99 per month for each 10GB

(up to 50 GB).

Finally, we want to mention the Azure Access

Control Service for authentication, which is charged

with $1.99 per 100,000 transactions.

There are also some flat rates where a fixed

number of compute instances is paid. For instance, a

6-month commitment (http://www.Microsoft.com

/windowsazure/offers) mostly offers a 20% off rate

for resources. In case the given quotas (e.g., 750 free

compute hours) are exceeded, standards rates apply

for overages. Furthermore, special offers exist for

MSDN subscribers, BizSpark, or MPN members.

Those specific rates are out of scope for this paper.

2.3 Special Quotas and Limits

There are some quotas active that define upper

thresholds. For instance, every account may run 20

concurrent small compute instances (which is equal

to 10 medium instances or 5 large instances) and

possess 5 concurrent storage accounts, each having

its own credentials for access. Higher numbers can

be ordered, however, require negotiation with the

Azure customer service. Besides this, there are a

couple of technical restrictions such as the payload

limit of 64 K for queues.

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

336

3 SCENARIO 1: MASS DATA

STORE

This paper relies on two typical scenarios that occur

quite often in reality. However, the scenarios were

simplified in order to ease the discussion and to

obfuscate the business details.

The first scenario is concerned with mass data

storage. Several data providers (DPs) of given

organizations provide data for a cloud-based mass

data storage. The data in the cloud storage is used

and processed by applications for analysis or other

purposes; e.g., business intelligence or production

process optimization. A more concrete example is a

fleet management system that manages cars; each

car sends data about its current state or position to a

central cloud service. In this case the organizations

are car fleets, the data providers are individual

vehicles or fleet owners, and the collected data is

processed further on to optimize fleet usage or the

traffic management.

The following discussion is based upon several

assumptions; most of them will be relevant for the

presented cost calculations:

• There are 5 organizations with 20 DPs each for a

total of 100 DPs. Each DP sends 10 data items à 1

KB per second to storage. Both the frequency of

10 items per seconds and the payload are assumed

to be constant (varying loads are discussed later in

Subsection 3.5). In summary, 1000 items are thus

arriving per second (100 DPs * 10 items/s) for a

total payload of 1000 KB/s.

• No data will be removed; there is an increasing

amount of data maintained in storage.

• There is a transport latency from outside the cloud

to the inside and vice versa, which might of course

vary depending on the overall network congestion.

However, the impact on the architecture is

neglectably small since we assume an asynchr-

onous HTTP communication link between data

providers and the cloud storage. Hence, the DPs

are just firing without waiting for a confirmation.

Please note the main purpose of this paper is neither

to present a particular application and its costs, nor

to define the cheapest architecture for such. The

given numbers are (realistic) assumptions taken to

calculate and compare occurring costs in the cloud.

Moreover, it is not the intention to assert a certain

type of architecture, rather we want to show how

architectural choices can affect costs more or less

dramatically. We also recognize that changing the

assumed numbers and SLAs could lead to different

costs and ranking of architectures. There are also

further possible architectures that are not discussed.

3.1 The Web Role Approach

A couple of Web Roles (with threads running in it)

receive data from all data providers, no matter of

what organization. Threads of an appropriate

number of Web Roles store data into organization-

specific storages. Thus, every organization has some

cloud storage of its own to maintain its data – not at

last due to security considerations: an organization’s

data must not be accessible by other organizations.

According to the incoming load, more or less Web

Roles can be started, having the IIS load balancer in

front of them.

For the purpose of this paper, we fix some

further system parameters:

• Small compute instances are taken for the Web

Roles.

• We assume that each Web Role can run 10 threads

without system overload. This is a reasonable

number that corresponds to Microsoft’s recom-

mendations (Best Practices, 2011). Our tests have

shown that small Azure compute instances are

already quite busy having 10 threads running.

• Referring to the benchmarks in (MS Extreme

Computing Group, 2011), we assume that storing

data from a Web Role into cloud storage is

typically done in 30ms.

• For client authentication, authorization, and data

pre-processing, some additional 40ms are assumed

at the Worker Role, including a database access

for getting the credentials.

Summing up, this means that the processing of each

incoming storage request in a Web Role has some

70ms compute latency including all storage

accesses. As a consequence, one Web Role thread is

able to handle about 14 requests in a second.

Handling the 1000 incoming items/s (10 data items

per second from 100 data providers) thus requires

minimally 70 threads. According to the assumption

that each Web Role can run 10 threads, 7 Web Roles

with 10 threads each are needed to handle the

requested throughput; otherwise the IIS queue of the

Web Roles will fill up, letting data providers

experience more and more latency. With a constant

load, the IIS queues will never be able to shrink,

which moreover increases the risk of losing data.

According to (Microsoft Extreme Computing

Group, 2011), any Azure storage solution should be

able to handle a write throughput of 1000 items/s

performed by 7 Web Roles with 10 threads.

As stated previously, essential for this paper are

the operational costs of architectures. The monthly

costs for this first solution are as follows:

• The complete inbound traffic to the Web Role is

free of charge; since July 2011.

CHOOSINGTHERIGHTCLOUDARCHITECTURE-ACostPerspective

337

• Seven small Web Roles à 12ct per hour for 30

days cost $604.80.

• Table storage (no removal assumed) with a daily

increase of 82.4 GB (1 GB/month à 14ct) results

in further $178.81 if we consider the worst case

that Azure monitors storage consumption at the

end of a day: 82.4 GB for the 1st day, a total of

2*82.4 GB for the 2nd day etc. sum up to 38316

GB in a month.

• 1000 storage transactions per second lead to

2,592,000,000 per month à 1ct per 10000: $2592.

The total costs are $3375.61. A quick conclusion

shows that transactions produce the main costs; Web

Roles and the storage also affect the costs.

An aspect not yet discussed is access control and

security. Authentication becomes necessary when

working with Web Roles, as those are able to access

all storage components directly. In this architecture,

authorization/authentication can be performed by the

Web Role, which is both an advantage and a

disadvantage: On the one hand, this provides better

flexibility. But on the other hand, an additional

authorization/authentication component is required

that incurs further costs, either for using Azure

Access Control ($1.99 for 100,000 transactions) or

implementing one’s own component. Anyway, the

Web Roles have access to all cloud storages since

they serve all organizations.

Of course, the Web Roles can also perform some

pre-processing, for example, extracting data from

XML input, transforming data, or condensing data.

3.2 Queues at the Front-end

In an alternative architecture, each organization

obtains one dedicated cloud queue at the frontend.

Data providers of each organization then put data

items directly into their respective queue using the

provided REST interface for the queue storage.

Since there is no longer a front-end Web Role,

authorization and authentication becomes an issue: it

must be ensured that a data provider is only allowed

to store in the queue of its organization. In Azure,

the credentials are bound to a storage account, i.e.,

all queue or table storages belonging to the same

account share the same credentials. This implies that

each organization would require an account of its

own as otherwise every DP would inherently get

access to all queues. The quota of five storage

accounts that are granted per Azure account are just

sufficient for our example; otherwise additional

storage accounts would have to be explicitly

requested, however, without any further expenses.

Threads in a Worker Role pick up data items

from the queues and transfer them to cloud storage.

The number of requested Worker Roles (threads)

depends on the time for emptying queues and on the

required timeliness of data in cloud storage. In fact,

the queue length must be close to empty, otherwise

the queue will permanently increase since the

assumed load is constant. If data must be up-to-date

in cloud storage within fractions of a second, more

Worker Roles (threads) are required to perform the

transfer. However, data provider throughput is not

throttled by a too low number of Worker Roles.

There is no risk of data losses since queues are

persistent, but an overflow might become critical.

We assume a queue read latency of 30ms and a

typical storage write latency of 30ms according to

(Microsoft Extreme Computing Group, 2011) for a

total of 60ms. Then, one Worker Role thread is able

to transfer an average of 16.67 items per second; 60

Worker Role threads distributed over 6 Worker

Roles (because of the 1/10 Worker Role/thread ratio)

are required to keep pace with each of the 5 queues

being filled up with 200 items/s. It does not matter

whether Worker Roles are assigned to specific

queues or serve all queues.

Scalability with regard to incoming data is

limited only by the queue throughput. The requested

200 items/s are easily achievable by Azure cloud

queues according to (Microsoft Extreme Computing

Group, 2011). If necessary, more queues could be

set up, e.g., one for each data provider; however, to

note again, the quota for storage accounts is five.

The number of queues and accounts does not affect

the total operational costs as only the queued data

and the transactions are charged but not the number.

The monthly costs for such a queue-based

architecture are computed as follows:

• Incoming requests to the front-end queue are again

for free.

• The background storage costs remain at $191.58.

• The storage transactions for background storage

are also the same $2592 as before.

• There are five newly introduced front-end queues

with each queue getting in average 200 messages

per second. As already mentioned, the Worker

Roles will empty the queues in order to keep pace

with the input stream. But even if there are 10

messages in the queue at any point in time,

requiring 50KB storage (5 queues * 10 KB) over

30 days, results in the micro-costs of 0.0007 ct.

• There are three kinds of inbound and outbound

transactions for the queues, one to read a message,

another to store, and a third one to delete the

message; Azure does not offer a mean to read and

delete with one operation. This means enormous

costs of $7776 = 3 * $2592.

• Six Worker Roles are used each for a price of 12ct

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

338

per hour for 30 days: $518.40.

Concluding the calculation, we quadruple the

transaction costs from $2592 (Subsection 3.1) to

now $10268 with the benefit of reducing 7 Worker

Roles to 6 Web Roles and saving 86.40$ in a month

for computation ($518.40 instead of $604.80). In

addition, there are smallest amounts of costs for

queue storage (0.0007ct). Hence, this architecture

produces costs of $11077.98 per month and is thus

about $7700 more expensive than the previous one.

Technically speaking, this architecture has some

advantages. First, queues allow for more flexible

reactions to load changes. Queues can fill up

(without causing dominating costs) to be emptied at

later points in time, during low load times, if no time

critical data is involved. Consequently, an

interesting alternative to the proposed setting could

have the queues fill up due to fewer Worker Roles,

and use – if cheaper – operating hours at night to

transfer the data items to the backend storage.

Having storage queues filled up does not call upon

the same risks as IIS queues, as storage queues are

persistent. Second, the architecture can rely on

Azure queue authentication as a queue belongs to

only one organization, and the data providers of one

organization can only fill their organization’s queue.

However, authentication becomes less flexible.

As another disadvantage, additional implement-

ation effort is required to set up the Worker Roles in

a multithreaded manner. In contrast, multithreading

is for free in Web Roles because of configurable

instantiation models. Moreover, the transfer has to

be fault-tolerant due to the lack of storage-spanning

transactions, deleting data in the queue and inserting

it into the backend storage. And the implementation

must be able to determine what data from the queue

can be skipped if a crash occurs after transferring to

the backend but before deleting in the queue.

A further disadvantage of this architecture is the

fact that the payload of queue messages cannot

exceed a 64 KB threshold in Azure. Hence, if the

payload is unknown or might increase, a complete

redesign is required: one possibility is to use blobs

for storing data, and to put a reference (URI) to the

blob into the queue. This causes additional storage

and transaction costs for blobs and an additional

delay for data providers due to blob handling.

3.3 Bulk Operations

This architecture is based upon the previous one,

however, attempts to reduce the number of

expensive transactions by means of bulk operations.

Azure provides to this end a mean to build bulks of

operations of the same kind.

At the front-end, there is no opportunity unless

the data providers collect data in bulks and submit

bulks to the queues. However, bulk operations can

be used during the internal processing: a Worker

Role is able to fetch bulks of items from the queue,

to remove them in bulks, and to submit bulks to the

backend storage. This will in fact cause some delay

in processing and lacks a little of less timeliness.

Moreover, some implementation overhead occurs

since it is necessary to wait for complete bulks.

Some fault-tolerance is again required: a Worker

Role might crash while just having cached a bulk.

Data is not lost in that case since queues are

persistent and still contain the data.

Even if the bulk size for queues is limited to 32

at maximum, it is possible to divide the transaction

costs drastically by 32. However, the queue API

offers only the possibility to get data in bulks, but

not to delete bulks. Consequently, the cost for

getting data from the queue can be reduced from

$2592 to $81, but the other transactions stay at

$2592. Moreover, bulk operations are possible for

the backend storage. The table storage offers writing

bulks operations of at most 100 entities and 4MB of

size. This also reduces transaction costs from $2592

to $25.92 for retrieval. Compared to Subsection 3.2

the total transaction costs of $5290.92 (2*$2592 +

$81 + $25.92) remain high.

As an alternative, table storage could be used

instead of queues. It offers bulk operations even for

reads, writes and deletes. The challenge now is to

mimic the queue behavior. One possible way is to

use table storage for each organization and the data

provider’s id as a partition key. Hence, it is easily

and efficiently possible to fetch the eldest 100 data

items for a given data provider (using the timestamp

in a query), to store those items in the backend

storage, and to remove them. In fact, there is some

implementation effort, e.g., to be sure that a bulk of

100 is available in order to avoid polling, and to

coordinate the Worker Role threads, i.e., who is

accessing which table. The transaction costs for the

Worker Role can be divided by 100 from $7776 to

$77.76. This makes the solution a little cheaper than

3.1 since $86.40 for compute instances are saved.

3.4 Direct Access to Cloud Storage

Another approach gets rid of compute instances, i.e.,

Web or Worker Roles, in order to save costs. Data

providers can store their data directly into blobs or

tables; the post-processing applications then access

the data provider’s storage directly.

CHOOSINGTHERIGHTCLOUDARCHITECTURE-ACostPerspective

339

Both blob and table storage are possible in this

type of architecture. However, blob storage has an

important advantage over table storage: it offers

fine-granular security rights. Blobs are stored in

containers and the access rights of each container

can be controlled individually even if the containers

belong to the same account. In contrast, table stores

of the same account share the same credentials.

Hence, blobs are used in the following, each organi-

zation obtaining a container of its own. Note that the

number of containers does not affect the operational

costs.

The throughput depends on the access

capabilities of blobs; the requested throughput of

200 items/s for each organization/container should

be possible. Otherwise, additional accounts or

containers have to be ordered.

The costs in the first month are here as follows:

• The data storage costs remain the same, and sum

up to $191.58.

• The costs for storage transaction are still at $2592.

The conclusion is quickly made. With less than

$2784 operational costs in the first month, this is the

by far cheapest architecture – if applicable. The

major benefit of this architecture is in fact the

reduction of compute instances.

While financially the clear winner so far, techni-

cally this approach brings along several dis-

advantages. First, the backend storage is not shielded

from data providers and the system fully relies on

the authentication of the storage only. Furthermore,

the same storage technology must be appropriate for

both data providers and processing applications at

the backend, but both might have different demands

with regard to throughput or query functionality. If

blob storage (or table storage alternatively) does not

offer the requested functionality for backend

applications, data will have to be transferred into an

alternative cloud storage, which again requires

additional Worker Role(s) and lets become the

architecture similar to Subsections 3.2 or 3.3.

3.5 Load Variations

So far, we have discussed some constant load. We

modify this assumption by assuming the same

overall load per day, however varying over the day.

For example, the load in a typical fleet management

might be higher at 8-9 am and 5-6 pm.

Referring to Subsection 3.1, the IIS queues for

Web Roles fill up during heavy load. The requested

throughput must be handled by setting up additional

Web Roles; the costs should be similar to a constant

load if the data amount and transactions are the same

over the whole day, i.e., there are less Web Roles at

non-peak times. However, we pay a Web Role for

one hour least. Such an hourly rate could produce

higher costs! Moreover, we have to bear in mind the

time for provisioning compute instances.

In Subsection 3.2, the front-end queues fill up,

but no more Worker Roles are required since the

queues are persistent. If there are timeliness

constraints, i.e., if data must be mostly accurate in

the back-end store, additional Worker Roles can

reduce queues. One important question is whether

the throughput of the front-end storage is enough.

Well, there is still the opportunity to react on too

high load with setting up more queues, which

requires much effort if to be performed online.

The same holds for the architecture in 3.3: if the

throughput of the front-end storage is not sufficient,

higher load could be handled with more accounts.

Handling load changes by the number of Web/

Worker Roles, an hourly high load is more positive

than arbitrary load changes since charging is done

for full hours. In this respect, Worker Roles are more

advantageous, because there is a chance of having

less Worker Roles: input throughput can be handled

over a long period of time without corrupting the

required throughput. If the payment model offers a

reduced overnight rate, there will also be a chance of

using Worker Roles over night at less cost.

4 SCENARIO 2: DATA DELIVERY

The second scenario has been originally presented in

(Käfer, 2010b). We suppose a large scale data

delivery service being managed in the cloud: data is

pushed into the system and is maintained in some

central cloud storage. At the front end, customers

expect to obtain their specific data from a cloud-

based delivery service. An example could be found

in logistics where post orders to a wholesale chain

need to be collected, centrally managed and

forwarded to individual suppliers and freight carrier

services. In order to better model this scenario, we

assume the backend storage to be filled once in the

morning by some data provider for the purpose of a

higher throughput. We again postulate some basic

assumptions:

• There are 16000 clients receiving items: 0 items

for 8000 clients (50%), 1 item for 3200 (20%) and

2 items for 3200 (20%), and 5 items for 1600

(10%). This sums up to 17600 items per day

((3200*1 item + 3200*2 items + 1600*5 items).

• Since each item has a payload of 50 KB, a total

daily payload of 880000 KB is produced.

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

340

• Searching one item in the storage takes 300 ms

even if none is found.

• 4000 clients all want to fetch their items at 8 am,

12 am, and at 5 pm; 4000 clients are equally

distributed over the remaining times.

• As an SLA, clients should not wait longer than 1.5

seconds for being served.

4.1 The Web Role Approach

In the first architecture, several Web Role threads

serve the clients: clients queue up in the load

balancer in order to ask a Web Role for data: a Web

Role thread accesses the storage to determine the

data for that client and deliver the data while the

client is waiting. The appropriate number of Web

Roles depends on the number of clients and the

given SLAs to clients.

We first focus on the three peak load times: the

architecture has to serve 4000 clients at each peak

time with clients of four types A to D:

A. 800 clients accessing 1 item (served within 300

ms)

B. 800 clients accessing 2 items (within 600 ms)

C. 400 clients accessing 5 items (within 1500 ms)

D. 2000 clients accessing 0 items (within 300 ms)

At first, we need to calculate the number of Worker

Roles that are required to satisfy the SLA of being

served within 1.5 sec. The number of Worker Roles

obviously depends on the arrival of client. The

lowest number of Worker Roles is required in the

following situation:

• 400 times: a client of type C arrives and is served

in 1500ms; each C client requires an own thread.

• 400 times: a sequence of client types B,D,D,D (the

last client of type D finishes before 1500 ms)

• 400 times a sequence of B,D,D,A

• 80 times a sequence of A,A,A,A,A

This optimal schedule is rather unrealistic because it

usually depends on the arrival and the load balancer.

Even in this best case 1280 Web Role threads

(400+400+400+80) are required all together. This

results in 128 Web Roles with 10 threads each.

The costs can be calculated as follows for each

peak time a day (there are three peaks a day):

• 128 Web Roles: Although the Web Roles are only

required for 1.5 seconds, we have to pay for the

full hour à 12ct/h, i.e., $15.36.

• Storage transactions are required for getting and

deleting data. The costs are 0.8ct (4000 clients * 2

accesses * 1ct/10000).

• Outbound data transfer: 4400 items have to be

delivered at each peak time for 3.15ct (4400 items

* 50 KB * 15ct/GB).

• The backend storage is out of scope here.

Hence, we are charged $15.40 for each of the 3 peak

times, i.e., $46.20 for all peak times. In addition, 1

further Web Role is needed for the remaining non-

peak time of 21 hours:

• The Web Role costs $2.52 (21 hours * 12ct).

• Storage transactions (2 times for get/delete): 0.8ct

(4000 clients * 2 accesses * 1ct/10000).

• Outbound data transfer: $2.97 (4400 items * 50

KB * 15ct/GB = 3.3ct).

The total costs are $51.70 ($46.20 + $5.50) per day.

The major disadvantage lies in the fact that every

client checks periodically for newly received data

even if none has arrived. This produces a lot of load

which in turn requires Web Roles.

4.2 Storage-based Architecture

As a storage-based alternative, we introduce a client-

specific storage: there is one account for each client

in case of a table or queue store; using blob storage

and client-specific containers requires only one

global account for all clients, but client-specific

credentials for containers. Worker Roles fetch data

from the global storage and distribute the data to

those client-specific storages. Clients remove their

data from this storage right after pick up.

The client service time depends on the transport

and access latency for storage. If one blob storage

account is used for all client blobs, 4400 accesses

occur at each peak time. In fact, according to

(Microsoft Extreme Computing Group, 2011), the

available throughput is enough to fulfill the SLA

that the waiting time for being served does not

exceed 1 sec.

The number of Worker Roles and their starting

time is only important to deliver items before each

peak time; obviously, starting Worker Roles earlier

reduces the number of required Worker Roles.

Furthermore, the delivery of messages into the

global storage from outside is important. We here

assume that data delivery has finished before any

client wants to receive his data.

If one Worker Role performs a “full scan” on all

incoming 17600 items once a day and assigns the

items to the client storages, then retrieval takes less

than 88 min (5280 sec = 17600 * 300 ms). If one

Worker Role is started with 10 threads, then the

Worker Role must start 9 minutes before the first

peak time. Afterwards, all the items are distributed.

The number of Worker Roles and threads is

mostly irrelevant, since a Worker Role is paid for

each hour in use according to the payment

conditions. The more threads (or Worker Roles) are

applied, the later processing can start, which is of

CHOOSINGTHERIGHTCLOUDARCHITECTURE-ACostPerspective

341

few benefit only. However, a strategy to define

which thread searches what becomes necessary.

Let us now calculate the overall costs for one

day, assuming that client storages are filled in one

step once a day, being ready for the first client.

• The number of Worker Roles is mostly irrelevant;

it determines just when to start delivery at the

latest: 1 Worker Role with 10 threads is enough

and costs 12ct a day.

• Additional storage costs arise for the client

storages from the beginning to the point a client

fetches items. A precise calculation is unnecessary

since even the worst case of keeping the client

items the whole day is ignorable: storing 880000

KB to be delivered in the client storage for one

day costs less than 14ct a month and 0.5ct per day.

• The additional outbound traffic from the backend

store to the distributing Worker Role is for free, if

the storage and the Worker Role are in the same

data center region.

• The daily transactions (17600 gets and deletes à

1ct/10000 for the backend store, 16000 read

accesses for storage in client storage) cost 3.36ct.

• The outbound data transfer from the client storage

to the client costs 12.58ct (880000 KB * 15ct/GB)

a day.

As a conclusion, there is an enormous cost reduction

since much less Worker Roles are used than in

Subsection 4.1: this type of architecture produces

only 17ct per day instead of $51.

A variant of this architecture could transfer data

by the Worker Role threads before each peak time:

then less storage costs are consumed, but three

Worker Roles are required per day. Hence, there is

no benefit because of cheaper storage prices and

more expensive Worker Roles. This approach might

become useful if data will be delivered to the global

storage several times a day.

5 RELATED WORK

A number of researchers have investigated the

economic issues around cloud computing from a

consumer and provider perspective. Indeed,

(Armbrust, 2010) identifies short-term billing as one

of the novel features of cloud computing. And

(Khajeh-Hosseini, 2010) considers costs as one im-

portant research challenge for cloud computing. But

only little research has been done in this direction.

(Youseff, 2008) discusses three pricing models

that are used by cloud service providers: with tiered

pricing, different tiers each with different

specifications (e.g. CPU and RAM) are provided at a

different cost per unit time. A large tier machine has

better equipment but also has higher costs. Per-unit

pricing is based upon exact resource usage; for

example $0.15 per GB per month. Finally,

subscription-based pricing is common in SaaS

products such as Salesforce's Enterprise Edition

CRM that charges each user per month.

(Walker, 2009) performs cost comparisons

between cloud and on-premises. He states that lease-

or-buy decisions have been researched in economics

for more than 40 years. Walker compares the costs

of a CPU hour when it is purchased as part of a

server cluster, with when it is leased. Considering

two scenarios – purchasing a 60000 core HPC

cluster and purchasing a compute blade rack

consisting of 176 cores – the result was that it is

cheaper to buy than lease when CPU utilization is

very high (over 90%) and electricity is cheap. The

other way around, cloud computing becomes

reasonable if CPU utilization is low or electricity is

expensive. Walker focuses only on the cost of a

CPU hour. To widen the space, further costs such as

housing the infrastructure, installation and

maintenance, staff, storage and networking must be

taken into account as well.

(Klems, 2009) also addresses the problem of

deciding whether deploying systems in a cloud

makes economic sense. He discusses some economic

and technical issues that need to be considered when

evaluating cloud solutions. Moreover, a framework

is provided that could be used to compare the costs

of using cloud computing with an in-house IT

infrastructure. Unfortunately, two presented case

studies are more conceptual than concrete.

(Assuncao, 2009) concentrates on a scenario of

using a cloud to extend the capacity of locally

maintained computers when their in-house resources

are over-utilized. They simulated the costs of using

various strategies when borrowing resources from a

cloud provider, and evaluated the benefits by using

performance metrics such as the Average Weighted

Response Time (AWRT) (Grimme, 2008), i.e., the

average time that user job-requests take to complete.

However, AWRT might not be the best metric to

measure performance improvement.

(Kondo, 2009) examines the performance trade-

offs and monetary cost benefits of Amazon AWS for

volunteered computing applications of different size

and storage.

(Palankar, 2008) uses the Amazon data storage

service S3 for scientific intensive applications. The

conclusion is that monetary costs are high because

the service covers scalability, durability, and

performance, which are often not required by data-

intensive applications. In addition, (Garfinkel, 2007)

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

342

conducts a general cost-benefit analysis of clouds,

however, without any specific application.

(Deelman, 2008) highlights the potentials of

using cloud computing as a cost-effective

deployment option for data-intensive scientific

applications. They simulate an astronomic

application named Montage and run it on Amazon

AWS. Their focus was to investigate the

performance-cost tradeoffs of different internal

execution plans by measuring execution times,

amounts of data transferred to and from AWS, and

the amount of storage used. Unfortunately, the cost

calculation is not precise enough because of the

assumption that the cost of running instances on

AWS EC2 is calculated on a per-CPU-second basis.

However, AWS charge on a per-CPU-hour basis:

launching 10 instances for 1 minute would cost 10

CPU hours (not 10 CPU minutes) on AWS. They

found the cost of running instances (i.e. CPU time)

to be the dominant figure in the total cost of running

their application. Another study on Montage

(Berriman, 2010) concludes that the high costs of

data storage, data transfer and I/O in case of an I/O

bound application like Montage makes AWS much

less attractive than a local service.

(Kossmann, 2010) presents a web application

according to the TPC-W benchmark with a backend

database and compares the costs for operating the

web application on major cloud providers, using

existing relational cloud databases or building a

database on top of table or blob storages.

Hence, (Deelman, 2008) and (Kossmann, 2010)

are two studies that take roughly our direction.

6 RECOMMENDATIONS

We want to present a couple of recommendations

that we derived from our investigation.

While a couple of papers such as (Deelman,

2008) have identified compute instances as the

dominating cost factor, we have seen in the first

scenario that transactional costs cannot be neglected.

Here, bulk transactions could help, if applicable.

Indeed, compute instances are quite expensive,

however, if compared to storage. Hence, one should

try to minimize the number of compute instances, to

allocate only instances when really needed, and to

stop compute instances that are no longer needed.

Strategies to adopt the number according to the

recent load can help to react on varying load. But

caution has to be taken since collecting performance

and diagnosis data produces additional storage and

transaction costs. Note also that stopped compute

instances cause the same costs as running instances:

an instance should be deleted to avoid running costs

while still retaining the service URL.

It is also important that the costs for the staging

area are the same as for the production environment.

Hence, one should not forget to delete staging

deployments between or after test phases.

Acquired resources should be used efficiently.

For example, it is possible to run several web sites

and web applications in one Web Role thanks to full

IIS support. The more cores a compute instance has

(determined by the instance category), the more

parallel work a Web Role can handle. However, in

most cases, it is better to use smaller instance

categories such as XS or S: smaller instances offer a

better scaling granularity while costs scale in a linear

manner. In fact, there are also scenarios where a

higher equipment such as 8 CPUs (XL), larger main

memory, or bandwidth is reasonable, e.g., to allow

multi-core programming to a larger extent.

If the load seems to be quite constant, some

special offers such as a Subscription Offer

(http://www.Microsoft.com/windowsazure/offers),

enterprise agreements or long-term subscriptions

might be a choice to save costs.

7 CONCLUSIONS

Based on Microsoft’s Windows Azure platform

offering, we have argued in this paper for the

importance of taking operational costs into account

when designing the architecture for a cloud-based

offering. Given examples have shown the impact

particular design decisions have on the cost of cloud

applications; this is the important message of the

paper. However, the paper by no means values one

architecture decision over another but emphasizes

the importance of considering the use and type of

storage, compute instances and communication

services already at early stage, in particular with

respect to their impact on the operational costs.

The different architectural approaches and the

resulting overall costs that at least partly diverge

significantly reveal one important problem when it

comes to migrating software to the cloud; the many

dimensions of design decisions, and certainly the

many dimensions of the pricing models. The pricing

ladders not only differ between different providers in

terms of charging units, special offers and free

services, but already within the offers of single

companies, cf. davidpallmann.blogspot.com on

August 14, 2010: “The #1 hidden cost of cloud

computing is simply the number of dimensions there

CHOOSINGTHERIGHTCLOUDARCHITECTURE-ACostPerspective

343

are to the pricing model. In effect, everything in the

cloud is cheap but every kind of service represents

an additional level of charge. To make it worse, as

new features and services are added to the platform

the number of billing considerations continues to

increase”. In other words, there is much more to

cost-effective architecture design than choosing the

number and size of compute instances, or deleting

stopped staging deployments when not used.

While we have supported our arguments with

tangible examples and experiences we have gained

from working with Windows Azure, there is further

work required towards concrete guidelines and best

practices in cost-effective architecture design for the

cloud; also taking into account further features such

as Azure AppFabric Cache, special long-term

subscriptions, and other cloud offerings such as for

example Amazon AWS or Google AppEngine.

REFERENCES

Armbrust, M., Fox, A., Griffith, R., Joseph, A., Katz, R.,

Konwinski, A., Lee, G., Patterson, D., Rabkin, A.,

Stoica, I. and Zaharia, M. (2010): A View of Cloud

Computing. CACM, 53(4), April 2010.

Assuncao, M., Costanzo, A. and Buyya, R. (2009).

Evaluating the cost-benefit of using cloud computing

to extend the capacity of clusters. In HPDC '09: Proc.

of 18th ACM int. symposium on High performance

distributed computing, Munich, Germany, June 2009.

Calder, B. (2010). Understanding Windows Azure Storage

Billing – Bandwidth, Transactions, and Capacity.

http://blogs.msdn.com/b/windowsazurestorage/archive

/2010/07/09/understanding-windows-azure-storage-

billing-bandwidth-transactions-and-capacity.aspx.

Berriman, B., Juve, G., Deelman, E., Regelson, M. and

Plavchan, P. (2010). The Application of Cloud

Computing to Astronomy: A Study of Cost and

Performance. 6th IEEE Int. Conf. on e-Science.

Deelman, E., Singh, G., Livny, M., Berriman, B. and

Good, J. (2008). The cost of doing science on the

cloud: the Montage example. In SC '08: Proceedings

of the 2008 ACM/IEEE conference on

Supercomputing, Oregon, USA, November 2008.

Garfinkel, S. (2007). Commodity Grid Computing with

Amazon S3 and EC2. In login 2007.

Greenberg, A., Hamilton, J., Maltz, D. and Patel, P.

(2009). The Cost of a Cloud: Research Problems in

Data Center Networks. ACM SIGCOMM Computer

Communication Review, 39, 1.

Grimme, C., Lepping, J. and Papaspyrou, A. (2008).

Prospects of Collaboration between Compute

Providers by means of Job Interchange. In

Proceedings of the 13th Job Scheduling Strategies for

Parallel Processing, April 2008, Lecture Notes in

Computer Science (LNCS), 4942.

Hamdaqa, M., Liviogiannis, L. and Tavildari, L. (2011): A

Reference Model for Devloping Cloud Applications.

Int. Conf. on Cloud Computing and Service Science

(CLOSER) 2011.

Hoff, T. (2009). Cloud Programming Directly Feeds Cost

Allocation Back into Software Design. Blog on

HighScalability.com, March 6, 2009.

Käfer, G. (2010a): Cloud Computing Architecture. SEI

Architecture Technology User Network Conf

(SATURN) 2010. http://www.sei.cmu.edu/library/

assets/presentations /Cloud Computing Architecture -

Gerald Kaefer.pdf

Käfer, G. (2010b): Cloud Computing Architecture – How

to reconcile business, technical, and legal

requirements. CloudConf 2010. http://cdn1.hlmc.de/

tl_files/cloudconf/Downloads/Downloads 17.11.2010 /

Cloud Computing Architektur.pdf

Khajeh-Hosseini, A., Sommerville, I. and Sriram, I.

(2011). Research Challenges for Enterprise Cloud

Computing. 1st ACM Symposium on Cloud

Computing, SOCC 2010, Indianapolis.

Klems, M., Nimis, J. and Tai, S. (2009). Do Clouds

Compute? A Framework for Estimating the Value of

Cloud Computing. Designing E-Business Systems.

Markets, Services, and Networks, Lecture Notes in

Business Information Processing, 22.

Kondo, D., Javadi, B., Malecot, P., Cappello, F. and

Anderson, D. P. (2009). Cost-benefit analysis of Cloud

Computing versus desktop grids. In Proc. of the 2009

IEEE international Symp. on Parallel&Distributed

Processing, May 2009.

Kossmann, D., Kraska, T. and Loesing, S. (2010). An

Evaluation of Alternative Architectures for Trans-

action Processing in the Cloud. ACM SIGMOD 2010

Kruchten, P. (1995). Architectural Blueprints – The

“4+1” View Model of Software Architecture. IEEE

Software 12 (6), November 1995.

Microsoft Extreme Computing Group (2011): All Azure

Benchmark Test Cases. Website: http://azurescope

.cloudapp.net/BenchmarkTestCases/.

Microsoft (2011). Best Practices for Developing on

Windows Azure. http://azurescope.cloudapp.net/

BestPractices.

Pace, E., Betts, D., Densmore, S., Dunn, R., Narumoto,

M., and Woloski M. (2010). Moving Applications to

the Cloud on the Microsoft Azure™ Platform.

Microsoft Press, August 2010.

Palankar, M., Iamnitchi, A., Ripeanu, M. and Garfinkel, S.

(2008). Amazon S3 for Sciene Grids: A Viable

Solution? In: Data-Aware Distributed Computing

Workship (DADC), 2008.

Varia, J. (2010). Architecting for the Cloud: Best

Practices. Amazon Web Services, January 2010-2011.

Walker, E. (2009). The Real Cost of a CPU Hour.

Computer, 42, 4.

Youseff, L., Butrico, M. and Da Silva, D. (2008). Toward

a Unified Ontology of Cloud Computing. In Grid

Computing Environments Workshop (GCE '08),

Austin, Texas, USA, November 2008.

CLOSER2012-2ndInternationalConferenceonCloudComputingandServicesScience

344