FROM QoD TO QoS

Data Quality Issues in Cloud Computing

Przemyslaw Pawluk, Marin Litoiu and Nick Cercone

York University, Toronto, ON, Canada

Keywords:

Cloud computing, Quality of data (QoD), Quality of service (QoS), Quality of control (QoC).

Abstract:

The concept of Quality of Data (QoD) has so far been neglected in the context of cloud computing. It has, however, long been explored in the context of data exchange, data integration and information systems. Well-established approaches such as Total Data Quality Management, Data Warehouse Quality and Data Quality in Cooperative Information Systems have been proposed to calculate, store and maintain information about QoD. The concept of Quality of Service (QoS), on the other hand, has been investigated in the context of Internet systems, multimedia transmission and enterprise systems, and also in connection with cloud computing. The main goal of this work is to show the direct connection between QoD and QoS. We show that assuring high QoD is necessary to achieve high QoS. We also identify major shortcomings of public cloud vendors in terms of the configuration management data they provide.

1 INTRODUCTION

Cloud computing refers to computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the system that delivers the services. Cloud computing is a natural evolution of the widespread adoption of virtualization, service-oriented architecture, and autonomic and utility computing (Vouk, 2008; Lim et al., 2009). Details are abstracted from end-users, who no longer need expertise in, or control over, the technology infrastructure “in the Cloud” that supports them. It does, however, involve a certain level of control over virtual instances, and this control requires high-quality information about the system state. Virtual computing services are becoming attractive for several reasons, including adaptability, dynamic behavior and price. Cloud computing raises several research problems of special interest. Here we discuss provenance (the analysis of the history of data), trusted computing, and automation of Cloud control.

The Configuration Management Database (CMDB) provides a common trusted source for all IT data used by the business and promises to improve IT operational efficiency and increase alignment between the business and IT while reducing costs (EMA, 2008). The CMDB can also be used to support Cloud management.

Data quality problems occur along the entire data processing continuum. Data preparation is crucial and consists of several necessary operations such as cleaning and normalizing data, handling noisy, uncertain or untrustworthy information, handling missing values, and transforming and coding data so that it becomes suitable for the data mining process. These methods are based on statistics and heuristics. The concept and importance of quality of data have been discussed many times in the literature (Ballou and Pazer, 1985; Batini and Scannapieco, 2006; Tupek, 2006; Wang and Strong, 1996), usually in the context of a single data source. However, some research has also been done in the context of integrated data, emphasizing the importance of data quality assurance in that setting (Gertz and Schmitt, 1998; Naumann, 2002; Reddy and Wang, 1995). Data quality has also been considered in the context of data mining (Berti-Equille, 2007; Dasu and Johnson, 2003). As pointed out by Berti-Equille (Berti-Equille and Moussouni, 2005), the validity of the interpretation of results relies strongly on the data preparation process and on the quality of the data set being analyzed. This is because methods such as data mining assume certain properties of the data, e.g. a “nice” distribution.

The data quality problem has so far been neglected in the context of Cloud computing. The authors are not aware of any extensive work on this subject; however, some discussion can be found on Internet forums and blogs (Harzog, 2010; Vambenepe, 2010; Row, 2010). Those discussions point out some important aspects of data quality.

Pawluk P., Litoiu M. and Cercone N.
FROM QoD TO QoS - Data Quality Issues in Cloud Computing.
DOI: 10.5220/0003558606970702
In Proceedings of the 1st International Conference on Cloud Computing and Services Science (IDQ-2011), pages 697-702
ISBN: 978-989-8425-52-2
© 2011 SCITEPRESS (Science and Technology Publications, Lda.)

In this work we discuss several data quality concerns and issues identified in the context of Cloud computing and the CMDB. The remainder of this work is structured as follows: Section 2 presents an overview of research on Quality of Data, and Section 3 describes the concept of Cloud computing. Section 4 presents the link between QoD and QoS, and Section 5 describes the current situation in public Clouds.

2 WHAT IS DATA QUALITY?

Many different definitions of Quality of Data (data quality, QoD) can be found in the literature. Researchers do not agree on one common definition of QoD and provide many essentially different ones. This lack of a common definition leads to defining QoD through dimensions: better-defined metrics that enable us to measure and compare particular features of data sets. However, this definition by dimensions also falls short, since the same dimension may be understood differently, or the same feature may be named differently, by two researchers. This problem has been noted by Wang and Strong (Wang and Strong, 1996). A commonly used definition (Tayi and Ballou, 1998; Wang and Strong, 1996; Orr, 1998) defines quality as “fitness for use”, which implies the relative nature of the quality concept. As stated in (Orr, 1998), the understanding of quality depends strongly on how users actually use the data in the system, since they are the ultimate judges of quality. There is no common standard of QoD; however, there is an ISO document, ISO8402:1995 Quality Management and Quality Assurance Vocabulary (ISO, 1994), and its newer version, ISO9000:2005 Quality management systems – Fundamentals and vocabulary (ISO, 2005). The former provides a formal definition of quality as: “The totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs” (ISO, 1994). It is clear that the main authority in terms of QoD is the user, whose requirements are the main guidelines for defining and measuring QoD.

2.1 Data Quality Dimensions

Data quality is often defined through quality dimensions (sometimes called quality factors). We use the words dimension and factor interchangeably in the remainder of this work. Some publications identify more than a hundred different dimensions (Wang and Strong, 1996). We do not discuss all QoD factors in this work; rather, we concentrate on those factors that can be applied in the context of Quality of Service. A comprehensive discussion of different quality dimensions can be found in (Wang and Strong, 1996) and (Batini and Scannapieco, 2006).

2.1.1 Data Decay – Time Related Factors

A subset of quality factors is directly correlated with time. This characteristic is intuitive and such factors are easy to identify. The only problem is that different meanings of time-related terms have been proposed in the literature. For example, Naumann (Naumann, 2002) defines timeliness as the average age of the data in a source. In other sources, timeliness “refers to the length of time between the reference period of the information and when we deliver the data product to our customers” (Tupek, 2006).

Segev (Segev and Fang, 1990) defines currency as the time interval between extraction and delivery. Currency in this form has been named timeliness by Wang (Wang and Strong, 1996). This definition is, in our opinion, the best fit for Cloud systems, where it may be seen as the delay between consecutive readings of the Cloud state.
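As a minimal sketch of this reading of currency, the delay between consecutive readings can be computed directly from their timestamps. The function names, the sample timestamps and the 30-minute freshness bound below are illustrative assumptions, not part of any vendor API.

```python
from datetime import datetime, timedelta


def reading_delays(timestamps):
    """Return the delays between consecutive readings, oldest first."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def is_fresh(last_reading, now, max_age):
    """A reading counts as fresh if it is no older than max_age (assumed bound)."""
    return now - last_reading <= max_age


# Hypothetical timestamps of three readings of the Cloud state.
readings = [
    datetime(2011, 5, 1, 12, 0),
    datetime(2011, 5, 1, 12, 30),
    datetime(2011, 5, 1, 13, 15),
]

delays = reading_delays(readings)
worst_delay = max(delays)
fresh_now = is_fresh(readings[-1], datetime(2011, 5, 1, 13, 20), timedelta(minutes=30))
```

The largest delay gives a simple worst-case bound on how old the information behind a control decision may be.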

2.1.2 Accuracy

Accuracy is included in most data quality studies as a key factor (Parssian et al., 2002; Batini and Scannapieco, 2006; Wang et al., 2005; Ballou and Pazer, 1985; Gertz and Schmitt, 1998). Although the term has an intuitive appeal, there is no commonly accepted definition of what it means exactly (Wand and Wang, 1996). Ballou and Pazer (Ballou and Pazer, 1985) describe accuracy as “the recorded value is in conformity with the actual value.” Kriebel (Kriebel and Moore, 1982) characterizes accuracy as “the correctness of the output information.” Accuracy in this case is thus viewed as equivalent to correctness.

In (Batini and Scannapieco, 2006) accuracy is defined as “the closeness between a value v and a value v′, considered as the correct representation of the real-life phenomenon that v aims to represent.” A simple example is the name of the city 'Toronto': the value v = 'Tronto' is incorrect (inaccurate) and 'Toronto' is correct (accurate).
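One possible way to quantify the “closeness” between v and v′ for string values is an edit distance. The sketch below uses the Levenshtein distance; choosing this particular measure is an assumption made here for illustration, since the quoted definition does not prescribe one.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


# The recorded value is one inserted character away from the correct one.
distance = edit_distance("Tronto", "Toronto")
```

A distance of zero means the recorded value matches the reference value exactly; larger distances indicate lower accuracy.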

CLOSER 2011 - International Conference on Cloud Computing and Services Science

Accuracy can also be seen as an “error bar”, in other words the error of a measurement. In Cloud computing, management is based on aggregated values of readings. Accuracy can in such a case be expressed as the standard deviation of the sample. This is only a suggestion; the problem requires further investigation and we do not discuss it further in this paper.
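The standard-deviation suggestion can be sketched as follows. The CPU-load values are made-up sample readings, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import statistics


def summarize_readings(readings):
    """Return the aggregated value (mean) and its 'error bar'
    (sample standard deviation) for a window of readings."""
    mean = statistics.mean(readings)
    spread = statistics.stdev(readings)  # needs at least two readings
    return mean, spread


# Hypothetical CPU-load readings (percent) within one monitoring window.
cpu_load = [40.0, 42.0, 38.0, 41.0, 39.0]
mean_load, error_bar = summarize_readings(cpu_load)
```

A wide error bar signals that the aggregated value hides considerable variation and should be treated with more caution by the controller.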

2.1.3 Completeness

Completeness as defined by Naumann (Naumann, 2002) coincides with nullability and is “the quotient of the number of non-null values in a source and the size of the universal relation.” That is, the fewer null values in the relation, the higher its completeness.
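The quotient can be illustrated with a short sketch. It assumes, for simplicity, that the size of the universal relation is the number of rows times the number of attributes of interest; the attribute names and records are hypothetical.

```python
def completeness(rows, attributes):
    """Fraction of non-null cells over all cells (rows x attributes).
    Missing keys and None values both count as null."""
    total_cells = len(rows) * len(attributes)
    if total_cells == 0:
        return 0.0
    non_null = sum(
        1 for row in rows for name in attributes if row.get(name) is not None
    )
    return non_null / total_cells


# Hypothetical monitoring records with two missing values.
records = [
    {"instance_id": "i-1", "cpu": 40.0, "memory": 512},
    {"instance_id": "i-2", "cpu": None, "memory": 1024},
    {"instance_id": "i-3", "cpu": 55.0},
]
score = completeness(records, ["instance_id", "cpu", "memory"])
```

Here seven of the nine cells are filled, so the score is 7/9.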

In (Wang and Strong, 1996) completeness is defined as “the extent to which data are of sufficient breadth, depth, and scope for the task at hand.” Bobrowski (Bobrowski et al., 1998) takes it to mean that every fact of the real world is represented in the information system.

The problem of defining completeness in the context of a Cloud system requires further investigation. We have to define what it means for our system representation (variable set) to be complete.

3 CLOUD COMPUTING

Cloud computing refers to computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the system that delivers the services. Cloud computing is the next step in the evolution of the widespread adoption of virtualization, service-oriented architecture, and autonomic and utility computing (Vouk, 2008; Lim et al., 2009). Details are abstracted from end-users, who no longer need expertise in, or control over, the technology infrastructure “in the Cloud” that supports them. Virtual computing services are becoming attractive for several reasons, including adaptability, dynamic behavior and price. Cloud computing raises several research problems of special interest. Here we discuss different dimensions of QoD.

Cloud computing uses remote virtual servers for storage and for all processing of data. Data quality therefore becomes one of the primary requirements, and administrators should address this aspect before deciding on a Cloud computing vendor. So far, only Quality of Service (QoS) has been considered for Clouds. QoS refers to a broad collection of networking technologies and techniques. The goal of QoS is to provide guarantees on the ability of a network to deliver predictable results. Elements of network performance within the scope of QoS often include availability (uptime), bandwidth (throughput), latency (delay), and error rate. Clouds fall into one of the following three types of system (Vaquero et al., 2008):

• Software as a Service (SaaS), defined as a provider supplying remotely run software packages on a utility-based pricing model (e.g. online text editors or spreadsheets).

• Platform as a Service (PaaS), defined as a provider offering an additional layer of abstraction above the virtual infrastructure. PaaS offers built-in scalability, traded off against some restrictions on the software that can be deployed.

• Infrastructure as a Service (IaaS), defined as a provider provisioning compute and storage resource capacity through virtualization. IaaS allows physical resources to be assigned and split dynamically.

The three types of Cloud system form layers, in the sense that a higher layer can be deployed on, and utilize the features of, the lower layers (Armstrong and Djemame, 2009).

4 QoD AND QoS

Quality of data can be considered on two distinct levels in a Cloud computing environment. The first level, probably the most obvious, is the quality of the data deployed into the Cloud (customers' data). This data has to be processed in a consistent way. Data quality indicates the degree of excellence of the data: its completeness, validity, and accuracy, which enable it to serve further functions. This in turn enables the user to obtain the information required for operational purposes or to assist in decision making and planning. Results need to be reliable and correct, and only data of high quality can produce such results. In essence, if you choose Cloud computing, the data needs to be accurate and in reliable formats. Ideally, the Cloud infrastructure should not interfere with data on this level and, unless explicitly required to, should leave quality assurance to customers.

The other level of quality in the Cloud is the quality of internal data such as configuration management records (CMR) or simply measurements of resource usage. This data can be regarded as metadata describing the Cloud system. On this level, the goal is to assure high QoS, and we will see that this can be done through high QoD. Let us now consider how QoD and QoS interact in the context of Cloud computing. To do that, we have to analyze the important aspects of QoS and the techniques used to achieve them, and identify how QoD impacts those techniques.


4.1 Dimensions of QoS

In characterizing the QoS of activities, it is necessary to identify dimensions along which QoS can be measured and quantified. In this work we consider QoS from the perspective of a service provider. The meaning of the listed dimensions may change when they are considered from the point of view of other actors (e.g. the end-user). It is useful to group QoS dimensions into QoS categories, where each category contains dimensions pertaining to some logically identifiable aspect of QoS. Campbell (Campbell, 1996) distinguishes the following categories:

• system reliability – contains system-related reliability dimensions (e.g. MTBF, MTTR)

• timeliness – contains dimensions relating to the end-to-end delay of a data flow

• volume – contains dimensions that refer to the throughput of data in a flow

• criticality – relates to the assignment of relative priority levels between activities

• quality of perception – is concerned with dimensions such as screen resolution or sound quality

• cost – understood as the fee paid by a service provider to the Cloud vendor

Let us now take a closer look at those categories and see whether they depend on data quality. In some cases the dependency seems obvious; in others it is not visible at first glance and requires deeper investigation.

4.1.1 Timeliness

The timeliness category contains dimensions relating to the end-to-end delay of a data flow. Such delay depends on several aspects of the system deployed in the Cloud. One of those factors is the number of running instances. This number can be changed based on information about system load and resource usage provided by the Cloud vendor. If that information is not fresh (up to date), it is impossible to make just-in-time decisions, and delayed decisions can easily lead to lower QoS. In particular, the number of active instances may not be sufficient to meet the assumed response time and to satisfy end-users' expectations.
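To illustrate how a stale load reading can leave a system under-provisioned, consider the small sketch below. The per-instance capacity, the request rates and the function name are assumptions chosen purely for illustration.

```python
import math


def instances_needed(request_rate, capacity_per_instance):
    """Number of instances required to serve the given request rate,
    never fewer than one."""
    return max(1, math.ceil(request_rate / capacity_per_instance))


CAPACITY = 4000       # assumed requests one instance can handle
stale_rate = 3500     # load reported by an outdated reading
current_rate = 7200   # load the system is actually experiencing

planned = instances_needed(stale_rate, CAPACITY)
required = instances_needed(current_rate, CAPACITY)
shortfall = required - planned
```

A positive shortfall means the decision based on the stale reading leaves the system short of instances, which is where response times begin to suffer.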

4.1.2 Cost

The cost category refers to the cost of processing and usage. Cost in a public Cloud depends, among many other factors, directly on the number of active instances. This number can be decreased when the load is lower. Clearly, the freshness (currency, timeliness) of the load information provided by the Cloud vendor strongly impacts the ability to make valid decisions.

Amazon EC2 pricing lists the default Windows instance at $0.12 per hour. The default policy also states that “pricing is per instance-hour consumed for each instance, from the time an instance is launched until it is terminated. Each partial instance-hour consumed will be billed as a full hour” (Amazon, 2011a). Amazon EC2 offers monitoring with a time window of 30 minutes. How does this influence the cost of computing? Consider the following case. An application is deployed on the Amazon EC2 public Cloud and uses up to three instances. Additional instances are launched when the request rate reaches the following thresholds: 4,000 for the second instance and 7,000 for the third instance. Instances are terminated when the request rate drops below 6,000 and 3,000 for the third and second instance, respectively. Because load information may be up to 30 minutes old, a termination decision can arrive after a new instance-hour has already begun, and that hour is billed in full. If this costs one extra instance-hour every day, we lose 30 ∗ $0.12 = $3.60 every month, or about $44 each year. This calculation is for the smallest instance offered by Amazon EC2, and the cost can be higher depending on the configuration (for high-CPU on-demand instances it can be as high as $434 per year in the same scenario).
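The arithmetic behind these figures can be reproduced with a few lines. The sketch assumes one extra billed instance-hour per day, which is how the 30 ∗ $0.12 figure reads; the function name is illustrative.

```python
HOURLY_PRICE = 0.12  # USD per instance-hour, as quoted in the text


def wasted_cost(extra_hours_per_day, days, hourly_price=HOURLY_PRICE):
    """Cost of instance-hours billed only because termination came late."""
    return round(extra_hours_per_day * days * hourly_price, 2)


monthly_loss = wasted_cost(1, 30)    # one extra hour a day for 30 days
yearly_loss = wasted_cost(1, 365)    # the same pattern over a full year
```

The yearly value comes out just under $44, in line with the estimate above.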

On the other hand, each message (measurement) sent has a certain cost. In the case of Amazon EC2 it is $0.008 per message. This does not seem much; however, measuring a single variable hourly costs about $70 per year, and increasing the frequency to one measurement every thirty minutes costs about $140 per year. There is therefore a trade-off that needs to be made based on the dynamism of the system.
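The monitoring cost can be checked the same way. The sketch below uses the per-message price quoted in the text; the function and parameter names are assumptions for illustration.

```python
PRICE_PER_MESSAGE = 0.008  # USD per measurement message, as quoted in the text


def yearly_monitoring_cost(interval_minutes, variables=1, price=PRICE_PER_MESSAGE):
    """Yearly cost of sending one message per variable every interval_minutes."""
    messages_per_year = (60 / interval_minutes) * 24 * 365 * variables
    return messages_per_year * price


hourly_cost = yearly_monitoring_cost(60)       # one reading per hour
half_hourly_cost = yearly_monitoring_cost(30)  # one reading every 30 minutes
```

Halving the interval doubles the cost, so the measurement frequency has to be weighed against the savings that fresher data makes possible.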

There are, of course, methods for acting proactively in such a situation. One can analyze the trend to predict the time when a certain instance should be terminated; however, even these methods need up-to-date information to make accurate predictions. The accuracy of the predictions in turn impacts the cost and/or performance of the system.

4.1.3 Other Dimension Categories

The system reliability category contains system-related reliability dimensions such as Mean Time Between Failures (MTBF) or Mean Time To Recovery (MTTR). The volume category contains dimensions that refer to the throughput of data in a flow. The criticality category relates to the assignment of relative priority levels between activities. Quality of perception is a category concerned with dimensions such as screen resolution or sound quality, and refers to user perception. A more detailed discussion of QoS dimensions can be found in the ISO/IEC standard “Information technology – Quality of service: Framework” (ISO/IEC, 1998).

4.2 Why is it Important?

We have shown that certain dimensions of quality of data have a significant impact on some dimensions of quality of service. At this point we would like to summarize our point of view.

We have shown examples of the influence of certain dimensions of QoD, such as freshness and accuracy, on some dimensions of QoS. We claim that the interconnection between QoD and QoS can be utilized to improve Quality of Control (QoC) (Marti et al., 2002). This metric allows us to measure how well and how fast the system can react to certain events, for example how fast a new instance can be started to handle an excessive number of requests. For lack of space, we do not discuss this concept in detail.

5 HOW DOES IT WORK IN PUBLIC CLOUDS?

In this section we present the current landscape of the public Cloud market. Our goal is to show what is currently provided by Cloud vendors. There are four main commercial providers of Cloud services on the market.

Amazon was the first company to supply Cloud infrastructure, early in 2006. Amazon Web Services (Amazon, 2011b) provides PaaS on a pay-per-use basis. It offers two products, the Amazon Elastic Compute Cloud (EC2) and the Amazon Simple Storage Service (Amazon S3), as well as a set of APIs. In the pay-per-use model Amazon charges for the time an instance is active. Additional costs apply for messages (state of the system), storage, etc.

Another provider of Cloud services is Google. Google provides SaaS through its Google Apps software (Google, 2011b) and PaaS via Google App Engine (Google, 2011a). Google App Engine provides the architecture that Google Apps runs on. Google also uses a pay-per-use economic model, charging the service provider per application and per user.

IBM provides PaaS based on APIs created by Amazon, known as IBM's Research Compute Cloud (IBM, 2011a). IBM also provides IBM Computing on Demand (IBM, 2011b), which is aimed at enterprise Cloud computing. IBM uses an economic model similar to Amazon's and charges per hour of usage. Prices vary depending on the operating system and virtual server configuration.

Microsoft does not provide Cloud services itself. The company is, however, developing the Azure Service Platform (Azure, 2011). Azure is a PaaS operating system that incorporates many of Microsoft's packages. It can be utilized by licensed Cloud vendors as an all-in-one Cloud software solution. In this case charges are also calculated per hour using the pay-per-use model.

6 CONCLUSIONS AND FUTURE WORK

In this work we pointed out important data quality issues arising in the area of Cloud computing and their effects on certain dimensions of QoS. This correlation between QoS and QoD requires deeper investigation. It is clear, however, that data quality assurance is necessary to achieve high quality of service.

We have shown intuitive examples of the correlation between different dimensions of QoS and QoD, such as cost (QoS) and freshness (QoD), and timeliness (QoS) and freshness (QoD). We are going to investigate this issue in depth.

This work shows the interconnection between QoS and QoD dimensions in an informal way and does not provide quantitative methods of assessment. Our future work will concentrate on formalizing this connection and providing quantitative methods for its assessment. We want to map data quality dimensions onto quality of service dimensions and design functions that model those mappings mathematically.

Our long-term goal is to provide a model combining quality of data and quality of service, while at the same time improving quality of control by enabling just-in-time decisions and reducing settling time. To achieve this goal, we are going to develop a new, quality-aware type of autonomic manager. This can be achieved through the development of quality-aware sensors and effectors.

Our first step will be an experimental evaluation of sensors that adjust the measurement interval dynamically depending on the rate of change of the measured value. Intuitively, rapidly changing values should require more frequent measurement. Dynamic adjustment of the interval is expected to optimize certain dimensions of QoS and the cost (or overhead generated by frequent measurement) at the same time.
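One way such a sensor might choose its next interval is sketched below. This is only an illustration of the idea, not the design to be evaluated: the rule (aim for a fixed amount of change between readings) and all parameter names and default values are assumptions.

```python
def next_interval(previous_value, current_value, elapsed_seconds,
                  target_change=5.0, min_interval=10.0, max_interval=600.0):
    """Pick the next measurement interval (seconds) so that the value is
    expected to change by roughly target_change between readings."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    change_rate = abs(current_value - previous_value) / elapsed_seconds
    if change_rate == 0:
        return max_interval  # value is stable: measure as rarely as allowed
    proposed = target_change / change_rate
    # Clamp to the allowed range to bound both overhead and staleness.
    return min(max_interval, max(min_interval, proposed))


fast = next_interval(40.0, 70.0, 60.0)   # load jumped by 30 within a minute
slow = next_interval(40.0, 41.0, 60.0)   # load barely moved
idle = next_interval(40.0, 40.0, 60.0)   # no change at all
```

Rapid change drives the interval down to its lower bound, while a stable value lets the sensor back off and save on measurement cost.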

REFERENCES

Amazon (2011a). Amazon EC2. http://aws.amazon.com/ec2/.

Amazon (2011b). Amazon Web Services. http://aws.amazon.com.


Armstrong, D. and Djemame, K. (2009). Towards quality of service in the cloud. In Proceedings of 25th UK Performance Engineering Workshop.

Azure (2011). Azure. http://www.microsoft.com/azure.

Ballou, D. and Pazer, H. (1985). Modeling data and process quality in multi-input, multi-output information systems. Management Science, 31(2):150–162.

Batini, C. and Scannapieco, M. (2006). Data Quality: Concepts, Methodologies and Techniques. Data-Centric Systems and Applications. Springer-Verlag New York, Inc., Secaucus, NJ, USA.

Berti-Equille, L. (2007). Data quality awareness: a case study for cost optimal association rule mining. Knowl. Inf. Syst., 11:191–215.

Berti-Equille, L. and Moussouni, F. (2005). Quality-Aware Integration and Warehousing of Genomic Data. In Proceedings of the 2005 International Conference on Information Quality.

Bobrowski, M., Marr, M., and Yankelevich, D. (1998). A software engineering view of data quality. In European Quality Week Conference.

Campbell, A. T. (1996). A Quality of Service Architecture. PhD thesis, Lancaster University.

Dasu, T. and Johnson, T. (2003). Exploratory Data Mining and Data Cleaning. Wiley-Interscience.

EMA (2008). How to define detailed requirements for your enterprise CMDB project: A hands-on workbook.

Gertz, M. and Schmitt, I. (1998). Data Integration Techniques based on Data Quality Aspects. In Schmitt, I., Türker, C., Hildebrandt, E., and Höding, M., editors, Proceedings 3. Workshop “Föderierte Datenbanken”, Magdeburg, 10./11. Dezember 1998, pages 1–19. Shaker Verlag, Aachen.

Google (2011a). Google App Engine. http://code.google.com/appengine.

Google (2011b). Google Apps. http://www.google.com/apps/business.

Harzog, B. (2010). Is the CMDB irrelevant in a virtual and cloud based world? Blog entry: http://www.virtualizationpractice.com/blog/?p=5726.

IBM (2011a). IBM cloud computing. http://www-935.ibm.com/services/us/cloud/index.html.

IBM (2011b). IBM computing on demand. http://www-03.ibm.com/systems/deepcomputing/cod/.

ISO (1994). ISO 8402 Quality Management and Quality Assurance: Vocabulary. ISO. Withdrawn standard.

ISO (2005). ISO 9000:2005 Quality management systems – Fundamentals and vocabulary. ISO. Published standard.

ISO/IEC (1998). ISO/IEC 13236:1998. Information technology – Quality of service: Framework. ISO/IEC.

Kriebel, C. H. and Moore, J. H. (1982). Economics and management information systems. SIGMIS Database, 14(1):30–40.

Lim, H. C., Babu, S., Chase, J. S., and Parekh, S. S. (2009). Automated control in cloud computing: challenges and opportunities. In Proceedings of the 1st workshop on Automated control for datacenters and clouds, ACDC ’09, pages 13–18, New York, NY, USA. ACM.

Marti, P., Fuertes, J. M., and Fohler, G. (2002). Improving quality-of-control using flexible timing constraints: Metric and scheduling issues. In IEEE RTSS.

Naumann, F. (2002). Quality-driven query answering for integrated information systems. Springer-Verlag New York, Inc., New York, NY, USA.

Orr, K. (1998). Data quality and system theory. Commun. ACM, 41(2):66–71.

Parssian, A., Sarkar, S., and Jacob, V. S. (2002). Assessing information quality for the composite relational operation join. In IQ, pages 225–237.

Reddy, M. P. and Wang, R. Y. (1995). Estimating data accuracy in a federated database environment. In CISMOD, pages 115–134.

Row, J. R. (2010). All about cloud computing and data quality. http://www.brighthub.com.

Segev, A. and Fang, W. (1990). Currency-based updates to distributed materialized views. In Proceedings of the Sixth International Conference on Data Engineering, pages 512–520, Washington, DC, USA. IEEE Computer Society.

Tayi, G. K. and Ballou, D. P. (1998). Examining data quality. Commun. ACM, 41(2):54–57.

Tupek, A. R. (2006). Definition of data quality.

Vambenepe, W. (2010). CMDB in the cloud: not your father's CMDB. Blog entry: http://stage.vambenepe.com/archives/1527.

Vaquero, L. M., Rodero-Merino, L., Caceres, J., and Lindner, M. (2008). A break in the clouds: towards a cloud definition. SIGCOMM Comput. Commun. Rev., 39:50–55.

Vouk, M. A. (2008). Cloud computing issues, research and implementations. ITI 2008 30th International Conference on Information Technology Interfaces, 16(4):31–40.

Wand, Y. and Wang, R. Y. (1996). Anchoring data quality dimensions in ontological foundations. Commun. ACM, 39(11):86–95.

Wang, R. Y., Pierce, E. M., and Madnick, S. E. (2005). Information quality, volume 1 of Advances in management information systems: Information Quality. M.E. Sharpe.

Wang, R. Y. and Strong, D. M. (1996). Beyond accuracy: what data quality means to data consumers. J. Manage. Inf. Syst., 12(4):5–33.
