Authors: Sakil Barbhuiya (1); Zafeirios Papazachos (2); Peter Kilpatrick (2) and Dimitrios S. Nikolopoulos (2)
Affiliations: (1) Queen's University of Belfast, United Kingdom; (2) Queen's University of Belfast, United Kingdom
Keyword(s): Anomaly Detection, Cloud Computing, Data Centres, Monitoring, Correlation.
Related Ontology Subjects/Areas/Topics: Big Data Cloud Services; Cloud Application Architectures; Cloud Application Scalability and Availability; Cloud Applications Performance and Monitoring; Cloud Computing; Platforms and Applications
Abstract:
Cloud data centres are critical business infrastructures and the fastest growing service providers. Detecting
anomalies in Cloud data centre operation is vital. Given the vast complexity of the data centre system software
stack, applications and workloads, anomaly detection is a challenging endeavour. Current tools for detecting
anomalies often use machine learning techniques, application instance behaviours or system metrics distribution,
which are complex to implement in Cloud computing environments as they require training, access to
application-level data and complex processing. This paper presents LADT, a lightweight anomaly detection
tool for Cloud data centres that uses rigorous correlation of system metrics, implemented by an efficient correlation
algorithm without the need for training or complex infrastructure setup. LADT is based on the hypothesis
that, in an anomaly-free system, metrics from data centre host nodes and virtual machines (VMs) are strongly
correlated. An anomaly is detected whenever correlation drops below a threshold value. We demonstrate and
evaluate LADT in a Cloud environment and show that the hosting node I/O operations per second
(IOPS) are strongly correlated with the aggregated virtual machine IOPS, but this correlation vanishes when
an application stresses the disk, indicating a node-level anomaly.
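
The detection idea described in the abstract can be illustrated with a minimal sketch. It is not the LADT implementation: the abstract does not name the correlation coefficient, the window length or the threshold, so the Pearson coefficient, the non-overlapping window of 5 samples, the threshold of 0.5 and the function names `pearson` and `detect_anomalies` are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant series is treated as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def detect_anomalies(host_iops, vm_iops_per_vm, window=5, threshold=0.5):
    """Return start indices of windows where host/VM correlation is low.

    host_iops: IOPS samples measured on the hosting node.
    vm_iops_per_vm: one list of IOPS samples per VM, aligned in time
        with host_iops.
    window, threshold: hypothetical values, not taken from the paper.
    """
    # Aggregate the VM metric by summing across VMs at each time step.
    aggregated = [sum(samples) for samples in zip(*vm_iops_per_vm)]
    flagged = []
    # Slide over the series in non-overlapping windows.
    for start in range(0, len(host_iops) - window + 1, window):
        host_window = host_iops[start:start + window]
        vm_window = aggregated[start:start + window]
        if pearson(host_window, vm_window) < threshold:
            flagged.append(start)
    return flagged
```

Given a host series and a list of per-VM series sampled at the same instants, `detect_anomalies(host, [vm1, vm2])` returns the starting index of each window in which the host metric no longer tracks the aggregated VM metric.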