Practical Aspects for Effective Monitoring of SLAs in Cloud Computing and Virtual Platforms

Ali Imran Jehangiri, Edwin Yaqub, Ramin Yahyapour


Cloud computing is transforming the software landscape. Software services are increasingly designed as modular, decoupled components that communicate over a network and are deployed on the Cloud. The Cloud offers three service models, namely Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS). Although this allows better management of resources, the Quality of Service (QoS) in dynamically changing environments like the Cloud must be legally stipulated as a Service Level Agreement (SLA). This introduces several challenges in the area of SLA enforcement. A key problem is detecting the root cause of performance problems, which may lie in the hosted service or in the deployment platforms (PaaS or IaaS), and adjusting resources accordingly. Monitoring and analytic methods have emerged as promising and inevitable solutions in this context, but they require precise real-time monitoring data. Towards this goal, we assess practical aspects for effective monitoring of SLA-aware services hosted in the Cloud. We present two real-world application scenarios for deriving requirements and present the prototype of our Monitoring and Analytics framework. We claim that this work provides the necessary foundations for researching SLA-aware root cause analysis algorithms under a realistic setup.


Paper Citation

in Harvard Style

Imran Jehangiri A., Yaqub E. and Yahyapour R. (2013). Practical Aspects for Effective Monitoring of SLAs in Cloud Computing and Virtual Platforms. In Proceedings of the 3rd International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-8565-52-5, pages 447-454. DOI: 10.5220/0004507504470454
