tency. In SOSP’11, 23rd ACM Symposium on Operat-
ing System Principles. ACM.
Castro, M., Druschel, P., Kermarrec, A.-M., and Rowstron,
A. (2002). Scribe: A large-scale and decentralized
application-level multicast infrastructure. IEEE Jour-
nal on Selected Areas in Communications (JSAC).
Cranor, C., Johnson, T., and Spatscheck, O. (2003).
Gigascope: a stream database for network applications.
In SIGMOD’03, 2003 ACM SIGMOD International
Conference on Management of Data. ACM.
Davis, C. (2013). Graphite - Scalable Realtime Graphing.
http://graphite.wikidot.com.
Dean, J. and Ghemawat, S. (2004). MapReduce: Simplified Data
Processing on Large Clusters. In OSDI’04, 6th Sym-
posium on Operating Systems Design and Implemen-
tation. USENIX Association.
Gantz, J. and Reinsel, D. (2012). The digital universe in
2020: Big data, bigger digital shadows, and biggest
growth in the far east.
http://www.emc.com/leadership/digital-universe/iview/big-data-2020.htm.
George, L. (2011). HBase: the definitive guide. O’Reilly
Media, Sebastopol, CA.
Hasselmeyer, P. and d’Heureuse, N. (2010). Towards holistic
multi-tenant monitoring for virtual data centers. In
NOMS’10, 2010 IEEE/IFIP Network Operations and
Management Symposium Workshops. IEEE Computer
Society.
Hoffman, S. and Souza, S. D. (2013). Apache Flume: Dis-
tributed Log Collection for Hadoop. Packt Publishing,
Birmingham, UK.
Josephsen, D. (2007). Building a Monitoring Infrastructure
with Nagios. Prentice Hall, Upper Saddle River, NJ.
Keller, A. and Ludwig, H. (2003). The WSLA Framework:
Specifying and Monitoring Service Level Agreements
for Web Services. Journal of Network and Systems
Management.
Kundu, D. and Lavlu, S. (2009). Cacti 0.8 Network Moni-
toring. Packt Publishing, Birmingham, UK.
Leu, J. S., Yee, Y. S., and Chen, W. L. (2010). Compar-
ison of Map-Reduce and SQL on Large-Scale Data
Processing. In ISPA’10, 1st International Symposium
on Parallel and Distributed Processing with Applica-
tions. IEEE Computer Society.
Litvinova, A., Engelmann, C., and Scott, S. L. (2010).
A proactive fault tolerance framework for high-
performance computing. In PDCN’10, 9th IASTED
International Conference on Parallel and Distributed
Computing and Networks (PDCN2010). ACTA Press.
Lv, Q., Cao, P., Cohen, E., Li, K., and Shenker, S. (2002).
Search and replication in unstructured peer-to-peer
networks. In ICS’02, 16th International Conference
on Supercomputing. ACM.
Marchetti, M., Colajanni, M., and Messori, M. (2010).
Selective and early threat detection in large networked
systems. In CIT’10, 10th IEEE International Confer-
ence on Computer and Information Technology. IEEE
Computer Society.
Massie, M. L., Chun, B. N., and Culler, D. E. (2004). The
Ganglia Distributed Monitoring System: Design, Im-
plementation, and Experience. Parallel Computing.
Olston, C. et al. (2008). Pig Latin: a not-so-foreign lan-
guage for data processing. In SIGMOD’08, 2008 ACM
SIGMOD International Conference on Management
of Data, New York, NY. ACM.
Olups, R. (2010). Zabbix 1.8 network monitoring. Packt
Publishing, Birmingham, UK.
Rabkin, A. and Katz, R. (2010). Chukwa: a system for re-
liable large-scale log collection. In LISA’10, 24th In-
ternational Conference on Large Installation System
Administration. USENIX Association.
Renesse, R. V., Birman, K. P., and Vogels, W. (2003). Astro-
labe: A robust and scalable technology for distributed
system monitoring, management, and data mining.
ACM Transactions on Computer Systems.
Rowstron, A. and Druschel, P. (2001). Pastry: Scalable,
decentralized object location, and routing for large-
scale peer-to-peer systems. MIDDLEWARE’01, 3rd
IFIP/ACM International Conference on Distributed
Systems Platforms.
Sacerdoti, F. D., Katz, M. J., Massie, M. L., and Culler,
D. E. (2003). Wide Area Cluster Monitoring with
Ganglia. Cluster Computing.
Shvachko, K. et al. (2010). The Hadoop Distributed File
System. In MSST’10, 26th Symposium on Massive
Storage Systems and Technologies. IEEE Computer
Society.
Sigoure, B. (2010). OpenTSDB, a distributed, scalable
Time Series Database. http://opentsdb.net.
Surhone, L. M., Tennoe, M. T., and Henssonow, S. F.
(2011). OpenNMS. Betascript Publishing, Mauritius.
Voicu, R., Newman, H., and Cirstoiu, C. (2009). MonAL-
ISA: An agent based, dynamic service system to mon-
itor, control and optimize distributed systems. Com-
puter Physics Communications.
Zyrion (2010-2013). Traverse: distributed,
scalable, high-availability architecture.
http://www.zyrion.com/company/whitepapers.
Monitoring Large Cloud-Based Systems