issues related to healthcare applications and carry
out an experimental study of the performance of
two big data solutions, with particular consideration
of healthcare applications.
8 CONCLUSIONS
This paper has studied the performance of two big
data solutions in the healthcare domain. We have
considered a healthcare data model proposed on the
basis of national and international EHR standards
and demonstrated its mapping onto two big data
solutions, namely Cassandra and Hadoop. The
performance of a representative set of queries has
been studied using Hadoop. Further, we have
carried out a comparative study of query
performance in Cassandra and Hadoop.
We observe that the mapping strategy is an
important factor in improving the performance of any
big data solution. However, the mapping should be
driven by the queries that are expected to be frequent
in the application domain. We also observe that
Hadoop performs better than Cassandra on large
data sets. The observations from the experimental
study establish that Hadoop is focussed more on data
processing, and therefore the scheduling strategy is
important for Hadoop. On the other hand, because
data storage and distribution are the main goals of
Cassandra, an implementation in Cassandra should
treat the data mapping strategy as the primary issue.
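The idea of a query-driven mapping can be illustrated with a minimal sketch. It is not the implementation used in this study: the record fields (`patient_id`, `doctor_id`, `visit_date`, `diagnosis`), the function `build_query_table`, and the two example queries are all hypothetical, and plain Python dictionaries stand in for Cassandra tables. The sketch only shows the principle that each frequent query gets its own table, partitioned by the attribute the query filters on.

```python
from collections import defaultdict

# Hypothetical health records; the field names are illustrative assumptions.
RECORDS = [
    {"patient_id": "p1", "doctor_id": "d1", "visit_date": "2015-01-10", "diagnosis": "flu"},
    {"patient_id": "p1", "doctor_id": "d2", "visit_date": "2015-03-02", "diagnosis": "asthma"},
    {"patient_id": "p2", "doctor_id": "d1", "visit_date": "2015-02-14", "diagnosis": "flu"},
]


def build_query_table(records, partition_key, clustering_key):
    """Group records by partition_key and sort each group by clustering_key.

    This mimics a table designed for one frequent query: a lookup by
    partition_key returns rows already ordered by clustering_key.
    """
    table = defaultdict(list)
    for record in records:
        table[record[partition_key]].append(record)
    for rows in table.values():
        rows.sort(key=lambda row: row[clustering_key])
    return dict(table)


# One table per frequent query; the same records appear in both tables.
visits_by_patient = build_query_table(RECORDS, "patient_id", "visit_date")
visits_by_doctor = build_query_table(RECORDS, "doctor_id", "visit_date")
```

A query such as "all visits of patient p1 in date order" then becomes a single lookup, `visits_by_patient["p1"]`, at the cost of storing the data once per query pattern.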
In future, our goal is to develop algorithms for
transforming data models to big data solutions.
To combine the strengths of Hadoop (data
processing) and Cassandra (storage), we plan to
extend the work to use Hadoop integrated with
Cassandra. Other big data solutions, such as
MongoDB and HBase, may also be considered and
their performance compared. It would also be
interesting to investigate how other file formats
used to store data affect the performance of the
Hadoop MapReduce framework.