Authors:
Julio Cesar Santos dos Anjos 1; Bruno Reckziegel Filho 1; Junior F. Barros 1; Raffael B. Schemmer 1; Claudio Geyer 1 and Ursula Matte 2
Affiliations:
1 Universidade Federal do Rio Grande do Sul, Brazil; 2 Laboratory of Gene Therapy, Hospital de Clinicas de Porto Alegre, Brazil
Keyword(s):
Data Intensive Computing, MapReduce, Parallel Distributed Processing, Genome Annotation.
Related Ontology Subjects/Areas/Topics:
Coupling and Integrating Heterogeneous Data Sources; Data Engineering; Databases and Data Security; Databases and Information Systems Integration; e-Business; Enterprise Information Systems; Information Systems Analysis and Specification; Large Scale Databases; Middleware Integration; Middleware Platforms; Modeling of Distributed Systems; Technology Platforms; Tools, Techniques and Methodologies for System Development
Abstract:
The development of sophisticated sequencing machines and DNA techniques has enabled advances in medical
genetics research. However, because sequencers produce large amounts of data, new methods and programs are
required to analyze that data efficiently and rapidly. MapReduce is a data-intensive computing model that
handles large data volumes and is easy to program by means of two basic functions (Map and Reduce). This
work introduces GMS, a genetic mapping system that can assist doctors in the clinical diagnosis of patients
by analyzing the genetic mutations contained in their DNA. The model offers a scalable method for analyzing
the large amounts of data generated by sequencers. Using several medical databases at the same time makes it
possible to determine susceptibilities to diseases through big data analysis mechanisms. The results show
that the system scales and offers a possible diagnosis, giving health professionals a powerful tool to
improve genetic diagnosis.