Authors:
Hugo Saldanha; Edward Ribeiro; Maristela Holanda; Aleteia Araujo; Genaina Rodrigues; Maria Emilia Walter; João Carlos Setubal and Alberto Dávila
Affiliation:
University of Brasilia, Virginia Bioinformatics Institute and FIOCRUZ, Brazil
Keyword(s):
Cloud architecture, Bioinformatics workflow, High-throughput genome sequencing.
Related Ontology Subjects/Areas/Topics:
Cloud Computing; Cloud Computing Architecture; Cloud Middleware Frameworks; Cloud Standards; Collaboration and e-Services; Communication and Software Technologies and Architectures; Context; Data Engineering; e-Business; Enterprise Information Systems; Fundamentals; Languages, Tools and Architectures; Mobile Software and Services; Model-Driven Software Development; Ontologies and the Semantic Web; Paradigm Trends; Platforms and Applications; SAAS, PAAS, IAAS; Service Composition and Mashups; Service-Oriented Architectures; Services Science; Software Agents and Internet Computing; Software Engineering; Software Engineering Methods and Techniques; Technology Platforms; Telecommunications; Web Services; Wireless Information Networks and Systems
Abstract:
Cloud computing has emerged as a promising platform for large-scale, data-intensive scientific research, i.e., processing tasks that use hundreds of hours of CPU time and petabytes of data storage. Although this is an active research area, most efforts to run such processing in clouds rely on MapReduce. This article describes the BioNimbus project, which aims to define an architecture and create a framework for the easy and flexible integration and distributed execution of bioinformatics tools in a cloud environment, without being tied to the MapReduce paradigm. As a result, we leverage cloud elasticity and fault tolerance while significantly improving the storage capacity and execution time of bioinformatics tasks, mainly those of large-scale genome sequencing projects.