STATISTICAL ANALYSIS OF BIOMOLECULAR DATA USING UNICORE WORKFLOWS

Marcelina Borcz, Rafał Kluszczyński, Piotr Bała

Abstract

e-Science plays an increasingly important role, especially in the life sciences. Experiments and their analysis are carried out in collaboration among many scientific groups from institutes located all over the world. Moreover, these groups work with immense amounts of data, which usually need to be processed statistically. The demand for computing power is therefore increasing and usually cannot be met by a standard laboratory, which is why e-Science makes use of grid technology. UNICORE (Uniform Interface to Computing Resources) is a middleware that provides seamless and secure access to Grid resources. In this paper we present a UNICORE gridbean for the R statistical environment, which enables statistical processing of data on the Grid. Used as part of a more complex workflow, it can analyze results produced by other applications and calculate the required statistics. By presenting an example workflow constructed in the UNICORE Rich Client application, the authors demonstrate the capabilities of the Chemomentum workbench built on the UNICORE Grid system.



Paper Citation


in Harvard Style

Borcz M., Kluszczyński R. and Bała P. (2010). STATISTICAL ANALYSIS OF BIOMOLECULAR DATA USING UNICORE WORKFLOWS. In Proceedings of the First International Conference on Bioinformatics - Volume 1: BIOINFORMATICS, (BIOSTEC 2010) ISBN 978-989-674-019-1, pages 217-220. DOI: 10.5220/0002742102170220


in Bibtex Style

@conference{bioinformatics10,
author={Marcelina Borcz and Rafał Kluszczyński and Piotr Bała},
title={STATISTICAL ANALYSIS OF BIOMOLECULAR DATA USING UNICORE WORKFLOWS},
booktitle={Proceedings of the First International Conference on Bioinformatics - Volume 1: BIOINFORMATICS, (BIOSTEC 2010)},
year={2010},
pages={217-220},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002742102170220},
isbn={978-989-674-019-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on Bioinformatics - Volume 1: BIOINFORMATICS, (BIOSTEC 2010)
TI - STATISTICAL ANALYSIS OF BIOMOLECULAR DATA USING UNICORE WORKFLOWS
SN - 978-989-674-019-1
AU - Borcz M.
AU - Kluszczyński R.
AU - Bała P.
PY - 2010
SP - 217
EP - 220
DO - 10.5220/0002742102170220