PadeNA: A PARALLEL DE NOVO ASSEMBLER

Gaurav Thareja, Vivek Kumar, Mike Zyskowski, Simon Mercer, Bob Davidson

Abstract

Recent advances in DNA sequencing technology are making ever-larger quantities of sequence information available to an increasingly broad segment of the scientific and clinical community. This in turn drives the need for standard, rapid, and easy-to-use tools for genomic reconstruction and analysis. As a step towards addressing this challenge, we present PadeNA (Parallel de Novo Assembler), a parallelized DNA sequence assembler with a graphical user interface. PadeNA is designed around an interface-driven architecture to facilitate code reusability and extensibility, and is provided as part of the open source Microsoft Biology Foundation. Installers and documentation are available at http://research.microsoft.com/bio/.


Paper Citation


in Harvard Style

Thareja G., Kumar V., Zyskowski M., Mercer S. and Davidson B. (2011). PadeNA: A PARALLEL DE NOVO ASSEMBLER. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011) ISBN 978-989-8425-36-2, pages 196-203. DOI: 10.5220/0003164301960203


in Bibtex Style

@conference{bioinformatics11,
author={Gaurav Thareja and Vivek Kumar and Mike Zyskowski and Simon Mercer and Bob Davidson},
title={PadeNA: A PARALLEL DE NOVO ASSEMBLER},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011)},
year={2011},
pages={196-203},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003164301960203},
isbn={978-989-8425-36-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011)
TI - PadeNA: A PARALLEL DE NOVO ASSEMBLER
SN - 978-989-8425-36-2
AU - Thareja G.
AU - Kumar V.
AU - Zyskowski M.
AU - Mercer S.
AU - Davidson B.
PY - 2011
SP - 196
EP - 203
DO - 10.5220/0003164301960203