Authors: Yuki Endo 1; Fubito Toyama 1; Chikafumi Chiba 2; Hiroshi Mori 1 and Kenji Shoji 1
Affiliations: 1 Utsunomiya University, Japan; 2 University of Tsukuba, Japan
Keyword(s):
Bioinformatics, Next Generation Sequencing, De Novo Assembly
Related Ontology Subjects/Areas/Topics: Bioinformatics; Biomedical Engineering; Genomics and Proteomics; Next Generation Sequencing; Sequence Analysis
Abstract:
Determining the whole genome sequences of various species has many applications, not only in biology
but also in medicine, pharmacy, and agriculture. In recent years, the emergence of high-throughput
next-generation sequencing technologies has dramatically reduced the time and cost of whole genome
sequencing. These technologies provide ultrahigh throughput at a lower cost per unit of data. However,
the data they produce consist of very short DNA fragments, so developing algorithms for merging these
fragments is very important. Merging the fragments without reference data is called de novo assembly.
Many algorithms for de novo assembly have been proposed in recent years. One of them, Velvet, is well
known because it performs well in terms of memory and time consumption. However, its memory consumption
increases dramatically when the set of input fragments is very large. It is therefore necessary to
develop an algorithm with low memory usage. In this paper, we propose an algorithm for de novo assembly
that uses less memory. In our experiments using E. coli K-12 strain MG1655, the memory consumption of
the proposed algorithm was one-third of that of Velvet.