Table 2: Comparison of assemblies.

Assembler        Best k-mer  N50    # of     Total   Genome       Genome covered    Error rate
                 size (bp)   (bp)   contigs  (kbp)   covered (%)  without gaps (%)  (%)
Proposed method  55          2,801  127,326  91,796  76.733       93.298            7.764
SOAPdenovo2      63          4,005  43,032   85,648  78.695       95.684            0.128
Velvet           63          5,166  29,179   84,335  77.075       93.714            1.350

semblers. Further investigation is needed to improve
the N50 and error rate of contigs in our method by
modifying the path-tracing algorithm.
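
For readers unfamiliar with the N50 values reported in Table 2, the following is a minimal sketch of how the statistic is commonly computed. It assumes the usual definition (the length of the contig at which the running sum of contig lengths, taken from longest to shortest, first reaches half of the total assembled length). The function name `n50` and the toy input are hypothetical and are not taken from the evaluation scripts used in this work.

```python
def n50(contig_lengths):
    """Return the N50 (in bp) of a collection of contig lengths.

    Assumed definition: sort contigs from longest to shortest and return
    the length of the contig at which the running sum first reaches half
    of the total assembled length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    # Only reached when the input contains no contigs.
    raise ValueError("contig_lengths must not be empty")


# Toy example with hypothetical contig lengths (bp).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)
```

Because the statistic depends only on the sorted contig lengths, an assembler that emits many short contigs, as the large contig count of the proposed method suggests, would tend to obtain a lower N50 even when the total assembled length is similar.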
REFERENCES
Bowe, A., Onodera, T., Sadakane, K., and Shibuya, T.
(2012). Succinct de Bruijn graphs. In WABI, volume
7534 of Lecture Notes in Computer Science, pages
225–235. Springer.
Butler, J., MacCallum, I., Kleber, M., Shlyakhter, I. A.,
Belmonte, M. K., Lander, E. S., Nusbaum, C., and
Jaffe, D. B. (2008). ALLPATHS: de novo assembly
of whole-genome shotgun microreads. Genome Res.,
18(5):810–820.
Chevreux, B., Pfisterer, T., Drescher, B., Driesel, A. J.,
Muller, W. E., Wetter, T., and Suhai, S. (2004). Using
the miraEST Assembler for Reliable and Automated
mRNA Transcript Assembly and SNP Detection in
Sequenced ESTs. Genome Res., 14(6):1147–1159.
Chikhi, R., Limasset, A., Jackman, S., Simpson, J., and
Medvedev, P. (2014). On the representation of de
Bruijn graphs. In RECOMB, volume 8394 of Lecture
Notes in Computer Science, pages 35–55. Springer.
Chikhi, R. and Rizk, G. (2012). Space-efficient and exact
de Bruijn graph representation based on a Bloom filter.
In WABI, volume 7534 of Lecture Notes in Computer
Science, pages 236–248. Springer.
Conway, T. C. and Bromage, A. J. (2011). Succinct data
structures for assembling large genomes. Bioinformatics,
27(4):479–486.
Endo, Y., Toyama, F., Chiba, C., Mori, H., and Shoji, K.
(2014). De Novo Short Read Assembly Algorithm
with Low Memory Usage. In Proceedings of International
Conference on Bioinformatics Models, Methods
and Algorithms (BIOINFORMATICS2014), pages
215–200.
Hernandez, D., Francois, P., Farinelli, L., Osteras, M., and
Schrenzel, J. (2008). De novo bacterial genome sequencing:
millions of very short reads assembled on a
desktop computer. Genome Res., 18(5):802–809.
Jeck, W. R., Reinhardt, J. A., Baltrus, D. A., Hickenbotham,
M. T., Magrini, V., Mardis, E. R., Dangl,
J. L., and Jones, C. D. (2007). Extending assembly
of short DNA sequences to handle error. Bioinformatics,
23(21):2942–2944.
Li, R., Zhu, H., Ruan, J., Qian, W., Fang, X., Shi, Z., Li,
Y., Li, S., Shan, G., Kristiansen, K., Li, S., Yang, H.,
Wang, J., and Wang, J. (2010). De novo assembly
of human genomes with massively parallel short read
sequencing. Genome Res., 20(2):265–272.
Miller, J. R., Delcher, A. L., Koren, S., Venter, E., Walenz,
B. P., Brownley, A., Johnson, J., Li, K., Mobarry,
C., and Sutton, G. (2008). Aggressive assembly of
pyrosequencing reads with mates. Bioinformatics,
24(24):2818–2824.
Rizk, G., Lavenier, D., and Chikhi, R. (2013). DSK: k-mer
counting with very low memory usage. Bioinformatics,
29(5):652–653.
Salzberg, S. L., Phillippy, A. M., Zimin, A., Puiu, D.,
Magoc, T., Koren, S., Treangen, T. J., Schatz, M. C.,
Delcher, A. L., Roberts, M., et al. (2012). GAGE: A
critical evaluation of genome assemblies and assembly
algorithms. Genome Res., 22(3):557–567.
Simpson, J. T., Wong, K., Jackman, S. D., Schein, J. E.,
Jones, S. J., and Birol, I. (2009). ABySS: a parallel
assembler for short read sequence data. Genome Res.,
19(6):1117–1123.
Warren, R. L., Sutton, G. G., Jones, S. J., and Holt, R. A.
(2007). Assembling millions of short DNA sequences
using SSAKE. Bioinformatics, 23(4):500–501.
Zerbino, D. R. and Birney, E. (2008). Velvet: algorithms for
de novo short read assembly using de Bruijn graphs.
Genome Res., 18(5):821–829.