We also observe that, in the case of MOBS, almost all of the suffix-prefix overlaps among the remaining assemblies are true overlaps. By a true overlap we mean an overlap that is also present among the assemblies when they are aligned against the genome. Thus, even if MOBS may not report the best performance based on the length of the assemblies, the suffix-prefix overlaps among its assemblies can be used to generate bigger assemblies.
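The notion of a true overlap can be illustrated with a minimal sketch. It is not the MOBS implementation: it assumes error-free assemblies taken from a single strand and uses exact string matching in place of alignment, and the function names and the toy genome are hypothetical. Under these assumptions, a suffix-prefix overlap between two assemblies is true exactly when the merged string occurs in the genome.

```python
def suffix_prefix_overlap(a: str, b: str, min_len: int = 1) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    max_len = min(len(a), len(b))
    # Try the longest candidate first so the first hit is the longest overlap.
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge(a: str, b: str, k: int) -> str:
    """Merge `a` and `b`, assuming the last `k` characters of `a` equal the first `k` of `b`."""
    return a + b[k:]


def is_true_overlap(a: str, b: str, genome: str, min_len: int = 1) -> bool:
    """Check whether the longest suffix-prefix overlap of `a` and `b` is a true overlap.

    Simplifying assumption: exact matching stands in for alignment, so the
    overlap is true when the merged string is a substring of the genome.
    """
    k = suffix_prefix_overlap(a, b, min_len)
    if k == 0:
        return False
    return merge(a, b, k) in genome


# Toy genome and assemblies (hypothetical data, for illustration only).
genome = "ACGTACGGATCCGTTAGC"
a = genome[0:10]   # "ACGTACGGAT"
b = genome[6:16]   # "GGATCCGTTA", overlaps a by "GGAT" in the genome
c = genome[0:4]    # "ACGT"
d = genome[12:18]  # "GTTAGC", shares "GT" with the end of c only by chance

k_ab = suffix_prefix_overlap(a, b)
merged_ab = merge(a, b, k_ab)
true_ab = is_true_overlap(a, b, genome)
true_cd = is_true_overlap(c, d, genome)
```

Here the overlap between `a` and `b` is true, so merging them yields a longer string that still occurs in the genome. The chance overlap between `c` and `d` is not true, and merging them would produce a string that the genome does not contain.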
While MOBS runs reasonably fast, a time comparison is not very meaningful, as all the other assemblers that report faster times seem to be multi-threaded. MOBS at present has a single-threaded implementation.
In this paper, we presented a method to generate assemblies from short reads using only short overlaps. This approach produces comparable results while reducing the computational effort. There are many possibilities for further improvement of results using this approach. Generating assemblies that are not contained in others is one. Developing algorithms that generate larger assemblies is another. A further open question is how our algorithm needs to be modified to handle challenges in real data, such as errors in reads and reads from both strands of the genome.
The comparisons given here are only indicative of the promise of the approach and should not be taken as the final word, as some of the assemblers used in the comparison do not give an option to set the error model. We are working to extend this technique, and a full and final version will report its results on real data.
This work is a part of the ongoing research program on de novo genome assembly of Prof. S.N. Maheshwari at IIT Delhi. We thank Prof. Maheshwari for his guidance and support. We are also grateful to Prof. Sanjiva Prasad for useful discussions. This work has been partly supported by his project “Foundations of Trusted and Scalable ‘Last-Mile’ Healthcare”, funded by DeitY, Government of India.
