5 CONCLUSION
FastAPSP is the right choice for solving APSP when
data sets are relatively small or space consumption is
not a major concern. In such cases, Readjoiner could
be an excellent choice for relatively small data sets
on a single-core machine. In other circumstances,
SOF could be a favorable choice. We present AOF
as a space-efficient tool that enables engineers of
genome assemblers to handle the overlap problem,
especially on machines with limited resources, using
either Hamming distance or edit distance. Unlike that
of FM, AOF's running time improves dramatically as the
minimum overlap length (m) increases. Although AOF is
slower than FM on some small genomic data sets, it can
process large data sets that FM cannot handle because
of its high space consumption.
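To make the roles of the minimum overlap length (m) and the Hamming distance concrete, the following is a minimal brute-force sketch of the approximate suffix-prefix overlap check for a single pair of sequences. It is not the method used by AOF, FM, or any other tool discussed here; the function name `hamming_overlaps` and the mismatch bound `k` are assumptions introduced only for illustration.

```python
def hamming_overlaps(a: str, b: str, m: int, k: int) -> list[int]:
    """Return every overlap length l >= m such that the length-l suffix of `a`
    and the length-l prefix of `b` differ in at most k positions.

    Illustrative brute force only (hypothetical helper, not any tool's algorithm).
    """
    lengths = []
    for l in range(m, min(len(a), len(b)) + 1):
        suffix = a[len(a) - l:]   # last l characters of a
        prefix = b[:l]            # first l characters of b
        mismatches = sum(x != y for x, y in zip(suffix, prefix))
        if mismatches <= k:
            lengths.append(l)
    return lengths


# Exact overlaps (k = 0) of length at least 3 between two short reads.
print(hamming_overlaps("ACGTACGT", "ACGTTTTT", m=3, k=0))  # [4]
```

In this naive version, raising m simply shrinks the range of candidate lengths that must be examined, which gives an intuition for why the minimum overlap length can affect running time.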
REFERENCES
Bellman, Richard (1954). The theory of dynamic program-
ming. Technical report, DTIC Document.
Bentley, J. L., Sleator, D. D., Tarjan, R. E., and Wei, V. K.
(1986). A locally adaptive data compression scheme.
Communications of the ACM, 29(4):320–330.
Burrows, M. and Wheeler, D. J. (1994). A block-sorting
lossless data compression algorithm. Technical report,
Digital SRC Research Report.
Ferragina, P., Manzini, G., Mäkinen, V., and Navarro, G.
(2004a). An alphabet-friendly FM-index. In International
Symposium on String Processing and Information
Retrieval, pages 150–160. Springer.
Ferragina, P., Manzini, G., Mäkinen, V., and Navarro, G.
(2004b). An alphabet-friendly FM-index. In SPIRE,
pages 150–160.
Gollery, Martin (2005). Bioinformatics: Sequence and
Genome Analysis, David W. Mount. Cold Spring Harbor,
NY: Cold Spring Harbor Laboratory Press, 2004, 692
pp., paperback. ISBN 0-87969-712-1. Clinical Chemistry,
51(11):2219–2219.
Gonnella, G. and Kurtz, S. (2012). Readjoiner: a fast and
memory efficient string graph-based sequence assem-
bler. BMC Bioinformatics, 13:82.
Gusfield, D., Landau, G., and Schieber, B. (1992). An effi-
cient algorithm for the all pairs suffix-prefix problem.
Inf. Process. Lett., 41(4):181–185.
Haj Rachid, M. and Malluhi, Q. (2015). A practical
and scalable tool to find overlaps between sequences.
BioMed research international, 2015.
Haj Rachid, M., Malluhi, Q., and Abouelhoda, M. (2014a).
A space-efficient solution to find the maximum over-
lap using a compressed suffix array. In MECBME.
Haj Rachid, M., Malluhi, Q., and Abouelhoda, M. (2014b).
Using the Sadakane compressed suffix tree to solve
the all-pairs suffix prefix problem. BioMed Research
International.
Haj Rachid, Maan (2017). Two efficient techniques to find
approximate overlaps between sequences. BioMed
Research International, 2017.
Kärkkäinen, Juha and Na, Joong Chae (2007). Faster filters
for approximate string matching. In ALENEX. SIAM.
Kucherov, Gregory and Tsur, Dekel (2014). Improved fil-
ters for the approximate suffix-prefix overlap problem.
In International Symposium on String Processing and
Information Retrieval, pages 139–148. Springer.
Levenshtein, Vladimir I (1966). Binary codes capable of
correcting deletions, insertions, and reversals. In So-
viet physics doklady, volume 10, pages 707–710.
Lim, J. (2018). A Practical Algorithm for the All-Pairs
Suffix-Prefix Problem. PhD thesis.
Lim, J. and Park, K. (2017). A fast algorithm for the all-
pairs suffix–prefix problem. Theoretical Computer
Science, 698:14–24.
Louza, F. A., Gog, S., Zanotto, L., Araujo, G., and Telles,
G. P. (2016). Parallel computation for the all-pairs
suffix-prefix problem. In International Symposium on
String Processing and Information Retrieval, pages
122–132. Springer.
Needleman, Saul B and Wunsch, Christian D (1970). A
general method applicable to the search for similari-
ties in the amino acid sequence of two proteins. Jour-
nal of molecular biology, 48(3):443–453.
Ohlebusch, E. and Gog, S. (2010). Efficient algorithms
for the all-pairs suffix-prefix problem and the all-
pairs substring-prefix problem. Inf. Process. Lett.,
110(3):123–128.
Simpson, J. and Durbin, R. (2012). Efficient de novo as-
sembly of large genomes using compressed data struc-
tures. Genome research, 22(3):549–556.
Tustumi, W. H., Gog, S., Telles, G. P., and Louza, F. A.
(2016). An improved algorithm for the all-pairs
suffix–prefix problem. Journal of Discrete Algo-
rithms, 37:34–43.
Välimäki, Niko and Ladra, Susana and Mäkinen, Veli
(2012). Approximate all-pairs suffix/prefix overlaps.
Information and Computation, 213:49–58.
Wu, Thomas D and Nacu, Serban (2010). Fast and SNP-tolerant
detection of complex variants and splicing in
short reads. Bioinformatics, 26(7):873–881.
Latest Advances in Solving the All-Pairs Suffix Prefix Problem