


Authors: Enrico Rossignolo and Matteo Comin

Affiliation: Department of Information Engineering, University of Padova, Padova, 35131, Italy

Keyword(s): k-mer Set, Compression, Smallest Path Cover.

Abstract: A fundamental operation in computational genomics is the reduction of input sequences into their constituent k-mers. The development of space-efficient methods to represent a collection of k-mers is important for advancing the scalability of bioinformatics analyses. One prevalent strategy involves transforming the set of k-mers into a de Bruijn graph and subsequently devising a streamlined representation of this graph by identifying the smallest path cover. In this article, we introduce USTAR2, a novel algorithm for the compression of k-mers. USTAR2 harnesses the principles of node connectivity in the de Bruijn graph for a more efficient selection of paths for constructing the path cover. We performed a series of tests on the compression of real read datasets and compared USTAR2 with several other tools. USTAR2 achieved the best performance in terms of compression; it also requires less memory and is considerably faster (up to 96x). The code of USTAR2 is available at the repository https://github.com/CominLab/USTAR2.
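
The general strategy named in the abstract (k-mer set, de Bruijn graph, path cover) can be illustrated with a minimal Python sketch. This is not the USTAR2 algorithm: it uses a plain greedy traversal instead of the connectivity-based path selection the abstract describes, it does not guarantee a smallest path cover, and it ignores reverse complements. All function names (`extract_kmers`, `greedy_path_cover`, `spell`) are hypothetical and chosen only for illustration.

```python
def extract_kmers(sequences, k):
    """Return the set of distinct k-mers occurring in the input sequences."""
    return {s[i:i + k] for s in sequences for i in range(len(s) - k + 1)}


def successors(kmer, kmers):
    """k-mers in the set whose (k-1)-prefix equals this k-mer's (k-1)-suffix."""
    return [kmer[1:] + c for c in "ACGT" if kmer[1:] + c in kmers]


def predecessors(kmer, kmers):
    """k-mers in the set whose (k-1)-suffix equals this k-mer's (k-1)-prefix."""
    return [c + kmer[:-1] for c in "ACGT" if c + kmer[:-1] in kmers]


def greedy_path_cover(kmers):
    """Cover every k-mer with exactly one vertex-disjoint path.

    Assumption: a simple greedy heuristic that takes the first unvisited
    neighbor. It stands in for the smarter path selection of real tools.
    """
    unvisited = set(kmers)
    paths = []
    for seed in sorted(kmers):  # sorted only to make the result deterministic
        if seed not in unvisited:
            continue
        unvisited.remove(seed)
        path = [seed]
        # Extend the path forward while an unvisited successor exists.
        while True:
            nxt = [s for s in successors(path[-1], kmers) if s in unvisited]
            if not nxt:
                break
            path.append(nxt[0])
            unvisited.remove(nxt[0])
        # Extend the path backward while an unvisited predecessor exists.
        while True:
            prv = [p for p in predecessors(path[0], kmers) if p in unvisited]
            if not prv:
                break
            path.insert(0, prv[0])
            unvisited.remove(prv[0])
        paths.append(path)
    return paths


def spell(path):
    """Merge a path of overlapping k-mers into a single string."""
    return path[0] + "".join(kmer[-1] for kmer in path[1:])


if __name__ == "__main__":
    kmers = extract_kmers(["ACGTACGA"], k=3)
    strings = [spell(p) for p in greedy_path_cover(kmers)]
    plain_size = sum(len(x) for x in kmers)
    packed_size = sum(len(x) for x in strings)
    print(strings, plain_size, packed_size)
```

Each path of n k-mers is spelled with n + k - 1 characters instead of n * k, so fewer and longer paths give a smaller plain-text representation. This is why the size of the path cover matters for compression.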



Paper citation in several formats:
Rossignolo, E. and Comin, M. (2024). USTAR2: Fast and Succinct Representation of k-mer Sets Using De Bruijn Graphs. In Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOINFORMATICS; ISBN 978-989-758-688-0; ISSN 2184-4305, SciTePress, pages 368-378. DOI: 10.5220/0012423100003657

author={Enrico Rossignolo and Matteo Comin},
title={USTAR2: Fast and Succinct Representation of k-mer Sets Using De Bruijn Graphs},
booktitle={Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOINFORMATICS},


JO - Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - BIOINFORMATICS
TI - USTAR2: Fast and Succinct Representation of k-mer Sets Using De Bruijn Graphs
SN - 978-989-758-688-0
IS - 2184-4305
AU - Rossignolo, E.
AU - Comin, M.
PY - 2024
SP - 368
EP - 378
DO - 10.5220/0012423100003657
PB - SciTePress