String Searching in Referentially Compressed Genomes

Sebastian Wandelt; Ulf Leser

Topics: Bioinformatics & Pattern Discovery; Optimization

In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2012) - KDIR, pages 95-102, 2012, Barcelona, Spain

Authors: Sebastian Wandelt and Ulf Leser

Affiliation: Humboldt-Universität zu Berlin, Germany

Keyword(s): Genome Compression, Referential Compression, String Search.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Bioinformatics & Pattern Discovery; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Methodologies and Technologies; Operational Research; Optimization; Symbolic Systems

Abstract: Background: Improved sequencing techniques have led to large amounts of biological sequence data. One of the challenges in managing sequence data is efficient storage. Recently, referential compression schemes, which store only the differences between a to-be-compressed input and a known reference sequence, have gained considerable interest in this field. So far, however, sequences always have to be decompressed before they can be analyzed. There is a need for algorithms that work on compressed data directly and avoid costly decompression. Summary: In our work, we address this problem by proposing an algorithm for exact string search over compressed data. The algorithm works directly on referentially compressed genome sequences, without needing an index for each genome and using only partial decompression. Results: Our string search algorithm for referentially compressed genomes performs exact string matching for large sets of genomes faster than using an index structure (e.g., suffix trees) for each genome, especially for short queries. We think that this is an important step towards space- and runtime-efficient management of large biological data sets.
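
The idea in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm, only a minimal illustration under stated assumptions: the entry format (("ref", start, length) for a block copied from the reference, ("raw", text) for literal bases) and all function names are hypothetical. Matches that lie entirely inside a copied block are taken from a list of reference hits that is computed once and shared by all genomes; matches that cross entry boundaries are found by decompressing only a small window around each boundary.

```python
from bisect import bisect_left, bisect_right

# Hypothetical entry format (an assumption, not taken from the paper):
#   ("ref", start, length) copies reference[start:start + length]
#   ("raw", text)          stores literal bases

def entry_len(e):
    return e[2] if e[0] == "ref" else len(e[1])

def entry_text(reference, e, a, b):
    """Decompress only the entry-local slice [a, b) of one entry."""
    return reference[e[1] + a:e[1] + b] if e[0] == "ref" else e[1][a:b]

def decompress(reference, entries):
    """Full decompression; used here only to cross-check the search."""
    return "".join(entry_text(reference, e, 0, entry_len(e)) for e in entries)

def find_all(text, pattern):
    """All (possibly overlapping) start positions of pattern in text."""
    hits, i = [], text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits

def extract(reference, entries, offsets, lo, hi):
    """Partially decompress genome[lo:hi] without touching other entries."""
    lo, hi = max(lo, 0), min(hi, offsets[-1])
    parts, k = [], max(bisect_right(offsets, lo) - 1, 0)
    while k < len(entries) and offsets[k] < hi:
        a = max(lo - offsets[k], 0)
        b = min(hi - offsets[k], entry_len(entries[k]))
        parts.append(entry_text(reference, entries[k], a, b))
        k += 1
    return "".join(parts)

def search_compressed(reference, ref_hits, entries, pattern):
    """Exact search over one compressed genome; ref_hits is sorted."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    offsets = [0]  # genome position at which each entry starts
    for e in entries:
        offsets.append(offsets[-1] + entry_len(e))
    found = set()
    for e, pos in zip(entries, offsets):
        if e[0] == "ref":
            start, length = e[1], e[2]
            # Reuse reference hits that lie entirely inside the copied block.
            i = bisect_left(ref_hits, start)
            j = bisect_right(ref_hits, start + length - m)
            found.update(pos + h - start for h in ref_hits[i:j])
        else:
            found.update(pos + h for h in find_all(e[1], pattern))
    # Matches crossing an entry boundary: inspect a window of 2m - 2 bases.
    for boundary in offsets[1:-1]:
        lo = max(boundary - m + 1, 0)
        window = extract(reference, entries, offsets, lo, boundary + m - 1)
        found.update(lo + h for h in find_all(window, pattern))
    return sorted(found)

reference = "ACGTACGTTAGCACGT"
entries = [("ref", 0, 6), ("raw", "TT"), ("ref", 8, 8)]
pattern = "ACG"
ref_hits = find_all(reference, pattern)  # computed once, reused per genome
hits = search_compressed(reference, ref_hits, entries, pattern)
assert hits == find_all(decompress(reference, entries), pattern)
print(hits)
```

In this sketch the per-genome work depends on the number of entries and reference hits rather than on the genome length, which is the intuition behind searching without full decompression. A plain scan stands in for the reference index here; a real system would replace find_all on the reference with a proper index structure.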

CC BY-NC-ND 4.0

Paper citation in several formats:

Wandelt, S. and Leser, U. (2012). String Searching in Referentially Compressed Genomes. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2012) - KDIR; ISBN 978-989-8565-29-7; ISSN 2184-3228, SciTePress, pages 95-102. DOI: 10.5220/0004143400950102

@conference{kdir12,
author={Sebastian Wandelt and Ulf Leser},
title={String Searching in Referentially Compressed Genomes},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2012) - KDIR},
year={2012},
pages={95-102},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004143400950102},
isbn={978-989-8565-29-7},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2012) - KDIR
TI - String Searching in Referentially Compressed Genomes
SN - 978-989-8565-29-7
IS - 2184-3228
AU - Wandelt, S.
AU - Leser, U.
PY - 2012
SP - 95
EP - 102
DO - 10.5220/0004143400950102
PB - SciTePress