Authors: Hao Wan; Carolina Ruiz and Joseph Beck

Affiliation: Worcester Polytechnic Institute, United States

Keyword(s): Sequence Classification, Feature Generation, Mutated Subsequences.

Related Ontology Subjects/Areas/Topics: Algorithms and Software Tools; Bioinformatics; Biomedical Engineering; Data Mining and Machine Learning; Pattern Recognition, Clustering and Classification; Sequence Analysis

Abstract: In this paper, we present a new feature generation algorithm for sequence data sets called Mutated Subsequence Generation (MSG). Given a data set of sequences, the MSG algorithm generates features from these sequences by incorporating mutative positions in subsequences. We compare this algorithm with other sequence-based feature generation algorithms, including position-based, k-grams, and k-gapped pairs. Our experiments show that the MSG algorithm outperforms these other algorithms in domains in which the presence, not the specific location, of sequential patterns discriminates among classes in a data set.
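For readers unfamiliar with the baseline methods named in the abstract, the following is a minimal sketch of how position-based, k-gram, and k-gapped-pair features are commonly defined. It is not the authors' code and does not implement MSG, whose details are not given on this page. The function names and the choice to represent features as sets of present patterns are assumptions made for illustration.

```python
from typing import Set, Tuple


def position_features(seq: str) -> Set[Tuple[int, str]]:
    """Position-based features: one (index, symbol) pair per position.

    Hypothetical helper; features depend on where a symbol occurs.
    """
    return {(i, symbol) for i, symbol in enumerate(seq)}


def kgram_features(seq: str, k: int) -> Set[str]:
    """k-gram features: every contiguous length-k substring present in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kgapped_pair_features(seq: str, k: int) -> Set[Tuple[str, str]]:
    """k-gapped pairs: symbol pairs (a, b) with exactly k symbols between them."""
    return {(seq[i], seq[i + k + 1]) for i in range(len(seq) - k - 1)}


# Two toy sequences containing the same pattern at different offsets.
s1 = "ACGT"
s2 = "TTACGT"

# Presence-based features are shared despite the shift...
shared_kgrams = kgram_features(s1, 3) & kgram_features(s2, 3)
# ...while position-based features have nothing in common.
shared_positions = position_features(s1) & position_features(s2)
```

In this toy case `shared_kgrams` contains both 3-grams of `s1`, while `shared_positions` is empty, which illustrates the abstract's distinction between the presence and the specific location of a pattern.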

CC BY-NC-ND 4.0

Paper citation in several formats:
Wan, H.; Ruiz, C. and Beck, J. (2014). A Novel Feature Generation Method for Sequence Classification - Mutated Subsequence Generation. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2014) - BIOINFORMATICS; ISBN 978-989-758-012-3; ISSN 2184-4305, SciTePress, pages 68-79. DOI: 10.5220/0004808200680079

@conference{bioinformatics14,
author={Hao Wan and Carolina Ruiz and Joseph Beck},
title={A Novel Feature Generation Method for Sequence Classification - Mutated Subsequence Generation},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2014) - BIOINFORMATICS},
year={2014},
pages={68-79},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004808200680079},
isbn={978-989-758-012-3},
issn={2184-4305},
}

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOSTEC 2014) - BIOINFORMATICS
TI - A Novel Feature Generation Method for Sequence Classification - Mutated Subsequence Generation
SN - 978-989-758-012-3
IS - 2184-4305
AU - Wan, H.
AU - Ruiz, C.
AU - Beck, J.
PY - 2014
SP - 68
EP - 79
DO - 10.5220/0004808200680079
PB - SciTePress