USA: University of California, School of Information
and Computer Science.
Chuzhanova, N. A., Jones, A. J., & Margetts, S. (1998).
Feature selection for genetic sequence classification.
Bioinformatics, 14(2), 139-143.
Damashek, M. (1995, Feb 10). Gauging Similarity with
n-Grams: Language-Independent Categorization of Text.
Science, 267(5199), 843-848.
Dong, G., & Pei, J. (2009). Sequence Data Mining.
Heidelberg: Springer-Verlag Berlin.
Gini, C. (1912). Variabilità e mutabilità (Variability
and Mutability). C. Cuppini, Bologna, 156 pages.
Reprinted in Memorie di metodologica statistica
(Ed. Pizetti E, Salvemini, T). Rome: Libreria Eredi
Virgilio Veschi (1955).
Hall, M. A., & Smith, L. A. (1999). Feature Selection For
Machine Learning: Comparing a Correlation-based
Filter Approach to the Wrapper. Proceedings of the
Twelfth International FLAIRS Conference, (pp. 235-239).
Orlando, FL.
Hall, M., Frank, E., Holmes, G., Pfahringer, B.,
Reutemann, P., & Witten, I. H. (2009). The WEKA
Data Mining Software: An Update. SIGKDD
Explorations, 11(1), 10-18.
Harley, C. B., & Reynolds, R. P. (1987). Analysis of E.
coli promoter sequences. Nucleic Acids Research,
15(5), 2343-2361.
Hawley, D. K., & McClure, W. R. (1983). Compilation
and analysis of Escherichia coli promoter DNA
sequences. Nucleic Acids Research, 11(8), 2237-2255.
Huang, S.-H., Liu, R.-S., Chen, C.-Y., Chao, Y.-T., &
Chen, S.-Y. (2005). Prediction of Outer Membrane
Proteins by Support Vector Machines Using
Combinations of Gapped Amino Acid Pair
Compositions. Proceedings of the 5th IEEE
Symposium on Bioinformatics and Bioengineering
(BIBE’05), (pp. 113-120).
Ji, X., Bailey, J., & Dong, G. (2005). Mining Minimal
Distinguishing Subsequence Patterns with Gap
Constraints. Proceedings of the Fifth IEEE
International Conference on Data Mining.
Kohavi, R., & John, G. H. (1997). Wrappers for feature
selection. Artificial Intelligence, 97(1-2), 273-324.
Leslie, C. S., Eskin, E., Cohen, A., Weston, J., & Noble,
W. S. (2004). Mismatch string kernels for
discriminative protein classification. Bioinformatics,
20(4), 467-476.
Mah, A. K., Tu, D. K., Johnsen, R. C., Chu, J. S., Chen,
N., & Baillie, D. L. (2010). Characterization of the
octamer, a cis-regulatory element that modulates
excretory cell gene-expression in Caenorhabditis
elegans. BMC Molecular Biology, 11(19).
Noordewier, M. O., Towell, G. G., & Shavlik, J. W.
(1991). Training Knowledge-Based Neural Networks
to Recognize Genes in DNA Sequences. Advances in
Neural Information Processing Systems, 3.
Park, K.-J., & Kanehisa, M. (2003). Prediction of protein
subcellular locations by support vector machines using
compositions of amino acids and amino acid pairs.
Bioinformatics, 19(13), 1656-1663.
Reece-Hoyes, J. S., Shingles, J., Dupuy, D., Grove, C. A.,
Walhout, A. J., Vidal, M., & Hope, I. A. (2007).
Insight into transcription factor gene duplication from
Caenorhabditis elegans Promoterome-driven
expression patterns. BMC Genomics, 8(27).
Tan, P.-N., Kumar, V., & Steinbach, M. (2005).
Introduction to Data Mining. Boston, MA, USA:
Addison-Wesley.
Towell, G. G., Shavlik, J. W., & Noordewier, M. O.
(1990). Refinement of Approximate Domain Theories
by Knowledge-Based Neural Networks. In
Proceedings of the Eighth National Conference on
Artificial Intelligence, (pp. 861-866).
Wan, H., Barrett, G., Ruiz, C., & Ryder, E. F. (2013).
Mining Association Rules That Incorporate
Transcription Factor Binding Sites and Gene
Expression Patterns in C. elegans. In Proc. Fourth
International Conference on Bioinformatics Models,
Methods and Algorithms (BIOINFORMATICS 2013)
(pp. 81-89). Barcelona, Spain. SciTePress.
WormBase, release WS230. (2012, April 1). Retrieved
from http://www.wormbase.org/
Xing, Z., Pei, J., & Keogh, E. (2010, June). A Brief Survey
on Sequence Classification. ACM SIGKDD
Explorations, 12(1), 40-48.
A Novel Feature Generation Method for Sequence Classification - Mutated Subsequence Generation