Classifying Nucleotide Sequences and their Positions of Influenza A Viruses through Several Kernels

Issei Hamada, Takaharu Shimada, Daiki Nakata, Kouichi Hirata, Tetsuji Kuboyama

Abstract

In this paper, we classify nucleotide sequences of influenza A viruses, and positions within those sequences, by using both nucleotide sequence kernels and phylogenetic tree kernels. For the nucleotide sequence kernels, we regard a nucleotide sequence as a vector, a multiset, or a string. For the phylogenetic tree kernels, we use two kinds of trees. A relabeled phylogenetic tree is obtained from the phylogenetic tree reconstructed from a set of nucleotide sequences by replacing the leaf labels, which are indices of nucleotide sequences, with the nucleotides at a fixed position. A trimmed phylogenetic tree is obtained from the relabeled phylogenetic tree by trimming, as far as possible, the branches whose leaves carry the same label. We then observe which kernels are effective for classifying nucleotide sequences, as an analysis of pandemic occurrences and regions, and for classifying positions in nucleotide sequences, as an analysis of positions in packaging signals.
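The three views of a nucleotide sequence named in the abstract (vector, multiset, string) can be illustrated with a minimal Python sketch. The function names and the concrete kernel definitions below are assumptions chosen for illustration, not the authors' definitions: the vector view counts matching positions in aligned sequences, the multiset view uses a histogram intersection of nucleotide counts, and the string view uses the k-spectrum kernel of Leslie et al. (2002).

```python
from collections import Counter


def position_match_kernel(s, t):
    """Vector view (assumed): number of positions where two aligned
    sequences carry the same nucleotide."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a == b for a, b in zip(s, t))


def multiset_kernel(s, t):
    """Multiset view (assumed): size of the intersection of the two
    multisets of nucleotides, ignoring order."""
    cs, ct = Counter(s), Counter(t)
    return sum(min(cs[c], ct[c]) for c in cs)


def spectrum_kernel(s, t, k=3):
    """String view: k-spectrum kernel, the inner product of the
    k-mer count vectors of the two sequences."""
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(n * ct[kmer] for kmer, n in cs.items())


if __name__ == "__main__":
    # Toy sequences, not real influenza data.
    s, t = "ACGTAC", "ACGTTC"
    print(position_match_kernel(s, t))
    print(multiset_kernel(s, t))
    print(spectrum_kernel(s, t, k=3))
```

Each function returns a similarity value that could be assembled into a precomputed kernel matrix for a support vector machine; which of these views works best for a given classification task is the kind of question the paper investigates.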

Paper Citation


in Harvard Style

Hamada I., Shimada T., Nakata D., Hirata K. and Kuboyama T. (2015). Classifying Nucleotide Sequences and their Positions of Influenza A Viruses through Several Kernels. In Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-076-5, pages 342-347. DOI: 10.5220/0005251103420347


in Bibtex Style

@conference{icpram15,
author={Issei Hamada and Takaharu Shimada and Daiki Nakata and Kouichi Hirata and Tetsuji Kuboyama},
title={Classifying Nucleotide Sequences and their Positions of Influenza A Viruses through Several Kernels},
booktitle={Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2015},
pages={342-347},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005251103420347},
isbn={978-989-758-076-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Classifying Nucleotide Sequences and their Positions of Influenza A Viruses through Several Kernels
SN - 978-989-758-076-5
AU - Hamada I.
AU - Shimada T.
AU - Nakata D.
AU - Hirata K.
AU - Kuboyama T.
PY - 2015
SP - 342
EP - 347
DO - 10.5220/0005251103420347