Authors:
Issei Hamada 1; Takaharu Shimada 1; Daiki Nakata 1; Kouichi Hirata 1 and Tetsuji Kuboyama 2
Affiliations:
1 Kyushu Institute of Technology, Japan; 2 Gakushuin University, Japan
Keyword(s):
Kernels, Nucleotide Sequences, Positions in Nucleotide Sequences, Phylogenetic Trees.
Related Ontology Subjects/Areas/Topics:
Applications; Bioinformatics and Systems Biology; Kernel Methods; Pattern Recognition; Software Engineering; Theory and Methods
Abstract:
In this paper, we classify nucleotide sequences of influenza A viruses, and positions in those sequences, by using both nucleotide sequence kernels and phylogenetic tree kernels. For the nucleotide sequence kernels, we regard a nucleotide sequence as a vector, as a multiset, and as a string. For the phylogenetic tree kernels, we use two kinds of trees. A relabeled phylogenetic tree is obtained from the phylogenetic tree reconstructed from a set of nucleotide sequences by replacing each leaf label, which is the index of a nucleotide sequence, with the nucleotide at a fixed position in that sequence. A trimmed phylogenetic tree is obtained by trimming, as far as possible, the branches of the relabeled phylogenetic tree whose leaves carry the same label. We then examine which kernels are effective for classifying nucleotide sequences, as an analysis of pandemic occurrences and regions, and for classifying positions in nucleotide sequences, as an analysis of positions in packaging signals.
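As a rough illustration of what it can mean to regard a nucleotide sequence as a multiset, the sketch below computes a k-mer spectrum kernel: each sequence is mapped to the multiset of its length-k substrings, and the kernel value is the dot product of the two count vectors. This is a standard stand-in chosen for illustration, not necessarily the kernel used in the paper; the function names and the choice k = 3 are assumptions.

```python
from collections import Counter


def kmer_multiset(seq: str, k: int = 3) -> Counter:
    """Return the multiset (as counts) of all length-k substrings of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s: str, t: str, k: int = 3) -> int:
    """Dot product of the k-mer count vectors of s and t.

    Illustrative only: the abstract does not specify the exact kernel,
    so this spectrum kernel is an assumed example of a multiset view.
    """
    a, b = kmer_multiset(s, k), kmer_multiset(t, k)
    # Only k-mers present in both multisets contribute to the sum.
    return sum(count * b[kmer] for kmer, count in a.items())


if __name__ == "__main__":
    print(spectrum_kernel("ACGTACGT", "ACGTTTTT", k=3))
```

A kernel value of this kind can be assembled into a Gram matrix over all sequence pairs and passed to any kernel-based classifier; the vector and string views mentioned in the abstract would need different feature maps.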