Authors:
Kenji Satou 1; Yoshiki Shimaguchi 2; Kunti Robiatul Mahmudah 2; Ngoc Giang Nguyen 2; Mera Kartika Delimayanti 2,3; Bedy Purnama 4,2; Mamoru Kubo 1; Makiko Kakikawa 1 and Yoichi Yamada 1
Affiliations:
1 Institute of Science and Engineering, Kanazawa University, Kanazawa, Japan
2 Graduate School of Natural Science and Technology, Kanazawa University, Kanazawa, Japan
3 Department of Computer and Informatics Engineering, Politeknik Negeri Jakarta, Jakarta, Indonesia
4 Telkom School of Computing, TELKOM University, Bandung, Indonesia
Keyword(s):
Nuclear Protein, Subnuclear Location, Deep Learning, Feature Selection.
Abstract:
To perform its biomolecular function, a protein must be transported to a specific location in the cell. The same holds within the nucleus: a nuclear protein has its own location where it fulfils its role. In this study, the subnuclear location of nuclear proteins was predicted from protein sequence using a deep learning algorithm. As a dataset for the experiments, 319 non-homologous protein sequences with class labels corresponding to 13 classes of subcellular localization (e.g. "Nuclear envelope") were selected from public databases. To achieve better performance, various combinations of feature generation methods, classification algorithms, parameter tuning, and feature selection were tested. Among 17 methods for generating features from protein sequences, Composition/Transition/Distribution (CTD) generated the most effective features. These features were further selected using the randomForest package for R. Using the selected features, a very high accuracy (99.91%) was achieved by a deep neural network with seven hidden layers, the maxout activation function, and the RMSprop optimization algorithm.
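As a rough illustration of the CTD feature generation mentioned in the abstract, the following Python sketch computes the Composition and Transition parts for a single amino-acid property. The three-group hydrophobicity partition, the function names, and the omission of the Distribution part are assumptions made for brevity; the abstract does not specify which property groupings or software were used, so this should not be read as the authors' implementation.

```python
# Hydrophobicity grouping often used with CTD descriptors. This choice is an
# assumption: the abstract does not state which property groupings were used.
GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}
AA_TO_GROUP = {aa: g for g, aas in GROUPS.items() for aa in aas}


def encode(seq):
    """Map each residue to its group label, skipping unrecognised residues."""
    return "".join(AA_TO_GROUP[aa] for aa in seq.upper() if aa in AA_TO_GROUP)


def composition(seq):
    """Fraction of residues falling into each of the three groups."""
    enc = encode(seq)
    n = len(enc)
    return [enc.count(g) / n if n else 0.0 for g in "123"]


def transition(seq):
    """Fraction of adjacent residue pairs that switch between two groups.

    Returns three values, for the group pairs 1<->2, 1<->3 and 2<->3.
    """
    enc = encode(seq)
    pairs = list(zip(enc, enc[1:]))
    n = len(pairs)
    result = []
    for a, b in (("1", "2"), ("1", "3"), ("2", "3")):
        count = sum(1 for p in pairs if p in ((a, b), (b, a)))
        result.append(count / n if n else 0.0)
    return result


if __name__ == "__main__":
    example = "RKGACL"  # hypothetical toy sequence, not from the dataset
    print(composition(example))
    print(transition(example))
```

In a full CTD descriptor, the same calculation would be repeated for several physicochemical properties and combined with the Distribution values, giving one fixed-length feature vector per protein that can then be passed to feature selection and a classifier.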