APPENDIX
In this appendix we show how the performance of our
classifier varies with the size of the target labeled dataset
and the features used.
BIOINFORMATICS 2014 - International Conference on Bioinformatics Models, Methods and Algorithms