Hence, the learning task in SVM can be formalized
as the following constrained optimization problem:
$$\min_{w} \; \frac{\|w\|^2}{2} \qquad (5)$$

$$\text{subject to } \; y_i (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \ldots, n \qquad (6)$$
This is a convex optimization problem, which can be
solved with the standard Lagrange multiplier
method:
$$L_p = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \sigma_i \bigl( y_i (w \cdot x_i + b) - 1 \bigr) \qquad (7)$$
where the parameters $\sigma_i$ are called the Lagrange
multipliers. With the Lagrange multipliers, the
decision function can be written as follows:
$$f(x) = \operatorname{sgn}\Bigl( \sum_{i=1}^{n} \sigma_i y_i K(x_i, x) + b \Bigr) \qquad (8)$$
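To make the decision function (8) concrete, the following is a minimal sketch in Python. The Gaussian (RBF) kernel, the value of `gamma`, the toy support vectors, and the convention that a score of exactly zero maps to +1 are all assumptions made for illustration; they are not taken from this work.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel K(u, v); kernel choice and gamma are assumptions."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def svm_decision(x, support_vectors, labels, multipliers, b, kernel=rbf_kernel):
    """Evaluate f(x) = sgn(sum_i sigma_i * y_i * K(x_i, x) + b)."""
    score = sum(
        sigma_i * y_i * kernel(x_i, x)
        for sigma_i, y_i, x_i in zip(multipliers, labels, support_vectors)
    )
    # Assumption: a score of exactly zero is assigned to the positive class.
    return 1 if score + b >= 0 else -1

# Hypothetical toy model: one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
multipliers = [1.0, 1.0]
b = 0.0

print(svm_decision([1.9, 2.1], support_vectors, labels, multipliers, b))
print(svm_decision([0.1, -0.1], support_vectors, labels, multipliers, b))
```

In this toy setting, a point close to the positive support vector receives a positive score and a point close to the negative one receives a negative score, since the kernel value decays with distance.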
Additionally, the outputs of the individual classifiers
are combined by majority voting: an unknown sample
is assigned the class label that receives the most
votes. With k classifiers, the decision function of
our ensemble method can be written as:
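$$\mathrm{class}(x) = \arg\max_{c_i} \sum_{k} g\bigl(f_k(x), c_i\bigr) \qquad (9)$$
where $f_k(x)$ is the label predicted by the k-th classifier and $g(y, c)$ is an indicator that equals 1 when $y = c$ and 0 otherwise.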
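The majority-voting rule in (9) can be sketched in a few lines of Python. The three threshold classifiers and the sample values below are hypothetical placeholders rather than the base classifiers of the proposed method, and ties are broken in favour of the label that was seen first, which is an assumption.

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Return the class label predicted by the largest number of classifiers."""
    votes = Counter(clf(x) for clf in classifiers)
    # most_common(1) returns [(label, count)]; on a tie, the label that
    # was encountered first wins (an assumed tie-breaking rule).
    label, _ = votes.most_common(1)[0]
    return label

# Hypothetical base classifiers: simple thresholds on two expression values.
classifiers = [
    lambda x: "tumor" if x[0] > 1.0 else "normal",
    lambda x: "tumor" if x[1] > 0.5 else "normal",
    lambda x: "tumor" if sum(x) > 3.0 else "normal",
]

print(majority_vote(classifiers, [1.2, 0.9]))  # two of three vote "tumor"
print(majority_vote(classifiers, [0.2, 0.9]))  # two of three vote "normal"
```

Counting votes with a `Counter` corresponds to summing the indicator over classifiers for each candidate label and then taking the label with the largest sum.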
6 EXPECTED OUTCOME
At the end of this PhD research, a new ensemble
classification method will be available for predicting
prostate cancer from RNA-Seq data. Moreover, a
complete gene expression data simulator will be
developed. The simulator may support research on
non-parametric methods for cancer classification. By
identifying candidate genes from high-dimensional
gene expression data, the research is expected to
yield new insights into prostate cancer. If it proves
successful, our approach could be applied to other
types of disease.
ACKNOWLEDGEMENT
This work was supported by the National Research
Foundation of Korea (NRF) grant funded by the
Korea government (MSIP) (No. 2008-0062611) and
the Basic Science Research Program through the
National Research Foundation of Korea (NRF)
funded by the Ministry of Science, ICT & Future
Planning (No.2013R1A2A2A01068923).