1. Encode the position of each particle as c hyperclusters
$h_j$, j = 1, ..., c, then initialize the partition
matrix U and the first generation of particles.
2. Start the evolution under the current partition matrix
$U^t$ and fitness function J. The PSO updates the
position and velocity of each particle. Evolution
continues until an optimal particle that minimizes
the objective function J under the current partition
matrix $U^t$ is found. This particle gives the c
hyperclusters $h_j^{t+1}$, j = 1, ..., c.
3. Given the c hyperclusters $h_j^{t+1}$, calculate the
new fuzzy partition matrix for each data point. The
updated fuzzy partition matrix $U^{t+1}$ minimizes the
objective function J under the current fuzzy
hyperclusters.
4. If the algorithm converges or reaches the maximum
number of iterations, the computation stops.
Otherwise, go to Step 2.
The convergence condition is similar to that of the
iterative numerical solution.
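The four steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The objective J is not restated in this section, so the sketch assumes that each hypercluster $h_j$ is a hyperplane $(w_j, b_j)$, that J is the membership-weighted sum of squared point-to-hyperplane distances with fuzzifier m, and that the partition matrix is updated with the fuzzy c-means style rule that minimizes this assumed J for fixed hyperplanes. The PSO coefficients (inertia 0.7, acceleration 1.5), the convergence test on the change in U, and all function names are illustrative choices. Seeding one particle with the previous hyperclusters is also an added choice, made so that the assumed objective does not increase between outer iterations.

```python
import numpy as np

def distances(X, H):
    """Squared distance from each point to each hyperplane (assumed metric).
    H has shape (c, d + 1); each row is (w, b) for the plane w.x + b = 0."""
    W, b = H[:, :-1], H[:, -1]
    norms = np.sum(W ** 2, axis=1) + 1e-12       # avoid division by zero
    return (X @ W.T + b) ** 2 / norms            # shape (n, c)

def objective(X, U, H, m=2.0):
    """Assumed fuzzy objective: J = sum_ij u_ij^m * d_ij^2."""
    return float(np.sum(U ** m * distances(X, H)))

def update_membership(X, H, m=2.0):
    """Step 3: FCM-style update minimizing the assumed J for fixed H."""
    inv = (distances(X, H) + 1e-12) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def pso_step(X, U, c, rng, m=2.0, init=None,
             n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5):
    """Step 2: search for hyperclusters minimizing J under a fixed U."""
    dim = c * (X.shape[1] + 1)                   # Step 1: a particle encodes c planes
    pos = rng.normal(size=(n_particles, dim))
    if init is not None:
        pos[0] = init.ravel()                    # keep the previous solution in the swarm
    vel = np.zeros_like(pos)

    def fit(p):
        return objective(X, U, p.reshape(c, -1), m)

    pbest = pos.copy()
    pbest_val = np.array([fit(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([fit(p) for p in pos])
        better = vals < pbest_val                # update personal bests
        pbest[better] = pos[better]
        pbest_val[better] = vals[better]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest.reshape(c, -1)

def fuzzy_hyperclustering_pso(X, c, m=2.0, max_iter=30, tol=1e-5, seed=0):
    """Alternate PSO search (Step 2) and membership update (Step 3)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))              # Step 1: random partition matrix
    U /= U.sum(axis=1, keepdims=True)
    H, history = None, []
    for _ in range(max_iter):
        H = pso_step(X, U, c, rng, m, init=H)
        U_new = update_membership(X, H, m)
        history.append(objective(X, U_new, H, m))
        converged = np.max(np.abs(U_new - U)) < tol   # Step 4 (assumed test)
        U = U_new
        if converged:
            break
    return U, H, history
```

Calling `fuzzy_hyperclustering_pso(X, c=2)` on an `(n, d)` array returns the partition matrix, the hyperplane parameters, and the value of the assumed objective after each outer iteration. The inner PSO here runs for a fixed number of generations rather than until a provably optimal particle is found, which is a practical simplification of Step 2.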
4 CONCLUSIONS
We have presented a fuzzy hyper-clustering algorithm
for pattern classification in microarray gene
expression data. We formulated the objective function
for the proposed hyper-clustering and discussed
possible solutions using numerical and nature-inspired
optimization methods. The proposed clustering method
has the following potential advantages: 1) it is
suitable for overlapping data samples, as fuzzy
membership is utilized; 2) it is computationally
efficient, as the calculation of the hyperclusters may
use generalized eigenvalue decomposition, which is
simpler than the corresponding calculation in standard
SVMs; 3) it has the potential to handle nonlinear
data, as a kernelized version of the proposed method
can take advantage of the kernel trick for nonlinear
data analysis; 4) it is suitable for high-dimensional,
small-sample-size data sets, as the supervised version
of the proposed method can be viewed as a variant of
SVMs, which are currently known as the best solvers for
high-dimension, small-sample problems. Furthermore, the
proposed approach can be applied to many other areas
and is not confined to microarray gene expression
analysis.
BIOSIGNALS 2010 - International Conference on Bio-inspired Systems and Signal Processing