Clustering high-dimensional data remains a challenging
problem for fuzzy clustering algorithms because of
the concentration of distance phenomenon. Noise
and outliers in data sets make the partitioning
of data even more difficult because they affect the
computation of cluster centers. In this work, we analyzed
two fuzzy clustering algorithms for high-dimensional
data from the literature and two possibilistic versions
of the MFCM algorithm in terms of how correctly they
determine the final cluster prototypes in the presence of
noise. Our experiments showed that MFCM produced the most
accurate cluster centers as long as data items had real
values in only a few features, while its possibilistic versions
PMFCM and PMFCM HDD produced quite accurate
final cluster centers regardless of the number
of features in which noise points had real values.
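
The concentration of distance phenomenon mentioned above can be
illustrated with a small sketch. It is not part of our experiments:
the helper name relative_contrast, the uniform random data, and the
sample sizes are assumptions chosen only for illustration. The sketch
measures how much the farthest and the nearest point differ, relative
to the nearest one, as the number of features grows.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^n_dims.

    Hypothetical helper for illustration; not taken from the paper.
    """
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


rng = random.Random(0)  # fixed seed so the run is reproducible
contrasts = {d: relative_contrast(200, d, rng) for d in (2, 10, 100, 1000)}

for d, c in contrasts.items():
    print(f"dims={d:5d}  relative contrast={c:.3f}")
```

Under these assumptions, the relative contrast shrinks as the number
of features increases, so distances to different cluster centers
become hard to tell apart. This is the effect that makes
distance-based membership degrees less informative in high dimensions.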
Although the performance results for PMFCM
seem promising, before applying this method
to real data sets we plan to analyze the sensitivity
of fuzzy clustering algorithms to different
initializations, because we usually do not
have any a priori knowledge about the distribution of
data in practical applications. In future work, we
also plan to apply other possibilistic clustering models
to MFCM to make it less sensitive to outliers. Furthermore,
we aim to apply fuzzy clustering algorithms to
text and image data and compare their performance
with that of common crisp clustering algorithms.
