COMPARISION OF K-MEANS AND PAM ALGORITHMS USING CANCER DATASETS

Parvesh Kumar, Siri Krishan Wasan

Abstract

Data mining is the search for relationships and patterns in large databases, and clustering is one of its important techniques. Because gene expression data are complex and high-dimensional, classifying disease samples remains a challenge. Hierarchical and partitioning clustering methods are used to identify gene expression patterns that are useful for classifying samples. In this paper, we present a comparative study of two partitioning methods, k-means and PAM (Partitioning Around Medoids), for classifying cancer datasets.
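
To make the contrast between the two partitioning methods concrete, the following is a minimal sketch in Python and not the authors' implementation. The toy 2-D points, the function names (kmeans, pam, total_cost) and the choice of the first k points as starting centers are all assumptions made for illustration. The PAM sketch uses only a greedy swap phase from that naive start, not the full BUILD step described by Kaufman and Rousseeuw. It illustrates the one structural difference: k-means represents a cluster by the mean of its members, while PAM represents it by an actual data point (a medoid).

```python
import math


def nearest(p, reps):
    """Index of the representative (center or medoid) closest to point p."""
    return min(range(len(reps)), key=lambda j: math.dist(p, reps[j]))


def kmeans(points, k, iters=100):
    """Lloyd-style k-means; the first k points are the (assumed) initial centers."""
    centers = list(points[:k])
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Coordinate-wise mean of the cluster members.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]


def total_cost(points, medoids):
    """Sum of distances from each point to its closest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)


def pam(points, k):
    """Simplified PAM: greedy swap phase only, starting from the first k points."""
    medoids = list(points[:k])
    cost = total_cost(points, medoids)
    while True:
        best_cost, best_medoids = cost, None
        for i in range(k):
            for p in points:
                if p in medoids:
                    continue
                candidate = medoids[:i] + [p] + medoids[i + 1:]
                c = total_cost(points, candidate)
                if c < best_cost:
                    best_cost, best_medoids = c, candidate
        if best_medoids is None:  # no swap lowers the cost: stop
            break
        cost, medoids = best_cost, best_medoids
    return medoids, [nearest(p, medoids) for p in points]


# Hypothetical toy data: two tight groups and one outlier at (25, 25).
points = [(1.0, 1.0), (8.0, 8.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 9.0), (9.0, 8.0), (25.0, 25.0)]

centers, km_labels = kmeans(points, 2)
medoids, pam_labels = pam(points, 2)
print("k-means centers:", centers)
print("PAM medoids:   ", medoids)
```

On data like this, the outlier pulls the k-means center of its cluster away from the dense group, whereas the PAM medoid remains one of the observed points. This is the usual argument for PAM being less sensitive to outliers, at the price of a more expensive search.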



Paper Citation


in Harvard Style

Kumar P. and Krishan Wasan S. (2008). COMPARISION OF K-MEANS AND PAM ALGORITHMS USING CANCER DATASETS. In Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT, ISBN 978-989-8111-53-1, pages 255-258. DOI: 10.5220/0001868602550258


in Bibtex Style

@conference{icsoft08,
author={Parvesh Kumar and Siri Krishan Wasan},
title={COMPARISION OF K-MEANS AND PAM ALGORITHMS USING CANCER DATASETS},
booktitle={Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT},
year={2008},
pages={255-258},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001868602550258},
isbn={978-989-8111-53-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Software and Data Technologies - Volume 3: ICSOFT
TI - COMPARISION OF K-MEANS AND PAM ALGORITHMS USING CANCER DATASETS
SN - 978-989-8111-53-1
AU - Kumar P.
AU - Krishan Wasan S.
PY - 2008
SP - 255
EP - 258
DO - 10.5220/0001868602550258