# Document Clustering based on Genetic Algorithm using D-Individual

### Lim Choen Choi, Soon Cheol Park

#### Abstract

Document clustering using a genetic algorithm shows good performance. However, the genetic algorithm suffers from performance degradation caused by premature convergence. In this paper, we propose document clustering based on a Genetic Algorithm using D-Individual (DIGA) to solve this problem. A genetic algorithm relies on two factors: the diversity of the population and its capability to converge. The success of a genetic algorithm depends on these two factors; if they are used efficiently, a better solution can be obtained in less execution time. We apply DIGA to the Reuters-21578 text collection and demonstrate the effectiveness of our clustering algorithm. The results show that DIGA outperforms traditional clustering algorithms (K-means, Group Average, and the genetic algorithm) in various experiments.

