Comparison of Various Definitions of Proximity in Mixture Estimation

Ivan Nagy, Evgenia Suzdaleva, Pavla Pecherková


Classification is one of the frequently required tasks in data analysis, and a range of approaches to it exists. This paper focuses on classification via mixture model estimation, which detects density clusters in the data space and fits component models to them. The chosen function of proximity of the currently measured data to the individual mixture components, together with the component shape, plays a significant role in solving the mixture-based classification task. The paper considers definitions of proximity for several types of distributions describing the mixture components and compares their properties with respect to the speed and quality of the resulting estimation, interpreted as a classification task. Normal, exponential and uniform distributions are considered as the most important models for describing both Gaussian and non-Gaussian data. Illustrative experiments with the results of the comparison are provided.
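To make the notion of proximity concrete, the following minimal sketch assumes that the proximity of a scalar data item to a component is simply the value of that component's density at the data item, and that normalized proximities serve as component weights. This is an illustrative assumption rather than the definitions compared in the paper. All function names and parameter values are hypothetical.

```python
import math

def normal_proximity(x, mean, std):
    """Normal density at x (assumed proximity for a normal component)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def exponential_proximity(x, rate):
    """Exponential density at x; zero outside the support x >= 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def uniform_proximity(x, low, high):
    """Uniform density at x; zero outside the interval [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def component_weights(proximities, priors):
    """Normalize prior-weighted proximities so that they sum to one."""
    joint = [p * q for p, q in zip(proximities, priors)]
    total = sum(joint)
    if total == 0.0:
        # Assumption: if the data item lies outside every component's
        # support, fall back to the prior component probabilities.
        return list(priors)
    return [j / total for j in joint]

def classify(x, components, priors):
    """Return the index of the most probable component and all weights."""
    proximities = [density(x) for density in components]
    weights = component_weights(proximities, priors)
    label = max(range(len(weights)), key=weights.__getitem__)
    return label, weights

# Hypothetical two-component normal mixture with equal priors.
components = [
    lambda x: normal_proximity(x, mean=0.0, std=1.0),
    lambda x: normal_proximity(x, mean=5.0, std=1.0),
]
label, weights = classify(0.1, components, priors=[0.5, 0.5])
```

In this sketch a data item near zero is assigned to the first component. The exponential and uniform densities differ from the normal one in having bounded support, so their proximity can be exactly zero for some data, which is why the fallback to the priors is included.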


