context of three NASA datasets. Our clustering algorithm
was developed to discover two groupings based on
software complexity metrics. The proposed methodology
was shown to outperform the Gaussian mixture model, a
classical approach, in terms of both data modeling
capability and clustering accuracy. The second
application was spam detection using the Spambase
dataset from the UCI repository. The ultimate goal of
our study is to develop a powerful classifier that acts
as a dedicated filter, accurately distinguishing spam
emails from legitimate emails in order to improve the
blocking rate of spam emails and decrease the
misclassification rate of legitimate emails. The spam
filtering solution presented in this paper generates
accurate results in comparison with the Gaussian mixture
model, as our algorithm achieves higher precision and
recall. From the outcomes, we can infer
that the multivariate Beta mixture model could be a
competitive modeling approach for the software defect
and spam prediction problems. In other words, our model
produces enhanced clustering results largely due to its
flexibility.