Table 1. Prior data sets.
k1 k2 k3 k4 k5  Classification
0  1  3  0  1   C1, C2, C̄3
1  2  1  0  1   C̄1, C̄2, C̄3
0  0  0  4  1   C1, C2, C3
1  3  1  0  2   C̄1, C2, C̄3
1  0  2  0  4   C̄1, C̄2, C̄3
0  1  2  4  4   C1, C2, C̄3
2  0  1  4  3   C1, C2, C3
2  3  3  1  1   C1, C̄2, C3
2  4  4  0  2   C1, C2, C̄3
2  2  1  1  0   C̄1, C̄2, C3
0  1  3  4  1   C̄1, C2, C̄3
1  2  3  1  4   C1, C2, C3
4  1  0  1  2   C̄1, C2, C̄3
1  2  0  1  4   C1, C2, C̄3
3  0  1  2  2   C1, C̄2, C̄3
To find the best classification for a new data point (k1=1, k2=0, k3=3, k4=1, k5=4), the prior probabilities and conditional probabilities must first be derived so that they can be applied in the probability calculations of equations (1) and (4). Based on the data in Table 1, the prior probabilities of the classifications C1, C2, C3, C̄1, C̄2, and C̄3 are, respectively:
Among the 45 sample data, 9 belong to class C1. Since no specific attribute values are checked, the prior probability of class C1 is 9/45; that is, P(C1) = 9/45.
Among the 45 sample data, 10 belong to class C2. Since no specific attribute values are checked, the prior probability of class C2 is 10/45; that is, P(C2) = 10/45.
Among the 45 sample data, 5 belong to class C3. Since no specific attribute values are checked, the prior probability of class C3 is 5/45; that is, P(C3) = 5/45.
Among the 45 sample data, 6 belong to class C̄1. Since no specific attribute values are checked, the prior probability of class C̄1 is 6/45; that is, P(C̄1) = 6/45.
Among the 45 sample data, 5 belong to class C̄2. Since no specific attribute values are checked, the prior probability of class C̄2 is 5/45; that is, P(C̄2) = 5/45.
Among the 45 sample data, 10 belong to class C̄3. Since no specific attribute values are checked, the prior probability of class C̄3 is 10/45; that is, P(C̄3) = 10/45.
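These prior probabilities can be reproduced by counting label instances directly. The sketch below assumes one particular encoding of Table 1 (attribute tuples paired with each row's three labels, with `~C1` standing in for the complement class C̄1); the variable names are illustrative, not taken from the paper.

```python
from collections import Counter
from fractions import Fraction

# Table 1, encoded as ((k1, k2, k3, k4, k5), (label, label, label)).
# "~C1" stands for the complement class C̄1, and so on.
rows = [
    ((0, 1, 3, 0, 1), ("C1", "C2", "~C3")),
    ((1, 2, 1, 0, 1), ("~C1", "~C2", "~C3")),
    ((0, 0, 0, 4, 1), ("C1", "C2", "C3")),
    ((1, 3, 1, 0, 2), ("~C1", "C2", "~C3")),
    ((1, 0, 2, 0, 4), ("~C1", "~C2", "~C3")),
    ((0, 1, 2, 4, 4), ("C1", "C2", "~C3")),
    ((2, 0, 1, 4, 3), ("C1", "C2", "C3")),
    ((2, 3, 3, 1, 1), ("C1", "~C2", "C3")),
    ((2, 4, 4, 0, 2), ("C1", "C2", "~C3")),
    ((2, 2, 1, 1, 0), ("~C1", "~C2", "C3")),
    ((0, 1, 3, 4, 1), ("~C1", "C2", "~C3")),
    ((1, 2, 3, 1, 4), ("C1", "C2", "C3")),
    ((4, 1, 0, 1, 2), ("~C1", "C2", "~C3")),
    ((1, 2, 0, 1, 4), ("C1", "C2", "~C3")),
    ((3, 0, 1, 2, 2), ("C1", "~C2", "~C3")),
]

# Each of the 15 rows carries 3 labels, giving 45 label instances.
label_counts = Counter(c for _, labels in rows for c in labels)
total = sum(label_counts.values())
priors = {c: Fraction(n, total) for c, n in label_counts.items()}

for c in ("C1", "C2", "C3", "~C1", "~C2", "~C3"):
    print(c, f"{label_counts[c]}/{total}")
```

Running this prints the six counts over 45, matching the priors derived above (9/45, 10/45, 5/45, 6/45, 5/45, 10/45).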
Table 2. Subset of the data classified as C1.
k1 k2 k3 k4 k5  Classification
0 1 3 0 1 C1
0 0 0 4 1 C1
0 1 2 4 4 C1
2 0 1 4 3 C1
2 3 3 1 1 C1
2 4 4 0 2 C1
1 2 3 1 4 C1
1 2 0 1 4 C1
3 0 1 2 2 C1
Among the 9 sample data belonging to class C1, 2 have k1=1. Given the C1 classification, the probability that k1=1 is therefore 2/9: P(k1=1|C1) = 2/9.
Among the 9 sample data belonging to class C1, 3 have k2=0. Given the C1 classification, the probability that k2=0 is 3/9: P(k2=0|C1) = 3/9.
Among the 9 sample data belonging to class C1, 3 have k3=3. Given the C1 classification, the probability that k3=3 is 3/9: P(k3=3|C1) = 3/9.
Among the 9 sample data belonging to class C1, 3 have k4=1. Given the C1 classification, the probability that k4=1 is 3/9: P(k4=1|C1) = 3/9.
Among the 9 sample data belonging to class C1, 3 have k5=4. Given the C1 classification, the probability that k5=4 is 3/9: P(k5=4|C1) = 3/9.
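The counting pattern behind these conditional probabilities can be sketched as a small helper over the Table 2 rows; `cond_prob` is an illustrative name, not one used by the paper.

```python
from fractions import Fraction

# Rows of Table 2: the 9 samples classified as C1 (k1..k5).
c1_rows = [
    (0, 1, 3, 0, 1), (0, 0, 0, 4, 1), (0, 1, 2, 4, 4),
    (2, 0, 1, 4, 3), (2, 3, 3, 1, 1), (2, 4, 4, 0, 2),
    (1, 2, 3, 1, 4), (1, 2, 0, 1, 4), (3, 0, 1, 2, 2),
]

def cond_prob(attr_index: int, value: int) -> Fraction:
    """P(k_attr = value | C1): the fraction of C1 rows holding that value."""
    hits = sum(1 for row in c1_rows if row[attr_index] == value)
    return Fraction(hits, len(c1_rows))

print(cond_prob(0, 1))  # P(k1=1|C1) = 2/9
print(cond_prob(1, 0))  # P(k2=0|C1) = 3/9, reduced to 1/3
print(cond_prob(4, 4))  # P(k5=4|C1) = 3/9, reduced to 1/3
```

Note that `Fraction` reduces automatically, so 3/9 is displayed as 1/3.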
For the classifications C2, C3, C̄1, C̄2, and C̄3, the prior and conditional probabilities can be derived following the same calculation method used for the C1 classification, and the respective results can be obtained through Python programming.
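As a sketch of that computation, the script below multiplies each class's prior by the five conditional probabilities of the new data point, yielding the unnormalized validity degree for every class (the numerator of the posterior, with the denominator dropped). The table encoding, the `~C1` notation for C̄1, and the `score` function name are assumptions of this sketch, not the paper's own code.

```python
from fractions import Fraction

# Table 1, encoded as ((k1, k2, k3, k4, k5), (label, label, label));
# "~C1" stands for the complement class C̄1, and so on.
rows = [
    ((0, 1, 3, 0, 1), ("C1", "C2", "~C3")),
    ((1, 2, 1, 0, 1), ("~C1", "~C2", "~C3")),
    ((0, 0, 0, 4, 1), ("C1", "C2", "C3")),
    ((1, 3, 1, 0, 2), ("~C1", "C2", "~C3")),
    ((1, 0, 2, 0, 4), ("~C1", "~C2", "~C3")),
    ((0, 1, 2, 4, 4), ("C1", "C2", "~C3")),
    ((2, 0, 1, 4, 3), ("C1", "C2", "C3")),
    ((2, 3, 3, 1, 1), ("C1", "~C2", "C3")),
    ((2, 4, 4, 0, 2), ("C1", "C2", "~C3")),
    ((2, 2, 1, 1, 0), ("~C1", "~C2", "C3")),
    ((0, 1, 3, 4, 1), ("~C1", "C2", "~C3")),
    ((1, 2, 3, 1, 4), ("C1", "C2", "C3")),
    ((4, 1, 0, 1, 2), ("~C1", "C2", "~C3")),
    ((1, 2, 0, 1, 4), ("C1", "C2", "~C3")),
    ((3, 0, 1, 2, 2), ("C1", "~C2", "~C3")),
]

new_point = (1, 0, 3, 1, 4)  # k1=1, k2=0, k3=3, k4=1, k5=4

def score(cls: str) -> Fraction:
    """Unnormalized naive Bayes score: P(cls) * prod_i P(ki = vi | cls)."""
    cls_rows = [attrs for attrs, labels in rows if cls in labels]
    total_labels = sum(len(labels) for _, labels in rows)  # 45 instances
    s = Fraction(len(cls_rows), total_labels)              # the prior
    for i, v in enumerate(new_point):
        s *= Fraction(sum(1 for r in cls_rows if r[i] == v), len(cls_rows))
    return s

for c in ("C1", "C2", "C3", "~C1", "~C2", "~C3"):
    print(c, score(c))
```

For example, score("C1") is (9/45)(2/9)(3/9)(3/9)(3/9)(3/9), exactly the product of the prior and conditional probabilities derived above; the six scores can then be compared pairwise (Ci against C̄i).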
Comparing the resulting validity degrees leads to the conclusion that the best-fitting classification for k1=1, k2=0, k3=3, k4=1, k5=4 is (C1, C̄2, C3). However, after removing the denominator from the posterior probability calculation, these values no longer represent probabilities. They simply support a relative comparison of how well a given block of data fits each possible classification.