Then we construct and train several machine learning models, including the radial basis function (RBF) network, the naive Bayes classifier (NBC), and the random forest model, to obtain corresponding results for further analysis.
RBF (Radial Basis Function): The RBF neural network is an algorithm based on radial basis functions, proposed by J. Moody and C. Darken in 1988. It is a local approximation network that can approximate any continuous or discrete function with arbitrary accuracy (Bi et al., 2016), and can handle rules that are difficult to analyze within the system. It is quite effective in handling nonlinear classification and prediction problems.
The neural network consists of three layers: an input layer, a hidden layer, and an output layer. The input layer is the same as in other neural networks; in this article it represents the physicochemical test attributes of the red wine, and the datasets score these attributes, leading to the final quality prediction. Its structural diagram is shown in Figure 1. As shown in the figure, the input layer is (x1, x2, ..., xp), the hidden layer is (c1, c2, ..., ch), and the output layer is y; (w1, w2, ..., wm) represents the connection weights from the hidden layer to the classification of red wine quality grades (Bi et al., 2016). Each node in the hidden layer applies a nonlinear function, h(x), known as a radial basis function, which determines the connection weights to the output layer. The primary role of the hidden layer is to transform the low-dimensional input vector of p statistical features into a high-dimensional representation of h activations, which ultimately influences the quality assessment. This transformation enables the network to address cases that are linearly inseparable in low dimensions by making them separable in higher dimensions.
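As a hedged illustration of this lifting (the centers, width, and XOR-style points below are assumptions for the example, not values from the paper), the following sketch maps four points that no line can separate in two dimensions through two Gaussian hidden units; in the lifted space a simple linear threshold separates the classes:

```python
import numpy as np

# XOR points: not linearly separable in the original 2-D space.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Two Gaussian hidden units, centered (arbitrarily, for illustration)
# on the two class-1 points; sigma is a hypothetical width.
centers = np.array([[0, 1], [1, 0]], dtype=float)
sigma = 0.5

# h(x) = exp(-||x - c||^2 / (2 sigma^2)) for each center c.
dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
H = np.exp(-dists**2 / (2 * sigma**2))

# In the lifted space, the sum of the two activations separates
# the classes with a simple linear threshold.
print(H.sum(axis=1) > 0.5)  # [False True True False], matching y
```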
The central concept driving this process is the
kernel function, which ensures that the mapping from
input to output within the network is nonlinear, while
maintaining linearity in the network’s output with
adjustable parameters. By solving the network’s
weights directly through linear equations, the learning
process is significantly accelerated, and the risk of
getting stuck in local minima is minimized. The
activation function of a radial basis function neural
network is typically represented by a Gaussian
function.
$$R(x_p - c_i) = \exp\!\left(-\frac{\lVert x_p - c_i \rVert^2}{2\sigma^2}\right) \tag{1}$$
The structure of the radial basis function neural network can then be expressed as follows:
$$y_j = \sum_{i=1}^{h} \omega_{ij}\, \exp\!\left(-\frac{\lVert x_p - c_i \rVert^2}{2\sigma^2}\right) + b_j, \qquad j = 1, 2, \ldots, n \tag{2}$$
Among them, $x_p$ is the p-th input sample, $c_i$ is the i-th center point, $h$ is the number of nodes in the hidden layer, $n$ is the number of samples or classification outputs, and $b_j$ is the threshold of the j-th output neuron.
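A minimal Python sketch of equations (1) and (2) follows; it also illustrates the direct linear solve of the output weights described above. The choice of k-means for the centers, the width sigma, the number of hidden nodes, and the random stand-in data are all illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np
from sklearn.cluster import KMeans

def rbf_design(X, centers, sigma):
    """Hidden-layer matrix: R(x_p - c_i) = exp(-||x_p - c_i||^2 / (2 sigma^2)), eq. (1)."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-d**2 / (2 * sigma**2))

def fit_rbf(X, y, h=10, sigma=1.0):
    """Pick h centers by k-means, then solve weights and bias by linear least squares."""
    centers = KMeans(n_clusters=h, n_init=10, random_state=0).fit(X).cluster_centers_
    Phi = rbf_design(X, centers, sigma)
    Phi = np.hstack([Phi, np.ones((len(X), 1))])   # last column carries the threshold b_j
    W, *_ = np.linalg.lstsq(Phi, y, rcond=None)    # direct linear solve: no local minima
    return centers, W

def predict_rbf(X, centers, W, sigma=1.0):
    """Eq. (2): weighted sum of Gaussian activations plus threshold."""
    Phi = rbf_design(X, centers, sigma)
    Phi = np.hstack([Phi, np.ones((len(X), 1))])
    return Phi @ W

# Hypothetical usage on random stand-in data (11 physicochemical features):
X = np.random.rand(200, 11)
y = np.random.randint(3, 9, size=200).astype(float)  # stand-in wine quality scores
centers, W = fit_rbf(X, y, h=10, sigma=1.0)
print(predict_rbf(X[:5], centers, W))
```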
NBC: The naive Bayes classifier (NBC) is a very simple classification algorithm. For a red wine sample to be classified, the category with the highest probability of occurrence, given the observed attributes, is the category the sample is considered to belong to (Liang, 2019). The NBC model assumes that attributes are independent of each other; in real data, however, the attributes are correlated, and it is precisely this assumption that limits the use of the NBC model.
The Bayesian method computes the posterior probability from an assumed prior probability and the conditional probability obtained from the observed data under a given hypothesis:

$$P(C \mid Y) = \frac{P(Y \mid C)\, P(C)}{P(Y)} \tag{3}$$
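As a hedged numeric illustration (the values are made up, not from the paper): if P(C) = 0.3, P(Y | C) = 0.2, and P(Y) = 0.15, then P(C | Y) = (0.2 × 0.3) / 0.15 = 0.4, so observing Y raises the probability of class C from 0.3 to 0.4.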
Assume that each data sample Y = {y1, y2, ..., yn} is an n-dimensional vector, and that there are class labels C1, C2, ..., Cn.
Obtaining:
$$\max\{P(C_1 \mid Y),\, P(C_2 \mid Y),\, \ldots,\, P(C_n \mid Y)\} \tag{4}$$
This transforms the classification problem into a conditional probability problem. Since P(Y) is constant for all classes, the probability of each class C occurring under the condition Y is P(C | Y); the class C with the maximum probability is taken as the answer and determines the class label Ci.
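A minimal sketch of this decision rule follows (the class names and posterior values are hypothetical, not from the paper); dividing by the constant P(Y) cannot change which class attains the maximum, so it is skipped:

```python
import numpy as np

# Hypothetical unnormalized posteriors P(Y|C_i) * P(C_i) for three quality classes;
# the constant denominator P(Y) does not affect the ranking and is omitted.
classes = ["low", "medium", "high"]
scores = np.array([0.02, 0.11, 0.05])

label = classes[int(np.argmax(scores))]
print(label)  # "medium": the class with the maximum posterior
```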
Because a sample X often has a high dimension, the probability of an arbitrary combination of features is usually difficult to estimate, which is where the word "naive" in naive Bayes comes in. By assuming that the features are conditionally independent of each other, the number of parameters to be estimated is greatly reduced: each P(x_k | Ci) only needs to be estimated separately, and the results are then multiplied to obtain:
$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) \tag{5}$$
The prior probability can be obtained from the
training samples:
$$P(C_i) = \frac{s_i}{s} \tag{6}$$

where $s_i$ is the number of training samples belonging to class $C_i$ and $s$ is the total number of training samples.
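Putting equations (3) through (6) together, the following is a minimal counting-based sketch of the classifier; the discretization of the continuous wine attributes into bins and the random stand-in data are assumptions for illustration:

```python
import numpy as np
from collections import defaultdict

def fit_nbc(X, y):
    """Estimate P(C_i) = s_i / s (eq. 6) and each P(x_k | C_i) (eq. 5) by counting."""
    s = len(y)
    priors, cond = {}, defaultdict(dict)
    for c in np.unique(y):
        Xc = X[y == c]
        priors[c] = len(Xc) / s                              # eq. (6)
        for k in range(X.shape[1]):
            vals, counts = np.unique(Xc[:, k], return_counts=True)
            cond[c][k] = dict(zip(vals, counts / len(Xc)))   # P(x_k = v | C_i)
    return priors, cond

def predict_nbc(x, priors, cond):
    """Return argmax_i P(C_i) * prod_k P(x_k | C_i), eqs. (3)-(5)."""
    best, best_p = None, -1.0
    for c, p in priors.items():
        for k, v in enumerate(x):
            p *= cond[c][k].get(v, 1e-9)  # tiny floor for unseen feature values
        if p > best_p:
            best, best_p = c, p
    return best

# Hypothetical usage: wine attributes discretized into 3 bins per feature.
X = np.random.randint(0, 3, size=(100, 4))
y = np.random.randint(0, 2, size=100)
priors, cond = fit_nbc(X, y)
print(predict_nbc(X[0], priors, cond))
```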