3.3 Algorithms for Classification
Classification is the task of assigning labels to test patterns on the basis of previously labeled training patterns. The process is commonly divided into two phases: a learning phase, in which the classification algorithm is trained, and a classification phase, in which the trained algorithm labels new data. Machine learning itself comes in two broad types: supervised and unsupervised.
All of these algorithms take their data from a single collection, read it from a file, or obtain it through a database query. Among the most widely used machine learning algorithms is ID3 (Iterative Dichotomizer 3), developed by Ross Quinlan [11], which builds a decision tree from a collection of training examples.
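ID3 selects the attribute to split on by maximizing information gain, the reduction in class-label entropy. A minimal sketch of that criterion in Java is given below; the class counts in main are an illustrative toy split, not data from any cited source:

    import java.util.List;

    public class InfoGain {
        // Entropy of a node from its per-class counts: -sum(p_i * log2(p_i)).
        static double entropy(int[] classCounts) {
            int total = 0;
            for (int c : classCounts) total += c;
            double h = 0.0;
            for (int c : classCounts) {
                if (c == 0) continue;
                double p = (double) c / total;
                h -= p * (Math.log(p) / Math.log(2));  // log base 2
            }
            return h;
        }

        // Information gain of splitting a parent node into the given children.
        static double informationGain(int[] parent, List<int[]> children) {
            int total = 0;
            for (int c : parent) total += c;
            double remainder = 0.0;
            for (int[] child : children) {
                int size = 0;
                for (int c : child) size += c;
                remainder += (double) size / total * entropy(child);
            }
            return entropy(parent) - remainder;
        }

        public static void main(String[] args) {
            // Toy example: 9 positive / 5 negative instances split three ways.
            int[] parent = {9, 5};
            List<int[]> split = List.of(new int[]{2, 3}, new int[]{4, 0}, new int[]{3, 2});
            System.out.printf("gain = %.3f%n", informationGain(parent, split));  // ~0.247
        }
    }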
C4.5 is an extension of ID3: from a set of data containing class labels, it builds a classification model of the classes. It is a machine learning and data mining method that works well on categorization problems and on predicting the value of a target variable, and the distribution of the data with respect to that target is easily understood with the help of a tree classifier. J48 is an extended version of ID3 that adds features such as accounting for missing values, decision tree pruning, continuous attribute value ranges, rule derivation, and more. The J48 algorithm in the WEKA data mining tool is the Java implementation of the C4.5 method, and WEKA offers many options for tree pruning. Pruning can be applied to correct a potential over-fitting situation. In related algorithms, splitting is repeated until each leaf is pure, that is, until the categorization of the data is as accurate as feasible. The algorithm generates the rules that determine the specific identity of the data, and the objective is to grow the decision tree progressively until it reaches a balance of flexibility and accuracy [12]. In the resulting tree, the internal nodes test the characteristics of the data, the branches correspond to the outcomes of those tests, and the leaves represent the classes (records) [13].
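As an illustration, the sketch below shows how J48 might be driven through WEKA's Java API with two of its pruning-related options; the file name training.arff is a placeholder, and the snippet assumes the WEKA library is on the classpath:

    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class J48Demo {
        public static void main(String[] args) throws Exception {
            // Load a labeled dataset (placeholder file name).
            Instances data = new DataSource("training.arff").getDataSet();
            data.setClassIndex(data.numAttributes() - 1);  // class = last attribute

            J48 tree = new J48();
            // Pruning controls: -C sets the confidence factor (smaller values
            // prune more aggressively), -M sets the minimum instances per leaf.
            tree.setOptions(new String[] {"-C", "0.25", "-M", "2"});
            tree.buildClassifier(data);

            System.out.println(tree);  // prints the pruned decision tree
        }
    }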
The Bayesian method is used to estimate the likelihood of various hypotheses. The simplest type of Bayesian network is Naive Bayes, in which all attributes are assumed independent given the value of the class variable [14]. Naive Bayes is a straightforward method for building classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from a finite set. There is no single method for training such classifiers; rather, a family of methods is based on the same principle: all naive Bayes classifiers assume that the value of one feature is independent of the value of any other feature, given the class variable. An apple, for instance, might be described as a red, spherical fruit with a diameter of about 10 cm. Regardless of any possible correlations between the color, roundness, and diameter features, a naive Bayes classifier treats each of these properties as contributing independently to the probability that the fruit is an apple. Naive Bayes classifiers can be trained very efficiently for certain probability models in a supervised learning setting. In many practical situations, the parameters of naive Bayes models are estimated by maximum likelihood; in other words, the naive Bayes model can be employed without adopting Bayesian probability or any Bayesian procedures [20].
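To make the independence assumption concrete, the sketch below scores the apple example by multiplying the class prior with one likelihood per feature; every probability in it is invented purely for illustration:

    public class NaiveBayesApple {
        public static void main(String[] args) {
            // Invented training estimates (not from any real dataset):
            double pApple = 0.30;             // prior P(apple)
            double pRedGivenApple = 0.70;     // P(red | apple)
            double pRoundGivenApple = 0.90;   // P(round | apple)
            double pDiamGivenApple = 0.60;    // P(diameter ~ 10 cm | apple)

            double pOther = 0.70;             // prior P(not apple)
            double pRedGivenOther = 0.20;
            double pRoundGivenOther = 0.40;
            double pDiamGivenOther = 0.15;

            // Naive Bayes: multiply the prior by each feature likelihood,
            // treating the features as conditionally independent given the class.
            double scoreApple = pApple * pRedGivenApple * pRoundGivenApple * pDiamGivenApple;
            double scoreOther = pOther * pRedGivenOther * pRoundGivenOther * pDiamGivenOther;

            // Normalize to obtain the posterior P(apple | red, round, ~10 cm).
            double posterior = scoreApple / (scoreApple + scoreOther);
            System.out.printf("P(apple | features) = %.3f%n", posterior);
        }
    }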
CART stands for Classification and Regression Tree. It is a method for building a binary decision tree, in which every internal node splits into exactly two branches.
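CART commonly chooses each binary split by minimizing the Gini impurity of the two resulting children; a minimal sketch of that criterion follows, with invented two-class counts:

    public class GiniSplit {
        // Gini impurity of a node from its per-class counts: 1 - sum(p_i^2).
        static double gini(int[] counts) {
            int total = 0;
            for (int c : counts) total += c;
            double sumSq = 0.0;
            for (int c : counts) {
                double p = (double) c / total;
                sumSq += p * p;
            }
            return 1.0 - sumSq;
        }

        // Weighted impurity of a candidate binary split (two-class children).
        static double splitGini(int[] left, int[] right) {
            int l = left[0] + left[1], r = right[0] + right[1];
            double n = l + r;
            return (l / n) * gini(left) + (r / n) * gini(right);
        }

        public static void main(String[] args) {
            // Toy split of a parent with 6 positive and 4 negative instances.
            System.out.printf("split impurity = %.3f%n",
                    splitGini(new int[]{5, 1}, new int[]{1, 3}));
        }
    }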
The K-NN method determines the category of a test document by evaluating its degree of similarity to the k nearest training documents, storing a specified quantity of classification data. It is an instance-based learning algorithm that categorizes objects using the nearest training examples in feature space. The training examples are represented as points in a multidimensional feature space, and the space is divided into regions according to the categories of the training set. A point in the feature space is assigned to the category that is most common among its k nearest training examples. In most cases, the Euclidean distance is used to compute the distance between vectors. A crucial component of this technique is the availability of a similarity metric for finding the neighbors of a given document. The training step consists solely of storing the feature vectors and class labels of the training set. In the classification phase, the distances between the new vector representing an input document and all stored vectors are computed, and the k closest samples are selected. The category annotated for the document is then predicted from the categories of these nearest points [21].
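The sketch below implements the classification phase just described, combining Euclidean distance with a majority vote over the k nearest stored vectors; the tiny two-dimensional training set is invented for illustration:

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.HashMap;
    import java.util.Map;

    public class Knn {
        // Euclidean distance between two feature vectors.
        static double distance(double[] a, double[] b) {
            double sum = 0.0;
            for (int i = 0; i < a.length; i++) {
                double d = a[i] - b[i];
                sum += d * d;
            }
            return Math.sqrt(sum);
        }

        // Majority vote among the k training points closest to the query.
        static String classify(double[][] train, String[] labels, double[] query, int k) {
            Integer[] idx = new Integer[train.length];
            for (int i = 0; i < idx.length; i++) idx[i] = i;
            Arrays.sort(idx, Comparator.comparingDouble(i -> distance(train[i], query)));

            Map<String, Integer> votes = new HashMap<>();
            for (int i = 0; i < k; i++) votes.merge(labels[idx[i]], 1, Integer::sum);
            return votes.entrySet().stream()
                    .max(Map.Entry.comparingByValue())
                    .get().getKey();
        }

        public static void main(String[] args) {
            // Invented 2-D training set with two categories, A and B.
            double[][] train = {{1, 1}, {1, 2}, {2, 1}, {8, 8}, {8, 9}, {9, 8}};
            String[] labels = {"A", "A", "A", "B", "B", "B"};
            System.out.println(classify(train, labels, new double[]{2, 2}, 3));  // -> A
        }
    }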
For our research, we adopted the J48 classification method, which achieved high accuracy on the dataset sections in [14]. Furthermore, in [22] it obtained the highest classification accuracy (80.46%) for predicting a user's approval of re-orientation systems. The technique is well suited to discrete data, as in our case of predicting the workplace attribute in the new HCP 2019 database [1] using the J48 machine learning algorithm.
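A sketch of this setup through WEKA's Java API follows; the file name hcp2019.arff stands for a hypothetical ARFF export of the HCP 2019 data, and the ten-fold cross-validation shown is one conventional way to estimate accuracy, not necessarily the exact protocol of [22]:

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class J48Workplace {
        public static void main(String[] args) throws Exception {
            // Load the dataset (hypothetical ARFF export of HCP 2019).
            Instances data = new DataSource("hcp2019.arff").getDataSet();
            // Assume the class attribute (workplace) is the last column.
            data.setClassIndex(data.numAttributes() - 1);

            // J48 with WEKA's default pruning settings.
            J48 tree = new J48();
            tree.setOptions(new String[] {"-C", "0.25", "-M", "2"});

            // Estimate accuracy with 10-fold cross-validation.
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(tree, data, 10, new Random(1));
            System.out.println(eval.toSummaryString());

            // Train on the full set and print the resulting tree.
            tree.buildClassifier(data);
            System.out.println(tree);
        }
    }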