Using C4.5 Algorithm in Classification of Asthma in Children for
Suggesting Best Possible Treatment
Ashish Jain and Swati Sharma
BSSS College, Bhopal, India
Keywords: Data Mining, C4.5 Algorithm, Pre-Processing, Classification, Decision Tree.
Abstract: Millions of children worldwide suffer from asthma, and finding the best therapy is critical for treating the
disease and improving the quality of life of those afflicted. Data mining helps detect hidden
patterns and trends in massive datasets, such as those used in healthcare, and it has been used to identify and treat
disorders including asthma. The proposed work employs C4.5, a common decision tree algorithm, to
build a decision tree for selecting the optimal asthma medication for children. It involves
three primary data mining steps: pre-processing, classification, and decision tree construction. Finally, the results
were read from the decision tree wherever the dependent variable matched the provided conditions. Data mining
techniques of this kind can help healthcare practitioners make informed decisions.
1 INTRODUCTION
Data mining provides a mechanism for combining all
methodologies naturally, allowing them to emphasise
their strengths while concealing their limitations. As
more data accumulates in databases, classification
analysis has emerged as an active research area in data
mining. Today, numerous classification
algorithms are available, including Decision Trees,
Bayesian classification based on statistics, neural
networks, and others (Alexander 2020).
However, it is important to apply an
algorithm that can deal with all types of symptoms
and thus help select the best possible
treatment for asthma based on the symptoms entered
into the database. This supports proper management
of asthma in childhood across the age groups
0-4, 4-8, and 8-12 years.
Following this idea, the article presents a
method based on a classification algorithm that makes
use of a decision tree (Zhang and Wu 2018). In
cognitive psychology, human cognitive processes are
classified into two types: primary cognition and
secondary cognition. The cognitive process also draws
on a variety of techniques. For intricate cases or
objects, the most significant human strategy is to
first classify the items and then examine each
category further, which simplifies complicated matters.
Similarly, when building an application-specific
classification algorithm for asthma management
based on the number of symptoms presented in the
dataset, the same classify-first technique helps
keep the problem simple. Beyond the categorization
by dependent and independent variables, asthma is
also classified as intrinsic or extrinsic, and each
type is further subdivided into
severe and general asthma (Paul and Sherrif 2022).
The input for classification is a .csv file in
which the symptoms and their best possible line of
treatment are saved in and retrieved from the database.
The next section describes a data mining
classification method that can be used to find the most
effective treatment among the many medications that
are available. It also describes how to choose the
selection variable, in particular the classifier, the
testing options, and the various attribute classes. In
the final phase, the C4.5 algorithm is used to build a
decision tree and choose the best asthma therapies for
children (Berikov and Litvinenko 2020, Breiman et al
2020, Yoos and McMullen 2022).
2 CLASSIFICATION OF ASTHMA AS PER AGE GROUPS
Asthma is a worldwide disease, and its incidence is
rising. It is predominantly a lung condition that
manifests as the following symptoms:
Jain, A. and Sharma, S.
Using C4.5 Algorithm in Classification of Asthma in Children for Suggesting Best Possible Treatment.
DOI: 10.5220/0012610400003739
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Artificial Intelligence for Internet of Things: Accelerating Innovation in Industry and Consumer Electronics (AI4IoT 2023), pages 195-198
ISBN: 978-989-758-661-3
Proceedings Copyright © 2024 by SCITEPRESS – Science and Technology Publications, Lda.
Reversible airway blockage, either naturally or by therapy
Inflammation of the airways
Hypersensitivity of the airways
Asthma is a disorder in which the airways in the
lungs tighten and swell. It is common in children and
teenagers. During an asthma attack, the child's
lungs do not receive enough air,
and the child may cough or wheeze.
Based on the age categories of 0-4, 4-8, and 8-12
years, the planned work first classifies asthma as
intrinsic or extrinsic, and then as severe or general. It
employs a designed technique in which numerous
symptoms, such as coughing, wheezing, shortness of
breath, and chest tightness, are recorded in
a .csv file. The classification method
is then applied to the entered data, and the
classification is completed in conjunction with the
decision tree. Once asthma has been diagnosed, the
symptoms are quantified over a period of time, as
depicted in Table 1.
Table 1: Asthma Symptoms.
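
The .csv input and the age grouping described in this section could be handled along the following lines. This is only a minimal sketch: the column names (`age`, `cough`, `wheeze`, and so on), the treatment labels, and the sample rows are hypothetical, since the paper does not list the file layout. Assigning the boundary ages 4 and 8 to the upper group is also an assumption, because the stated ranges 0-4, 4-8, and 8-12 overlap at their edges.

```python
import csv
import io

# Hypothetical sample in the .csv layout assumed for this sketch;
# column names and values are illustrative, not taken from the paper.
SAMPLE_CSV = """age,cough,wheeze,shortness_of_breath,chest_tightness,treatment
3,yes,no,no,no,T1
6,yes,yes,no,yes,T2
10,no,yes,yes,yes,T3
"""


def age_group(age):
    """Map an age in years to one of the paper's three groups.

    Assumption: boundary ages 4 and 8 fall into the upper group,
    because the ranges 0-4, 4-8 and 8-12 overlap at their edges.
    """
    if 0 <= age < 4:
        return "0-4"
    if 4 <= age < 8:
        return "4-8"
    if 8 <= age <= 12:
        return "8-12"
    raise ValueError(f"age {age} is outside the 0-12 range")


def load_records(text):
    """Parse CSV text into dicts and attach the derived age group."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        row["age_group"] = age_group(float(row["age"]))
        records.append(row)
    return records


records = load_records(SAMPLE_CSV)
for record in records:
    print(record["age_group"], record["treatment"])
```

Reading from an actual file would only require replacing `io.StringIO(text)` with an opened file object.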
3 ATTRIBUTE SELECTION FOR ASTHMA DETECTION
The process of selecting attributes comprises
searching through all potential attribute combinations
in the data to discover which subset of attributes
works best for prediction. Variable selection is a
challenging and critical topic in machine learning. In
classification tasks, it can lead to greater accuracy or
lower computing costs.
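
As an illustration of searching through attribute combinations, the sketch below tries every non-empty subset and scores it. It is not the paper's procedure: the scoring rule (training accuracy of a majority-class lookup), the function names, and the sample records are all assumptions made for the example. The number of subsets doubles with each added attribute, so an exhaustive search like this is only practical for a handful of symptoms.

```python
from collections import Counter, defaultdict
from itertools import combinations


def subset_score(rows, attrs, target):
    """Fraction of rows predicted correctly when every combination of
    attribute values is mapped to its majority class (training accuracy)."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in attrs)].append(row[target])
    hits = sum(Counter(g).most_common(1)[0][1] for g in groups.values())
    return hits / len(rows)


def best_subset(rows, attrs, target):
    """Try every non-empty attribute subset, smallest first, and keep
    the first one that reaches the highest score."""
    best_score, best_attrs = -1.0, ()
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            score = subset_score(rows, subset, target)
            if score > best_score:
                best_score, best_attrs = score, subset
    return best_attrs, best_score


# Hypothetical records; attribute names and labels are illustrative only.
rows = [
    {"cough": "yes", "wheeze": "yes", "night": "no", "asthma": "severe"},
    {"cough": "yes", "wheeze": "no", "night": "no", "asthma": "general"},
    {"cough": "no", "wheeze": "yes", "night": "yes", "asthma": "severe"},
    {"cough": "no", "wheeze": "no", "night": "yes", "asthma": "general"},
]
attrs, score = best_subset(rows, ["cough", "wheeze", "night"], "asthma")
print(attrs, score)
```

Because the score is measured on the training rows themselves, it tends to favour larger subsets; a held-out set would give a fairer comparison.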
This approach was used in this article to assign a
value to each group of attributes. The C4.5 algorithm,
which produces a decision tree, is used to determine
the value of the dependent and independent variables.
To build decision trees from a training data set, this
approach employs the concept of information gain. At
each node, it chooses the symptom that best
divides the sample set into subsets enriched in one of
the two classes.
Figure 1: Classification of asthma as per Age group.
The splitting criterion is the normalized information
gain (entropy difference) that results from
choosing an attribute for splitting the data. The
attribute with the largest normalized information gain is
chosen to make the decision.
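
For reference, the quantities behind this criterion can be written out as follows. These are the standard C4.5 definitions rather than formulas given in the paper, and the symbols are notation introduced here: S is the sample set at a node, A a candidate attribute, S_v the subset of S in which A takes the value v, and p_c the proportion of samples in S that belong to class c.

```latex
\begin{align}
H(S) &= -\sum_{c} p_c \log_2 p_c \\
\mathrm{Gain}(S, A) &= H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\, H(S_v) \\
\mathrm{SplitInfo}(S, A) &= -\sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|} \\
\mathrm{GainRatio}(S, A) &= \frac{\mathrm{Gain}(S, A)}{\mathrm{SplitInfo}(S, A)}
\end{align}
```

The gain ratio in the last line is the normalized information gain: dividing by the split information penalizes attributes that fragment the data into many small subsets.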
3.1 Classification Criteria Based Decision Tree
A decision tree is a predictive model that connects
observations about an item to conclusions about its
target value. Such models are also known as
classification or regression trees.
The pseudocode of the C4.5 procedure, applied to the
recorded symptoms to obtain an accurate decision
tree, is as follows:
1. Check for the basic symptoms of asthma.
2. For each indefinite symptom of asthma taken as
selection attribute A, find the normalized
information gain from splitting on A.
3. Let a_best be the attribute with the highest
normalized information gain.
4. Create a decision node that splits on a_best.
5. Recurse on the sub-lists obtained by splitting
on a_best and add the resulting nodes as A_1, A_2, A_3,
etc.
6. These nodes become the children of the node A that
has the highest information gain.
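
The numbered steps above can be sketched in Python as shown below. This is a simplified illustration under stated assumptions, not the authors' implementation: it handles categorical attributes only and leaves out the thresholds for continuous values, the missing-value handling, and the pruning that full C4.5 includes. The function names, attribute names, and sample records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Normalized information gain from splitting `rows` on `attr`."""
    total = len(rows)
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attr] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


def build_tree(rows, attrs, target):
    """Return a nested-dict tree; leaves are class labels."""
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Step 1: base cases (pure node, or no attributes left to test).
    if len(set(labels)) == 1 or not attrs:
        return majority
    # Steps 2-3: score every attribute and keep the best one.
    a_best = max(attrs, key=lambda a: gain_ratio(rows, a, target))
    if gain_ratio(rows, a_best, target) == 0.0:
        return majority
    # Steps 4-6: split on a_best and recurse on each sub-list.
    remaining = [a for a in attrs if a != a_best]
    children = {}
    for value in sorted({row[a_best] for row in rows}):
        subset = [row for row in rows if row[a_best] == value]
        children[value] = build_tree(subset, remaining, target)
    return {a_best: children}


def classify(tree, row):
    """Follow the branches matching `row` until a leaf label is reached."""
    while isinstance(tree, dict):
        attr, children = next(iter(tree.items()))
        tree = children[row[attr]]
    return tree


# Hypothetical training records; names and labels are illustrative only.
rows = [
    {"age_group": "0-4", "wheeze": "yes", "cough": "yes", "severity": "severe"},
    {"age_group": "0-4", "wheeze": "no", "cough": "yes", "severity": "general"},
    {"age_group": "4-8", "wheeze": "yes", "cough": "no", "severity": "severe"},
    {"age_group": "4-8", "wheeze": "no", "cough": "no", "severity": "general"},
    {"age_group": "8-12", "wheeze": "yes", "cough": "yes", "severity": "severe"},
    {"age_group": "8-12", "wheeze": "no", "cough": "no", "severity": "general"},
]
tree = build_tree(rows, ["age_group", "wheeze", "cough"], "severity")
print(tree)
```

In this toy data the `wheeze` attribute separates the two classes completely, so it becomes the root and both of its children are leaves. Note that `classify` raises a `KeyError` for an attribute value that never appeared in training; a production version would need a fallback such as the majority class.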
For example, let the calculated information gain for
five nodes be A_1 = 0.98, A_2 = 0.86, A_3 = 0.5,
A_4 = 0.74, and A_5 = 0.6. Among these, the node with
the highest information gain, A_1, becomes the
decision node. With age
group as the selection attribute, the proposed work
obtained the following results from the experiments
performed on the symptoms table.
AI4IoT 2023 - First International Conference on Artificial Intelligence for Internet of things (AI4IOT): Accelerating Innovation in Industry
and Consumer Electronics
Table 2: Classification Criteria.
Table 3: Grading Table.
4 CONCLUSION
In the twenty-first century, database and internet
technologies have evolved rapidly. Meanwhile,
Management Information Systems and Network
Data Centres have come into extensive use. Data
access, data querying, and statistics have all matured
over time. However, higher-level decision
analysis and knowledge discovery are still
immature, resulting in the phenomena of the
"Information Explosion" and the "Knowledge
Explosion". Data mining appears to help
solve these problems. Even though numerous
sectors have investigated the classification
algorithm problem, no algorithm can yet successfully
handle a vast volume of data while also creating a
decision tree.
Based on the findings of the study, the
paper provides one data sheet that
lists the best feasible treatment for better
asthma management in children, which is the
primary purpose of this study (see Table 3).
* For children above 5 years only.
** For children below 5 years.
*** Evidence to date does not support using
a third long-term control medication added to
inhaled corticosteroids and acting inhaled β agonists
in order to avoid using systemic corticosteroid
therapy.
Asthma cannot be completely cured, but it is well
known that it can be managed from childhood and
kept under control
through such measures.
REFERENCES
Alexander I, Morton H. (2020), An Introduction to Neural
Computing, 2nd edition Neural Networks at Pacific
Northwest National Laboratory
Zhang Wx, Wu W Z (2018), Rough Set: Theory and
Technique, Beijing Science Press
Paul E. Keller, Sherif Hashem (2022), A Novel Approach
to Modelling and Diagnosing the Cardiovascular
System, Proceedings of the workshop on
environmental and energy applications of neural
networks
Berikov, A. Litvinenko (2020), Methods for Statistical
Data Analysis with Decision Trees, Novosibirsk
Sobolev Institute of Mathematics, 3(12-16)
Breiman, J. Friedman, R. A. Olshen, C. J. Stone (2020),
Classification and Regression trees.
Yoos, H. L., & McMullen, A. (2022), Illness narratives of
children with asthma. Pediatric Nursing, 22 (285-290)