Hybrid Learning System-Based Dental Caries Detection in X-Ray Images: Comparing Accuracy with Support Vector Machine

R. Vijay and G. Ramkumar

Department of ECE, Saveetha School of Engineering, SIMATS, Chennai, Tamil Nadu, India

Keywords: Novel Hybrid Learning System, Support Vector Machine, Deep Learning, Caries Detection, Accuracy,

Biomedical, Dental.

Abstract: The primary objective of this study is to conduct a comparison between the accuracy of Support Vector

Machines (SVM) and a Novel Hybrid Learning System (Novel HLS) for the detection of dental caries in

dental photos obtained from a dedicated dataset. In this investigation, a total of 86 samples were gathered and

divided into two distinct groups. Specifically, Group 1 comprised 43 samples that were processed using the

Novel HLS approach, while Group 2 consisted of 43 samples that underwent processing with the SVM

method. The dataset was imported as per the research protocol, and the Novel HLS code was developed

employing Google Colab software. To determine the sample size, an online statistical analysis tool was

employed, aiming for an 80% pretest power and an alpha value of 0.05. The sample size was calculated based

on prior research findings. Results revealed that SVM achieved an accuracy rate of 70.816%, while the novel

HLS method demonstrated a significantly higher accuracy of 97.221%. A statistical significance level of

0.012 (P < 0.05) indicated that there exists a noteworthy disparity in accuracy between the two methods. The

dataset substantiates the observation that the Novel HLS approach outperforms SVM by a significant margin

in terms of its predictive capabilities for dental caries detection.

1 INTRODUCTION

Dental caries is a condition that affects millions of individuals and entails the gradual deterioration of tooth structure. The terms "normal," "mild," "moderate," and "severe" denote the extent to which the condition has progressed (Machiulskiene 2019), with "normal" signifying the initial stage. Detecting dental caries at an early stage might obviate the need for more invasive surgical procedures, resulting in substantial long-term savings. In biomedical practice, bitewing radiography is considered the preferred approach for identifying demineralized proximal caries, which are notoriously challenging to diagnose using clinical methods alone (Abuzenada 2019). Combining bitewing radiography with a comprehensive visual examination makes the diagnosis of proximal caries relatively straightforward. Additionally, technologies such as fibre-optic transillumination and the fluorescence-based DIAGNOdent offer alternative means of detecting dental cavities.

The decayed missing filled teeth index (DMFT) is

a pivotal metric for assessing caries-related

conditions, relying on demographic data (Abuzenada

2019, Irfan 2020). This index allows the

determination of the proportion of permanent teeth

affected by caries. Recognizing that a variety of

factors, including inadequate oral hygiene practices,

poor dietary habits, dental interventions, and financial

constraints, can influence oral health, establishing the

DMFT and understanding associated risks becomes a

crucial initial step in constructing personalized oral

preventive strategies.
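As a minimal sketch of how the DMFT index is computed, the example below sums the decayed, missing, and filled permanent teeth per person and averages the score over a group. The record layout, the function names, and the default of 28 examined teeth are assumptions made for illustration; they are not taken from the cited studies.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToothCounts:
    """Per-person counts of permanent teeth (hypothetical record layout)."""
    decayed: int
    missing: int  # missing because of caries
    filled: int


def dmft_score(person: ToothCounts) -> int:
    """DMFT for one person: decayed + missing + filled permanent teeth."""
    return person.decayed + person.missing + person.filled


def mean_dmft(population: list[ToothCounts]) -> float:
    """Mean DMFT across a surveyed group."""
    if not population:
        raise ValueError("population must not be empty")
    return sum(dmft_score(p) for p in population) / len(population)


def affected_proportion(person: ToothCounts, teeth_examined: int = 28) -> float:
    """Share of examined permanent teeth with caries experience.

    The default of 28 teeth is an assumption for this sketch.
    """
    return dmft_score(person) / teeth_examined


survey = [ToothCounts(decayed=2, missing=1, filled=3), ToothCounts(0, 0, 0)]
print(mean_dmft(survey))
```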

Article 9 of the legislation governing oral health in Korea mandates surveys of the biomedical oral health of children within the country (Hu et al. 2014).

A total of 5880 papers related to the biomedical survey have been published on IEEE Xplore since 2021, each offering distinct advantages. Among the approaches reported, methods relying on electrical resistance and tooth self-fluorescence seem most promising for accurately detecting the early stages of enamel demineralization.

Vijay, R. and Ramkumar, G.
Hybrid Learning System-Based Dental Caries Detection in X-Ray Images: Comparing Accuracy with Support Vector Machine.
DOI: 10.5220/0012572000003739
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Artificial Intelligence for Internet of Things: Accelerating Innovation in Industry and Consumer Electronics (AI4IoT 2023), pages 122-127
ISBN: 978-989-758-661-3

For this research, train and test datasets were created from a dataset of 3000 periapical radiographic images, divided in an 80:20 ratio. A pre-trained GoogLeNet Inception v3 CNN (Prakash et al. 2019) was used for preprocessing and transfer learning. Diagnostic accuracy, sensitivity, specificity, positive and negative predictive values, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC) were calculated to assess the detection and diagnostic performance of the deep CNN (DCNN) algorithm (Loan et al. 2022).
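The 80:20 split and the diagnostic measures named above can be illustrated with a short sketch. It assumes binary labels (1 = caries, 0 = no caries); the function names and the sample labels are hypothetical and do not reproduce the study's data or code.

```python
import random


def split_indices(n_items, test_fraction=0.2, seed=0):
    """Shuffle item indices and return (train, test) index lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = round(n_items * test_fraction)
    return indices[n_test:], indices[:n_test]


def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV from binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


train_idx, test_idx = split_indices(3000)  # 2400 training, 600 test images
metrics = diagnostic_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
print(metrics)
```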

The distribution of the 3000-image collection

indicated that premolars were present in 25.9% of the

maxilla and molars in 25.6% of the mandible. Based

on diagnosis, the same dataset was categorized into

non-dental caries (premolars: 26.1%, molars: 24.3%)

and dental caries (premolars: 23.9%, molars: 25.7%).

Notably, caries originating outside the teeth were

more prevalent in premolars compared to those

arising within the teeth.

Subsequently, the entire image collection was resized to 299 × 299 pixels and stored in JPEG format (Almasri et al. 2019). X-rays find diverse applications in our technologically advanced society; this article limits the discussion to their significance in medicine. The interpretation of X-rays holds particular importance in disease prevention and diagnosis because it can reveal concealed abnormalities. X-rays have been a vital tool in medical imaging since Röntgen discovered their ability to differentiate various bone structures (Bowling et al. 2002).

The presence of noise poses challenges for current biomedical data processing approaches. The core aim of this research is to employ the Novel HLS to detect dental caries in X-ray images with improved accuracy, and to compare the outcomes with those obtained using SVM.

2 MATERIALS AND METHODS

Each of the two groups comprises 43 samples. Group 1 samples were processed using the Novel HLS methodology, while Group 2 samples were processed using the well-established SVM classifier.

The research was conducted on a computer equipped with a 1024 × 768 pixel display, a 64-bit central processing unit, and 8 GB of random access memory. The Novel HLS code was written and executed on the Google Colab platform. Once the program was in place, it was trained on the dental caries dataset, and testing was then carried out using the trained model. The accuracy achieved by the Novel HLS was compared with the accuracy attained by the existing SVM classifier.

Performance is evaluated using the accuracy values obtained in the investigation. After the dataset was imported, it was visualized and then preprocessed to remove any erroneous or noisy data. The final step is an assessment of the reliability of the findings.

A hybrid learning system designed for dental caries detection integrates various approaches, blending traditional image processing techniques with modern machine learning methods. This integration aims to enhance the accuracy and efficiency of identifying dental caries in dental photographs, ultimately improving the diagnostic process. Various imaging tools such as X-rays, intraoral cameras, and 3D scans can aid in diagnosing dental caries, commonly known as tooth decay or cavities. To construct a hybrid learning system for dental caries detection, the following steps can be outlined:

1. Collect a diverse set of dental photographs, including images of healthy teeth and those affected by caries. Preprocess these images by enhancing contrast, reducing noise, and standardizing image quality. Accurate input data is crucial for the hybrid learning system's effectiveness.
2. Employ traditional image processing methods to extract relevant features from dental images. Techniques like edge detection, texture analysis, and morphological operations can be used to highlight areas of interest such as cavities or enamel demineralization.
3. Convert the extracted features into a suitable format for machine learning algorithms. This might involve vectorization or encoding to make the data understandable by machine learning models.
4. Train machine learning models using the transformed features as input. Models like Convolutional Neural Networks (CNNs) and hybrid architectures are effective in recognizing complex patterns and relationships in data, enabling accurate predictions about dental caries onset. Transfer learning can further enhance accuracy by fine-tuning pre-trained models with dental imaging data.
5. Combine predictions generated by multiple machine learning models to create an ensemble that surpasses the performance of individual models. Ensemble methods such as bagging and boosting can significantly enhance the overall performance and robustness of the caries detection system.
6. Validate the hybrid learning system's performance using clinical data and annotations provided by subject matter experts. Fine-tune the system based on feedback from dental professionals to improve accuracy and clinical usefulness.
7. Integrate the developed hybrid learning system with existing dental software or imaging systems for seamless incorporation into dental practices. Design a user-friendly interface that allows dentists to submit photographs and receive diagnostic results efficiently.
8. Maintain an updated and enhanced hybrid learning system by incorporating new data, advancements in image processing and machine learning, and feedback from dental practitioners. This iterative process ensures the system's ongoing performance and relevance.

By combining the strengths of both traditional image processing techniques and modern machine learning methods, a hybrid learning system can offer more accurate and efficient dental caries detection, contributing to improved patient care and diagnosis.
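The preprocessing described above (contrast enhancement and noise reduction) can be sketched as follows. This is an illustrative example only: it assumes a grayscale image stored as a list of rows of integer pixel values, and the choice of min-max contrast stretching and a 3 × 3 median filter is an assumption, since the text does not name specific filters.

```python
from statistics import median_low


def stretch_contrast(image, lo=0, hi=255):
    """Linearly rescale pixel values so they span the range [lo, hi]."""
    flat = [value for row in image for value in row]
    v_min, v_max = min(flat), max(flat)
    if v_max == v_min:
        # A flat image carries no contrast to stretch.
        return [[lo for _ in row] for row in image]
    scale = (hi - lo) / (v_max - v_min)
    return [[round(lo + (value - v_min) * scale) for value in row]
            for row in image]


def median_filter3(image):
    """3 x 3 median filter; border pixels use the neighbours that exist."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(height, r + 2))
                      for cc in range(max(0, c - 1), min(width, c + 2))]
            result[r][c] = median_low(window)
    return result


# A single bright outlier (salt noise) in an otherwise uniform patch.
noisy = [[10, 10, 10],
         [10, 200, 10],
         [10, 10, 10]]
print(median_filter3(noisy))
print(stretch_contrast([[50, 100], [150, 200]]))
```

The median filter is used here because it suppresses isolated outliers while keeping edges sharper than a mean filter would; other denoising choices are equally compatible with the steps listed.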

Figure 1: Process flow for determining accuracy using the modified Novel HLS.

The Novel HLS algorithm is implemented on the Google Colab platform (Acharya et al. 2018). A hybrid learning system integrates two distinct algorithms in such a way that the combined output exhibits superior accuracy. It employs supervised learning techniques, namely Convolutional Neural Networks (CNN), a deep learning model, and Support Vector Machines (SVM), for both regression and classification tasks. CNN and SVM are the two primary methodologies employed. By comparison, the k-nearest neighbors algorithm (KNN or k-NN) is a supervised learning classifier that predicts the class of a single data point based on its neighbouring data points.
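The k-nearest neighbors classifier mentioned above can be illustrated with a minimal sketch. The two-dimensional feature vectors, the labels, and the function name are hypothetical; k-NN is shown only to make the description concrete and is not the method evaluated in this paper.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors for six images.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["healthy", "healthy", "healthy", "caries", "caries", "caries"]

print(knn_predict(points, labels, (0.2, 0.3)))
print(knn_predict(points, labels, (5.5, 5.2)))
```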

A CNN possesses the ability to automatically

extract relevant information from its input through a

series of hierarchical convolutional layers. This

distinctive capability sets CNNs apart from other

types of neural networks. These convolutional layers

often consist of multiple filters or kernels that analyse

the input data to generate feature maps. These feature maps highlight patterns and edges present in the input data, which may be sourced from text files or image files.
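To make the idea of a feature map concrete, the sketch below slides a single 3 × 3 kernel over a small grayscale image and applies a ReLU activation. The Sobel-style kernel and the toy image are assumptions chosen for illustration; in a trained CNN the kernel weights are learned rather than fixed.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the feature map.

    As in most CNN layers, this is a cross-correlation (the kernel is not flipped).
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k_h) for j in range(k_w))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


# Sobel-style kernel that responds to dark-to-bright vertical edges.
VERTICAL_EDGE = [[-1, 0, 1],
                 [-2, 0, 2],
                 [-1, 0, 1]]

# Toy image: dark left half, bright right half.
image = [[0, 0, 10, 10] for _ in range(4)]

print(relu(conv2d_valid(image, VERTICAL_EDGE)))
```

Each output value is large where the kernel's pattern matches the underlying patch, which is why a uniform region produces a zero response and an edge produces a strong one.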

In a hybrid deep learning network, the

conventional CNN softmax layer is substituted with a

non-linear SVM-based classification layer. This layer

is integrated into the network structure to optimize the

utilization of acquired features and enhance overall

network stability. This modification aims to improve

the network's performance by harnessing the

strengths of both CNN and SVM techniques.
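The replacement of the softmax layer by an SVM classifier can be sketched as training a max-margin classifier on feature vectors produced by the convolutional layers. The example below is a simplified illustration under stated assumptions: it uses a linear SVM trained by sub-gradient descent on the hinge loss (the text describes a non-linear SVM layer), and the two-dimensional feature vectors are hypothetical stand-ins for CNN features. It is not the authors' implementation.

```python
def train_linear_svm(features, labels, lr=0.1, reg=0.01, epochs=200):
    """Linear SVM trained by sub-gradient descent on the hinge loss.

    `labels` must be +1 or -1. Each step lowers
    reg / 2 * ||w||^2 + max(0, 1 - y * (w . x + b)) for one sample.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score < 1:
                # Sample is inside the margin: hinge and regulariser both act.
                weights = [w - lr * (reg * w - y * xi)
                           for w, xi in zip(weights, x)]
                bias += lr * y
            else:
                # Sample is outside the margin: only the regulariser acts.
                weights = [w - lr * reg * w for w in weights]
    return weights, bias


def svm_predict(weights, bias, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Hypothetical CNN feature vectors: +1 = caries, -1 = no caries.
cnn_features = [(2.0, 2.0), (3.0, 2.5), (2.5, 3.0),
                (-2.0, -2.0), (-3.0, -2.5), (-2.5, -3.0)]
targets = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(cnn_features, targets)
print([svm_predict(w, b, x) for x in cnn_features])
```

In practice a library implementation with a non-linear kernel (for example an RBF kernel) would usually take the place of this hand-written trainer; the sketch only shows where the SVM sits relative to the extracted features.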

The project's proposed workflow is illustrated in Figure 1. Google Colab plays a central role in this workflow, as the dataset is imported through code written in Colab. Once the dataset is imported and visualized, the subsequent stage entails data preparation. In this phase, the error figures stored in Google Drive are cross-referenced with the mounted code. Following the completion of this stage, the accuracy of the dental caries detection system employing the Novel HLS is evaluated and compared against the accuracy of the existing SVM classifier.

3 STATISTICAL ANALYSIS

The validity of the proposed study and the research

methodologies utilized previously were assessed using

the SPSS software program. In this study, the mean


accuracy scores were the dependent variables, while

the independent variables were the caries images. The

level of significance was ascertained through the

application of a T-test for independent samples.
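For readers who wish to see what the independent-samples t-test computes, the sketch below derives the t statistic and degrees of freedom for two groups, both with equal variances assumed (pooled) and not assumed (Welch). The accuracy values are hypothetical placeholders, not the study's data, and the p-value lookup that SPSS performs is omitted.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b, equal_var=True):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = variance(group_a), variance(group_b)  # sample variances
    diff = mean(group_a) - mean(group_b)
    if equal_var:
        # Student's t-test with a pooled variance estimate.
        pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
        std_err = math.sqrt(pooled * (1 / n_a + 1 / n_b))
        dof = n_a + n_b - 2
    else:
        # Welch's t-test with Welch-Satterthwaite degrees of freedom.
        std_err = math.sqrt(var_a / n_a + var_b / n_b)
        dof = (var_a / n_a + var_b / n_b) ** 2 / (
            (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
        )
    return diff / std_err, dof


# Hypothetical accuracy scores for two classifiers (placeholder values).
hls_scores = [97.1, 97.3, 97.2, 97.4, 97.0]
svm_scores = [70.5, 71.0, 70.9, 70.7, 71.1]

print(independent_t(hls_scores, svm_scores, equal_var=True))
print(independent_t(hls_scores, svm_scores, equal_var=False))
```

Reporting both variants mirrors the two rows of a typical SPSS independent-samples output, where the choice between them is guided by Levene's test for equality of variances.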

4 RESULTS

Fig. 2. Statistical analysis using the SPSS tool to find the accuracy of dental caries detection in X-ray images.

Table 1 presents a Bayesian estimation of the coefficient, in which accuracy serves as the dependent variable and the model is based on groups. The analysis assumes standard reference priors and reports a variance of 0.000. The table provides the mode and mean values for the data, along with the lower and upper bounds of the 95% credible interval for each group.

Table 2 contrasts the Novel HLS and SVM classifiers. The Novel HLS classifier shows a higher mean accuracy of 97.221, compared with a mean of 70.816 for the SVM classifier, based on testing with 43 samples per group. The two classifiers also differ in their standard deviations.

Figure 2: Bar chart comparing Novel HLS and SVM accuracy. Novel HLS exhibits significantly higher accuracy (approximately 97.221% ± 2%) than SVM (about 70.816% ± 2%), with 95% error bars.

Table 3 reports the independent-samples t-test for the two groups. The outcomes indicate a significant difference in accuracy between the two groups, with a mean difference of 26.3674 and a standard error of the difference of 0.002280. The t-test yields a value of 115.638, signifying that the difference between the means of the two groups is statistically significant (P < 0.05).

Table 1: Bayesian estimation of coefficient.

Groups      Posterior                          95% Confidence Interval
            Mode / Mean      Variance          Lower Bound    Upper Bound
Novel HLS   97.651           .000              97.3           98.0
SVM         70.816           .000              70.5           71.1

Table 2: T-test comparison of the Novel HLS and the SVM classifier.

            Groups      Mean      Std. Deviation   Std. Error Mean
Accuracy    Novel HLS   97.221    0.12893          .001966
            SVM         70.816    0.7572           .001155

Table 3: Independent sample test.

Accuracy                      F        Sig.   t         df       Sig. (2-tailed)   Mean diff   Std. Error Difference   Lower     Upper
Equal variance assumed        17.649   .032   115.638            .012              26.3674     .002280                 25.9140   26.8209
Equal variance not assumed                    115.638   67.890   .012              26.3674     .002280                 25.9124   26.8225


5 DISCUSSION

A substantial difference in accuracy was observed between the SVM classifier and the Novel HLS algorithm, with the Novel HLS approach notably outperforming the SVM classifier. In contrast to the SVM classifier's accuracy of 70.816%, the Novel HLS method achieved a significantly higher accuracy of 97.221%. The observed difference is statistically significant, as indicated by a significance value of 0.012 (P < 0.05) derived from an independent-samples t-test conducted using the IBM SPSS tool.

Other researchers have reported similar findings,

and the goal of this study is to highlight the latest

advancements in employing neural networks for the

detection and diagnosis of dental caries. The study

delved into research on diverse aspects of neural

networks, including network types, database

attributes, and outcomes. Moreover, the assessment

explored how each study defined and categorised

caries, considering various parameters such as caries

type and the teeth examined (Nanmaran et al. 2022,

Thakur et al. 2024). A precise definition of caries and

the types of lesions under investigation is crucial

before evaluating and comparing research outcomes.

Caries refers to a form of dental decay. Studies employing ICDAS II displayed accuracy ranging from 80 to 88.9% (mean ± SD of 85.45 ± 6.29%).

However, research that defined caries as the loss of

mineralization (radiolucent) achieved an accuracy of

97.1%. In this study, caries was defined as the loss of

mineralization. Nonetheless, 76% of the papers

assessed for this review omitted information about

caries lesion definitions. Another potential bias

source is the dataset used for training. The biomedical

images employed in training need specialist

annotations (Musri et al 2021). Seven of the analysed

studies acknowledged the involvement of examiners

in annotating images, though the level of expertise

and number of examiners varied between

investigations (Manzey et al 2006).

Studies have explored the correlation between

dental experience and caries identification. Bussaneli

et al. concluded that the examiner's expertise didn't

impact the detection of occlusal lesions in primary

teeth, but it did affect prioritization of treated lesions.

An artificial intelligence's performance is restricted by the quality of input based on human observer ratings. Among the articles within this review's scope that involved examiners, accuracy had a mean ± SD of 88.7 ± 8.55%, ranging from 80 to 97%. Results from

research involving four specialists examining images

yielded the second-best outcomes, followed by a

single examiner using standard criteria for caries

identification. Conversely, the least accurate findings

emerged from research with two different examiners

(Budd 2017). Only one study provided information

on researchers' years of expertise, but since these

findings weren't closely correlated with the total

number of examiners (Parziale 2016), it's vital to

consider other factors such as neural network usage,

dataset, and caries definition. Training on images can be time-intensive, potentially impacting accuracy in some scenarios. In this study, the limitation is mitigated by selecting only the database image features necessary for classification, which significantly reduces training time. Consequently, the use of larger datasets for research becomes viable.

6 CONCLUSION

Based on the results obtained, the Novel HLS algorithm demonstrated superior accuracy compared to the established SVM classifier in predicting dental caries. The accuracy achieved by the Novel HLS algorithm is notably higher, at 97.221%, whereas the SVM classifier achieved a comparatively lower accuracy of 70.816%.

REFERENCES

V. Machiulskiene, (2019) “Nyvad Criteria for Assessment of Caries Lesion Activity and Severity,” Detection and Assessment of Dental Caries. pp. 35–43, 2019. doi: 10.1007/978-3-030-16967-1_5.

B. Abuzenada, (2019) “Detection of proximal caries with

digital intraoral bitewing radiography: An interobserver

analysis,” Saudi Journal for Health Sciences, vol. 8, no.

1. p. 38, 2019. doi: 10.4103/sjhs.sjhs_141_18.

N. Irfan, K. Anwar, A. Taimoor, M. A. Al Absi, M. I. Arain,

and S. Shahnaz, (2020) “Dental Caries assessment of

Rural Population of Sindh by DMFT (Decayed missing

and filled teeth) Index,” The Professional Medical

Journal, vol. 27, no. 01. pp. 100–103, 2020. doi:

10.29309/tpmj/2019.27.01.3454.

Z. Hu, X. Yan, Y. Song, S. Ma, J. Ma, and G. Zhu, (2014)

“Trends of dental caries in permanent teeth among 12-

year-old Chinese children: evidence from five

consecutive national surveys between 1995 and 2014,”

BMC Oral Health, vol. 21, no. 1. 2021. doi:

10.1186/s12903-021-01814-7.

R. Meena Prakash and R. Shantha Selva Kumari, (2019) “Classification of MR Brain Images for Detection of Tumor with Transfer Learning from Pre-trained CNN Models,” 2019 International Conference on Wireless Communications Signal Processing and Networking (WiSPNET). 2019. doi: 10.1109/wispnet45539.2019.9032811.

L. T. Loan and F. Maviza, (2022) “W093 Specificity,

sensitivity, positive and negative predictive values of

nitrite and leukocyte esterase in predicting urinary tract

infections at FV hospital,” Clinica Chimica Acta, vol.

530. p. S316, 2022. doi: 10.1016/j.cca.2022.04.831.

M. Almasri, (2019) “Assessment of extracting molars and

premolars after root canal treatment: A retrospective

study,” The Saudi Dental Journal, vol. 31, no. 4. pp.

487–491, 2019. doi: 10.1016/j.sdentj.2019.04.011.

A. Bowling, (2002) Research Methods in Health: Investigating Health and Health Services. 2002.

Vickram, A. S., Samad, H. A., Latheef, S. K., Chakraborty,

S., Dhama, K., Sridharan, T. B., ... & Gulothungan, G.

(2020). Human prostasomes an extracellular vesicle–

Biomarkers for male infertility and prostrate cancer:

The journey from identification to current knowledge.

International journal of biological macromolecules,

146, 946-958.

S. G. Konrad, S. Gerling Konrad, F. R. Masson, and E.

Nebot, (2018) “Analysis of accuracy of pedestrian

inertial data obtained from camera’s images,” 2018

IEEE Biennial Congress of Argentina (ARGENCON).

2018. doi: 10.1109/argencon.2018.8646201.

A. Acharya, V. Powell, M. H. Torres-Urquidy, R. H. Posteraro, and T. P. Thyvalikakath, (2018) Integration of Medical and Dental Care and Patient Data. Springer, 2018.

Nanmaran, R., Srimathi, S., Yamuna, G., Thanigaivel, S.,

Vickram, A. S., Priya, A. K., ... & Muhibbullah, M.

(2022). Investigating the role of image fusion in brain

tumor classification models based on machine learning

algorithm for personalized medicine. Computational

and Mathematical Methods in Medicine, 2022.

Thakur, S., Dipra Paul, Oruganti, S. K., & Dimitrios A

Karras. (2024). Modelling of Whispering Gallery Mode

Resonators for Dielectric Measurement Applications.

SPAST Reports, 2(2).

https://spast.org/ojspath/article/view/4892

N. Musri, B. Christie, S. J. A. Ichwan, and A. Cahyanto,

(2021) “Deep learning convolutional neural network

algorithms for the early detection and diagnosis of

dental caries on periapical radiographs: A systematic

review,” Imaging Science in Dentistry, vol. 51, no. 3. p.

237, 2021. doi: 10.5624/isd.20210074.

D. Manzey, J. Elin Bahner, and A.-D. Hueper, (2006)

“Misuse of automated aids in process control:

Complacency, automation bias and possible training

interventions,” PsycEXTRA Dataset. 2006. doi:

10.1037/e577572012-003.

S. C. Budd and J.-C. Egea, Sport and Oral Health: A

Concise Guide. Springer, (2017).

L. Parziale, J. Bostian, R. Kumar, U. Seelbach, Z. Y. Ye,

and I. B. M. Redbooks, Apache Spark Implementation

on IBM z/OS. IBM Redbooks, (2016).
