BOVNet: Cervical Cells Classifications Using a Custom-Based Neural
Network with Autoencoders
Diogen Babuc (https://orcid.org/0009-0000-5126-6480) and Darian Onchiş (https://orcid.org/0000-0003-4846-3752)
Computer Science Department, West University of Timişoara,
Blvd. Vasile Pârvan 4, Timişoara, Romania
Keywords:
Neural Network, Machine Learning Techniques, Expert Rules, Cervical Cells, Autoencoders.
Abstract:
Cervical cancer is a major global health challenge, being the fourth-most common type of cancer. This emphasizes the need for accurate and efficient diagnostic tools that work well for small clinical datasets. This paper introduces an approach to computer-aided cervical scanning that integrates a custom-based neural network with autoencoders. The proposed architecture, the Baby-On-Vision neural network (BOVNet), is tailored to extract intricate features from cervical images, while the autoencoders mitigate noise and enhance image quality. State-of-the-art architectures and the BOVNet architecture are trained on three comprehensive data sets (496, 484, and 1050 samples) that include Pap smear scans and histopathological findings. We demonstrate the effectiveness of our approach in accurately predicting cervical cancer risk and stratifying patients into appropriate risk categories. A comparative analysis with existing screening methods indicates the superior performance of BOVNet in terms of sensitivity (between 90.9% and 98.81% across the three data sets), general predictive accuracy (between 92% and 94.86%), and time efficiency in identifying increased risk of cervical abnormalities.
1 INTRODUCTION
One of the biggest threats to world health is cervical
cancer, especially in areas with poor access to med-
ical treatment (Tsikouras et al., 2016). The morbid-
ity and mortality rates related to cervical cancer are
still alarmingly high, despite improvements in screen-
ing programs and diagnostic methods (Bedell et al.,
2020). This highlights the need for more reliable and
easily available diagnostic tools.
Computer-aided diagnostic (CAD) technologies
have become a viable addition to conventional screen-
ing techniques in recent years, with the potential to
increase the efficiency and accuracy of cervical ab-
normality detection (Tekchandani et al., 2022). Exist-
ing CAD systems have certain drawbacks. Many of
them rely on oversimplified algorithms or do not han-
dle issues such as image noise, fluctuation in tissue
appearance, and the subtlety of abnormalities in the
early stages (Athinarayanan et al., 2016). Anomalies
or outliers in the input data often result in poor re-
constructions compared to normal instances (Lehman
et al., 2015).
To overcome these limitations, we suggest creat-
ing the Baby-On-Vision neural network (BOVNet), a
unique CAD system designed especially for cervical
diagnosis. The goal of this system is to improve the
accuracy and reliability of cervical scans by combin-
ing autoencoders with cutting-edge machine-learning
approaches. BOVNet aims to provide a reliable and
adaptable tool for medical professionals.
The three major objectives of this study are:
1. A summary of the reasoning for the creation of the
suggested model, offering insights into the diffi-
culties in diagnosing cervical cancer and the ways
CAD systems may be able to help with these dif-
ficulties;
2. A thorough rundown of BOVNet’s features and
components, emphasizing its novel methodol-
ogy and potential benefits for enhancing cervical
health outcomes;
3. Proof of BOVNet’s efficacy and dependability as
a useful addition to the toolkit for diagnosing cer-
vical cancer, which will ultimately help with early
identification, individualized care, and better pa-
tient outcomes.
These targets act as a road map for developing,
implementing, and evaluating the recommended CAD
cervical analysis principle (Figure 1).
Figure 1: Objectives’ diagram for the cervical cells classifications (reasoning for the creation of an adjusted model; emphasizing the novel methodology; proof of BOVNet’s efficacy).
2 BACKGROUND INFORMATION
AND RELATED WORKS
2.1 Important Image Features
The degree of detail obtained in an image is referred to as its resolution, and it is commonly expressed in pixels per unit area, such as pixels per inch or per millimeter (Sabottke and Spieler, 2020). More
specific information is provided by higher-resolution
photographs, which is advantageous for identifying
minute anomalies. The way colors are portrayed in
a picture depends on its color space, which can affect
how the properties of the cervical tissue are analyzed
(Wang et al., 2020).
The contrast of an image is the variation in bright-
ness between its various components. Different tissue
types and anomalies are easier to discern from one
another on high-contrast images (Zhang et al., 2020).
The clarity of edges and details in a picture is referred
to as sharpness, sometimes called image sharpness.
Sharper images enable clearer visibility of the fea-
tures of cervical tissue, which is necessary for precise
analysis (Li et al., 2021). Image noise can obfuscate
crucial information and compromise the precision of
diagnostic algorithms. Improving image quality re-
quires evaluating and lowering noise levels using pre-
processing methods such as denoising filters.
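As a minimal illustration of such preprocessing, a median filter can suppress noise while preserving edges; the choice of filter and neighborhood size below are illustrative assumptions, not the paper's specific preprocessing step.

import numpy as np
from scipy.ndimage import median_filter

def denoise(image: np.ndarray, size: int = 3) -> np.ndarray:
    # Replace each pixel by the median of its size x size neighborhood,
    # suppressing salt-and-pepper style noise while preserving edges.
    return median_filter(image, size=size)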
The spatial arrangement of the pixel intensities
in an image is called texture, and it tells us some-
thing about the surface properties of the cervical tis-
sue (Chen et al., 2022). Finding unusual patterns or
anomalies might be aided by analyzing textural prop-
erties. The size of an object within a picture on a ref-
erence scale is referred to as its scale. Comprehending
the magnitude of the characteristics of the cervical tis-
sue is crucial to measuring irregularities and contrast-
ing images of various patients or imaging techniques
(Rahaman et al., 2021).
The uniformity of pixel intensities inside an im-
age is measured by homogeneity. Whereas regions of
poor homogeneity may indicate the presence of ab-
normalities or lesions, areas of high homogeneity may
indicate normal tissue. The geometric properties of
the objects in the image, such as size, symmetry, and
irregularity, are described by shape features (Attallah,
2023). The directionality or alignment of texture pat-
terns within an image is referred to as texture orienta-
tion. Evaluating texture orientation can help identify
abnormalities and reveal information about how cer-
vical tissue structures are organized.
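To make descriptors such as contrast and homogeneity concrete, texture statistics can be derived from a gray-level co-occurrence matrix (GLCM). The following scikit-image sketch is an illustration only; it is not part of the paper's pipeline, and the single distance/angle configuration is an assumption.

import numpy as np
from skimage.feature import graycomatrix, graycoprops

def texture_features(gray_image: np.ndarray) -> dict:
    # Quantize to 8-bit levels and build a GLCM for one offset and angle.
    if gray_image.max() <= 1.0:
        img = (gray_image * 255).astype(np.uint8)
    else:
        img = gray_image.astype(np.uint8)
    glcm = graycomatrix(img, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    return {
        "contrast": graycoprops(glcm, "contrast")[0, 0],
        "homogeneity": graycoprops(glcm, "homogeneity")[0, 0],
    }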
2.2 Autoencoders
Figure 2: Abnormal carcinoma in situ for cervical cell num-
ber 5749-001.
In cervical diagnosis, autoencoders can be quite
useful, especially when used in conjunction with
computer-aided diagnostic systems such as BOVNet.
From cervical images, autoencoders may effec-
tively extract meaningful features that capture perti-
nent information necessary for a precise diagnosis.
Autoencoders allow for the identification of subtle
patterns and anomalies that would not be visible with
typical image-analysis approaches (Khamparia et al.,
2021).
Due to their high pixel count and intricate spa-
tial information, cervical images are frequently high
dimensional. By mapping the high-dimensional in-
put space to a lower-dimensional latent space, autoen-
coders can achieve dimensionality reduction, keeping
the most important information while removing un-
necessary or noisy features (Adem et al., 2019).
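As a concrete illustration of this dimensionality reduction, a minimal convolutional autoencoder can be sketched in Keras as follows. The input size and layer widths are assumptions for illustration, not BOVNet's configuration.

import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(128, 128, 3))            # assumed input size
x = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
latent = layers.MaxPooling2D(2, name="latent")(x)  # lower-dimensional code

x = layers.Conv2DTranspose(16, 3, strides=2, activation="relu", padding="same")(latent)
x = layers.Conv2DTranspose(32, 3, strides=2, activation="relu", padding="same")(x)
out = layers.Conv2D(3, 3, activation="sigmoid", padding="same")(x)

autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")  # reconstruction loss
encoder = models.Model(inp, latent)                # reusable feature extractor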
The resolution, contrast, and look of cervical im-
ages obtained in various clinical settings might vary
significantly. The ability of autoencoders to acquire
a uniform representation of cervical images from a
variety of data sets allows diagnostic systems to gen-
eralize and adapt to a variety of heterogeneous data
sources (Nandy et al., 2020) (see Figure 2). This im-
proves the resilience and suitability of the system for
use in actual clinical settings.
The authors of (Adem et al., 2019) investigate two
primary categories of autoencoder-based methods for
cervical diagnosis utilizing Pap smear images: vari-
ational autoencoders (VAE) and denoising autoen-
coders. By reconstructing clear images from noisy
input, denoising autoencoders improve image qual-
ity and enable more precise feature extraction (Bodin
et al., 2017). Conversely, VAE generates images more
reliably and flexibly by learning a probabilistic latent
space representation of the input images.
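A denoising variant of the autoencoder sketched above can be obtained by corrupting the inputs and asking the network to reconstruct the clean images. In the sketch below, x_train is an assumed array of training images scaled to [0, 1], and the noise level is an illustrative choice.

import numpy as np

# Corrupt inputs with Gaussian noise; the target remains the clean image.
noise = np.random.normal(loc=0.0, scale=0.1, size=x_train.shape)
x_noisy = np.clip(x_train + noise, 0.0, 1.0)
autoencoder.fit(x_noisy, x_train, epochs=5, batch_size=32)  # noisy in, clean out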
The intricate and multidimensional nature of med-
ical imaging data can provide overfitting problems
(owing to irrelevant patterns) for autoencoders used
for cervical cell categorization (Xue et al., 2021).
Moreover, class imbalances in medical data and a lack
of labeled data sets might increase the likelihood of
overfitting and impair the model’s capacity for suc-
cessful generalization (Corlan et al., 2023). Careful
regularization methods, data augmentation plans, and
model validation methodologies adapted to the unique
properties of cervical cell images are needed to ad-
dress these problems (Adem et al., 2019).
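As one illustration of such measures, random geometric augmentations can be applied on the fly in Keras; the specific transforms and rates below are assumptions, not the paper's recipe.

import tensorflow as tf
from tensorflow.keras import layers

# Random geometric perturbations applied during training only.
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal_and_vertical"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])
# A dropout layer is one simple regularizer that can follow dense layers:
# x = layers.Dropout(0.3)(x)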
2.3 Related Works
In the paper (Hussain et al., 2020), the authors obtained a ResNet-50 accuracy of 91.78% on the testing data, a VGG-16 accuracy of 87.16%, and an AlexNet accuracy of 82% for the liquid-based cytology data set (AN, 2004). For the conventional data set (AN, 2004), they obtained 92% for ResNet-50, 87% for VGG-16, and 82% for AlexNet. For the complete Herlev benchmark (pap, 2024), they obtained an accuracy of 89.37% for ResNet-50, 83.37% for VGG-16, and 80% for AlexNet. The sensitivity, precision, and average accuracy of all models were between 79% and 97% on the Herlev data sets. However, they did not analyze the individual data sets and their outcomes separately, which could have enabled more performant prediction and subsequent classification.
In the article (Park et al., 2021), the authors reported an area under the ROC curve (ROC-AUC) of 97% for ResNet-50, a precision of around 93%, a sensitivity of around 89%, and an accuracy of 91% on a data set described in their paper but not publicly listed. They used 5-fold cross-validation to calculate all the evaluation metrics. The paper compared ResNet-50 with some shallow models, such as Extreme Gradient Boosting (Chen et al., 2015), Support Vector Machines (Hearst et al., 1998), and Random Forests (Breiman, 2001), but did not analyze other potentially performant deep learning models such as AlexNet or VGG-16.
The authors of (Kudva et al., 2020) obtained a hybrid model architecture combining AlexNet and VGG-16 with an accuracy of 91.46%, using the data set described in (Ribeiro et al., 2016). AlexNet alone achieved an accuracy of 84.31%, a sensitivity of 93.50%, and a specificity of 75%. VGG-16 alone achieved an accuracy of 84.15%, a sensitivity of 83.13%, and a specificity of 85.18%. Their hybrid model also outperformed both individual deep learning architectures in terms of sensitivity and specificity, with 89.16% sensitivity and 93.83% specificity.
3 OUR APPROACH
In this section, we will discuss the preprocessing stage
of selected data sets, the model’s construction, and
executed experiments.
3.1 Analysis of the Constructed Data
Sets
Carcinoma in situ and normal columnar cells refer to different types of cervical cell samples in the first data set (DS1), which has 496 observations (300 in situ and 196 normal columnar cells). We modified the cost function to penalize misclassifications of the minority class more heavily, as sketched below; the same procedure was applied to the other two data sets. This encouraged the model to focus more on learning the minority class. We combined these two cell types because they present a clear visual difference, so that our model can properly classify the observations from the first constructed data set.
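Keras class weights are one common way to realize this re-weighting; the paper does not state its exact cost-function modification, so the following is a hedged sketch using the DS1 counts (300 in situ, 196 normal columnar).

# Inverse-frequency class weights: the minority class contributes more to the loss.
n_in_situ, n_normal = 300, 196
total = n_in_situ + n_normal
class_weight = {
    0: total / (2 * n_in_situ),   # majority class (in situ) weighted down
    1: total / (2 * n_normal),    # minority class (normal columnar) weighted up
}
# model.fit(x_train, y_train, epochs=5, class_weight=class_weight)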
There are 484 observations in the second constructed data set (DS2), which contains normal/intermediate and superficial cells; 148 of them are superficial cancerous cell instances. Cells from this
data set can provide important information about the
condition of the cervix and are often evaluated during
cervical screenings. Using these types of cervical cell
data in the data set, we can train classification models
to differentiate between normal/intermediate and su-
perficial cell types based on their morphological and
pathological features. This enables the development
of automated systems for cervical cell classification,
aiding in the early detection and diagnosis of cervical
abnormalities and diseases.
In the third updated data set (DS3) with 1050 ob-
servations, abnormal dysplastic cells are categorized
into two subtypes: light/moderate (364 + 292 sam-
ples) and severe (394 samples). By categorizing dys-
plastic cells into these subtypes, the data set provides
a more nuanced understanding of the severity of cel-
lular abnormalities present in cervical samples. Re-
searchers can use this information to train classifica-
tion models to differentiate between different grades
of dysplasia.
3.2 Model Construction
BOVNet’s architecture, with its series of convolu-
tional and pooling layers, is well-suited for extracting
hierarchical features from images. This hierarchical
feature extraction capability is crucial for distinguish-
ing between different types of cervical cells, which
may exhibit subtle variations in appearance. The
ReLU activation function used throughout BOVNet
introduces non-linearity into the model, enabling it
to learn complex decision boundaries between dif-
ferent cell types. This is important for handling the
potentially non-linear relationships present in cervi-
cal cell images (Corlan et al., 2024). The inclusion
of a rule-based layer in BOVNet provides the flex-
ibility to incorporate domain-specific knowledge or
constraints into the classification process. This can
be particularly valuable in medical diagnosis tasks
where certain rules or guidelines are established by
experts (Babuc et al., 2024). The focal Tversky loss emphasizes the importance of correctly classifying difficult or misclassified examples by introducing a focal parameter that controls the weighting of hard examples (Abraham and Khan, 2019).
The BOVNet process begins with the input shape specification and continues through several convolutional layers that improve the network’s capacity to extract hierarchical information from input images. Subsequently, a convolutional autoencoder component is
shown, which consists of an encoder for input data
compression into a latent space and a decoder for data
reconstruction. This addition highlights the model’s
complex feature learning. Through this process, au-
toencoders play a crucial role in dimensionality re-
duction, facilitating the extraction of meaningful rep-
resentations from the data. This procedure will cap-
ture the essential features of the images in a lower-
dimensional space. Modifications to the latent space
involve altering its distribution to enhance the gener-
ative capabilities of the model through regularization
methods. The network’s interpretability and general-
ization skills are further enhanced by the integration
of a layer normalization process.
The first layer of this architecture is convolutional; it applies 64 filters to the input image and uses the ReLU activation function. Let I be the input image, W the filter weights, b the bias, and σ the ReLU activation function. The output feature map O is computed as

O = σ(W · I + b), (1)

where · denotes the convolution operation.
This layer extracts 64 different features from the input
image using convolution. ReLU is chosen as the acti-
vation function to introduce non-linearity and sparsity
to the network. The next layer performs max pooling
with a pool size of 2 × 2 and a stride of 2 × 2. Max
pooling operation selects the maximum value within
each 2 × 2 region of the input feature map. Max
pooling reduces the spatial dimensions of the feature
maps, leading to translation invariance and compu-
tational efficiency. Similar to the first convolutional
layer, this layer applies 128 filters with ReLU activa-
tion. Increasing the number of filters allows the net-
work to learn more complex features from the input.
The same principle is applied for following convolu-
tional and max pooling layers. The Flatten layer flat-
tens the 3D feature maps into a 1D vector, preparing
them for input into the fully connected layers. This is
a necessary step in transitioning from convolutional
layers to fully connected layers.
A fully connected layer with 128 neurons and ReLU activation follows the Flatten layer (see Figure 3). This layer introduces non-linearity and learns high-level representations of the features extracted by the convolutional layers. After the output of the dense layer is obtained, it is fed directly into the rule-based layer. The rule-based layer evaluates the input data based on predefined rules and makes a decision: malignant or benign (the RB Layer M/B in Figure 3). This decision is combined with the dense layer output through a weighted combination to produce the final classification outcome. The focal Tversky loss can lead to better performance, especially in tasks where class imbalance and minimizing false negatives are important considerations (Abraham and Khan, 2019).
Before all convolutional layers of BOVNet, we in-
troduced an encoder module consisting of convolu-
tional layers followed by max-pooling layers. The
encoder compresses the input cervical cell images
into a lower-dimensional latent space representation.
The output of the encoder serves as the input to both
BOVNet and the decoder module. After the fully con-
nected layers of BOVNet, we added a decoder mod-
ule comprising convolutional transpose layers. The
decoder aims to reconstruct the original input images
from the latent space representation learned by the en-
coder. Reconstruction loss guides the learning pro-
cess, encouraging the autoencoder to capture mean-
ingful features in the latent space. We prefer autoen-
coders (AE) instead of VAE because the primary goal
is to learn a compact and dense representation of the
input data without explicitly modeling its probabil-
ity distribution. AE’s architecture consists of an en-
coder network that compresses the input data into a
latent representation and a decoder network that re-
constructs the original input from this representation.
This simplicity in architecture and training procedure
makes AE suitable for tasks such as dimensional-
ity reduction, feature learning, and data denoising.
Figure 3: Components of the adjusted neural network model, BOVNet: Conv2D (64, ReLU) → MaxPooling2D (2, 2) → Conv2D (128, ReLU) → MaxPooling2D (2, 2) → Conv2D (192, ReLU) → MaxPooling2D (2, 2) → Conv2D (32, ReLU) → latent space; decoder: Conv2D Transpose (64, ReLU) → Conv2D Transpose (3, Sigmoid); classifier: Flatten → Dense (128, ReLU) → RB Layer M/B → Layer Normalization; focal loss (0.7, 0.3, 1).
For the implementation, we used five epochs. Using five epochs for autoencoders is a practical choice that balances capturing essential features with mitigating overfitting, particularly for smaller data sets or simpler deep models. Our main principle functions like a teachable machine that categorizes cervical cells and distinguishes the classes during the training part; the process continues with testing the model on the introduced images. The results showed a well-constructed model that surpasses other deep learning architectures.
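To make the construction concrete, the following is a minimal Keras sketch assembling the components named in Figure 3. It is an illustration under stated assumptions, not the exact published implementation: the input size, paddings, and strides are our assumptions, the rule-based (RB M/B) layer is expert-specific and only indicated by a comment, and focal_tversky_loss is the sketch given earlier in this section.

import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(128, 128, 3))                       # assumed input size
x = layers.Conv2D(64, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(128, 3, activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(192, 3, activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2))(x)
latent = layers.Conv2D(32, 3, activation="relu", padding="same",
                       name="latent")(x)                      # 16x16x32 latent space

# Decoder branch: reconstructs the input from the latent space.
d = layers.Conv2DTranspose(64, 3, strides=2, activation="relu", padding="same")(latent)
d = layers.Conv2DTranspose(3, 3, strides=4, activation="sigmoid", padding="same",
                           name="reconstruction")(d)          # back to 128x128x3

# Classification branch: Flatten -> Dense(128, ReLU) -> LayerNorm -> decision.
f = layers.Flatten()(latent)
f = layers.Dense(128, activation="relu")(f)
f = layers.LayerNormalization()(f)
# The rule-based M/B decision would be fused here via a weighted combination;
# it is expert-specific, so it is omitted from this sketch.
cls = layers.Dense(1, activation="sigmoid", name="malignant")(f)

model = models.Model(inp, [cls, d])
model.compile(optimizer="adam",
              loss={"malignant": focal_tversky_loss(0.7, 0.3, 1.0),
                    "reconstruction": "mse"})

# Training run matching the five-epoch setting described above
# (x_train and y_train are assumed preprocessed arrays; batch size is an assumption).
# model.fit(x_train, {"malignant": y_train, "reconstruction": x_train},
#           epochs=5, batch_size=32, validation_split=0.1)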
However, this architecture has some limitations.
Interpretability of the taught representations may suf-
fer while the model performs better in terms of clas-
sification accuracy. It can be challenging to compre-
hend the precise characteristics that the autoencoder
learned and how they apply to the classification. Cer-
vical cell classification may benefit from consider-
ing global contextual information within the image.
BOVNet’s architecture may not effectively capture
long-range dependencies in the images, potentially
limiting its performance.
4 RESULTS AND DISCUSSION
The data presented display the performance metrics of several deep learning models (BOVNet, ResNet, VGG-16, and AlexNet) on three separate data sets with various cervical cell types. As performance evaluation metrics, we selected accuracy, sensitivity, precision, and ROC-AUC (Pal et al., 2021). We calculated the average for each performance evaluation metric after running the application 10 times.
The percentage of correctly determined instances
among all instances is known as accuracy. It is a
key performance indicator for assessing a classifi-
cation model’s overall effectiveness. However, accuracy alone cannot give a clear view of the model’s performance on unbalanced data sets, where one class predominates over the others (normal samples outnumbering abnormal samples, for example) (William et al., 2018). Sensitivity quantifies the
percentage of real positive cases (such as dysplastic
or malignant samples) that the model accurately de-
tects. In order to minimize false negatives, discover
cervical abnormalities early, and ensure that aber-
rant cases are not missed, high sensitivity is essen-
tial for cervical diagnostics (Sellamuthu Palanisamy
et al., 2022). The percentage of real negative cases
Table 1: Performance evaluation metrics for three state-of-
the-art model and the proposed model, BOVNet, obtained
for the DS1 data set.
% BOVNet ResNet AlexNet VGG
Sens. 98.81 96.39 96.05 96.3
Prec. 91.21 88.89 82.95 86.67
Acc. 94.86 92.49 89.02 91.23
AUC 93.18 94.44 94.3 93.33
(such as normal samples) that the model correctly
detects is known as specificity. When abnormalities
are absent, a low percentage of false positives is in-
dicated by high specificity, which is crucial for pre-
venting unnecessary treatments or interventions (Sell-
amuthu Palanisamy et al., 2022). The precision metric
quantifies the percentage of accurately identified pos-
itive cases among all cases that the model predicts to
be positive. It illustrates the model’s capacity to pre-
vent false positives and is especially crucial in situa-
tions when incorrectly classifying positive cases may
result in serious repercussions, including suggesting
needless follow-up procedures or treatments (Sompa-
wong et al., 2019).
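As a concrete illustration, the four reported metrics can be computed with scikit-learn as sketched below; y_true and y_prob are assumed arrays of ground-truth labels and predicted malignancy probabilities, and the 0.5 decision threshold is our assumption.

import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)

def evaluate(y_true: np.ndarray, y_prob: np.ndarray, threshold: float = 0.5) -> dict:
    # Threshold probabilities into hard labels, then score the predictions.
    y_pred = (y_prob >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "sensitivity": recall_score(y_true, y_pred),  # recall equals sensitivity
        "precision": precision_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_prob),
    }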
For the first data set, DS1, when compared to other
models, BOVNet has the highest ROC-AUC score,
accuracy, sensitivity, and precision (Table 1). This
suggests that BOVNet distinguishes between cancer
Figure 4: K-fold cross-validation accuracy of BOVNet, ResNet-50, AlexNet, and VGG-16 across 15 data portions of the DS1 data set (accuracy axis from 0.8 to 0.95).
Figure 5: Execution times for models’ construction, training part, classification, and performance evaluation metrics for the DS1, DS2, and DS3 data sets. The values read from the chart (in seconds) are:

Model      DS1     DS2     DS3
BOVNet     82.44   56.7    223.4
ResNet-50  86.18   92.41   332.4
AlexNet    114.3   75.2    236.8
VGG-16     92.41   53.45   252.3
in situ cells and normal columnar cells with high ac-
curacy. Given its great sensitivity and precision, it
appears to be able to minimize false positives while
efficiently identifying true positive cases. ResNet,
AlexNet, and VGG-16 nevertheless manage to ob-
tain respectable ROC-AUC scores and accuracy. All
things considered, BOVNet seems to be the model
that performs the best on this data set, suggesting that
it is capable of reliably identifying different types of
cervical cells.
For the second data set, DS2, BOVNet contin-
ues to outperform other models in terms of accuracy
(92%), sensitivity (90.9%), and precision (93.33%).
However, its ROC-AUC score is lower compared to
the previous data set (87.85%). VGG-16 achieves
similar accuracy to BOVNet but with slightly lower
sensitivity and precision. ResNet and AlexNet show
decreased performance compared to the previous
data set, indicating potential challenges in classify-
ing intermediate and superficial cell types accurately.
BOVNet still maintains its superiority in classifying
cervical cells on this data set, although the drop in
ROC-AUC suggests that it may struggle with dis-
tinguishing between intermediate and superficial cell
types.
BOVNet maintains its high accuracy (94.44%),
sensitivity (91.38%), and precision (98.15%), also for
the third data set, DS3, indicating its effectiveness in
distinguishing between different dysplastic cell types.
In this data set, BOVNet demonstrates its robustness
in classifying dysplastic cell types accurately, partic-
ularly with high precision, suggesting its potential
clinical utility in identifying severe dysplastic cells,
which are critical for early intervention and treatment.
In the context of a 15-fold cross-validation system
applied to DS1, BOVNet demonstrated exceptional
performance across all but the first and tenth folds,
where ResNet-50 showed superior performance. No-
tably, BOVNet consistently achieved accuracy rates
ranging from 90% to 97% (see Figure 4).
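The cross-validation protocol can be sketched with scikit-learn's KFold as follows; x_data and y_data are assumed preprocessed arrays, and build_model is a hypothetical factory returning a freshly initialized model compiled with an accuracy metric.

import numpy as np
from sklearn.model_selection import KFold

accuracies = []
kf = KFold(n_splits=15, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(x_data):
    model = build_model()  # hypothetical: fresh model per fold
    model.fit(x_data[train_idx], y_data[train_idx], epochs=5, verbose=0)
    _, acc = model.evaluate(x_data[test_idx], y_data[test_idx], verbose=0)
    accuracies.append(acc)
print(f"mean accuracy over 15 folds: {np.mean(accuracies):.3f}")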
Execution times were measured on a local ma-
chine equipped with an Intel Core i7, 12th generation
processor. All the models used the same setup. Com-
pared to alternative models, BOVNet often shows
shorter execution durations across all data sets (see
Figure 5). This suggests that, despite its complexity,
the architecture of BOVNet uses computing resources
rather effectively. ResNet and AlexNet consistently
exhibit higher execution times compared to BOVNet,
especially on the data sets containing a larger number
of classes. This suggests that their deeper architec-
tures and higher parameter counts result in longer in-
ference times. VGG-16 shows varied execution times
across different data sets.
The theoretical explanation for BOVNet’s effectiveness lies in its ability to learn hierarchical features through its encoder, decoder, convolutional layers, and layer normalization, and to fuse these features effectively for accurate classification, as evidenced by its high sensitivity, precision, and ROC-AUC scores. However, ResNet-50 achieves the best ROC-AUC result for the first data set, with 94.44%. ROC-AUC
offers a thorough assessment of the model’s perfor-
mance over all potential thresholds and is especially
helpful for evaluating the trade-off between sensitiv-
ity and specificity (Kanavati et al., 2022).
5 CONCLUSIONS
This research has provided significant insights into
the field of cervical cell classification. By leveraging a
novel approach that combines domain-specific knowl-
edge with advanced machine learning techniques, we
have demonstrated the potential for more accurate and
efficient classification of cervical cells.
From the model’s creation, procedures, and re-
sults, BOVNet emerges as a robust and efficient deep
learning architecture for the classification of cervi-
cal cell types. Across multiple data sets and eval-
uation metrics, BOVNet consistently outperforms or
matches the performance of other well-known archi-
tectures like ResNet-50, VGG-16, and AlexNet. This
highlights its suitability for medical image analysis
tasks, particularly in the context of cervical cell clas-
sification. One notable strength of BOVNet is its
efficiency, as evidenced by its relatively low execu-
tion times compared to other models. This efficiency
makes BOVNet an attractive choice for real-world ap-
plications where computational resources are limited
or real-time inference is crucial.
However, the analysis also underscores the im-
portance of considering data set-specific character-
istics. Although BOVNet generally performs well
across various data sets, there are instances where
other models, such as ResNet-50, exhibit superior
performance.
ACKNOWLEDGEMENT
The authors thank the West University of Timişoara for the resources provided and Teodor-Florin Fortiş for suggestions.
REFERENCES
(2024). Pap smear data sets. Last accessed 8 February 2024.
Abraham, N. and Khan, N. M. (2019). A novel focal tver-
sky loss function with improved attention u-net for le-
sion segmentation. In 2019 IEEE 16th international
symposium on biomedical imaging (ISBI 2019), pages
683–687. IEEE.
Adem, K., Kılıçarslan, S., and Cömert, O. (2019). Classification and diagnosis of cervical cancer with stacked autoencoder and softmax classification. Expert Systems with Applications, 115:557–564.
AN, C. (2004). Liquid-based cytology and conventional
cervical smears: A comparison study in an asian
screening population. Cancer, 102:200–201.
Athinarayanan, S., Srinath, M., and Kavitha, R. (2016).
Computer aided diagnosis for detection and stage
identification of cervical cancer by using pap smear
screening test images. ICTACT Journal on Image &
Video Processing, 6(4).
Attallah, O. (2023). Cervical cancer diagnosis based on
multi-domain features using deep learning. Applied
Sciences, 13(3):1916.
Babuc, D., Ivascu, T., Ardelean, M., and Onchis, D.
(2024). Bionnica: A deep neural network architecture
for colorectal polyps’ premalignancy risk evaluation.
medRxiv, pages 2024–06.
Bedell, S. L., Goldstein, L. S., Goldstein, A. R., and Gold-
stein, A. T. (2020). Cervical cancer screening: past,
present, and future. Sexual medicine reviews, 8(1):28–
37.
Bodin, E., Malik, I., Ek, C. H., and Campbell, N. D.
(2017). Nonparametric inference for auto-encoding
variational bayes. arXiv preprint arXiv:1712.06536.
Breiman, L. (2001). Random forests. Machine learning,
45:5–32.
Chen, K., Wang, Q., and Ma, Y. (2022). Cervical optical
coherence tomography image classification based on
contrastive self-supervised texture learning. Medical
Physics, 49(6):3638–3653.
Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y.,
Cho, H., Chen, K., Mitchell, R., Cano, I., Zhou, T.,
et al. (2015). Xgboost: extreme gradient boosting. R
package version 0.4-2, 1(4):1–4.
Corlan, A.-S., Diogen, B., Flavia, C., and Darian, O. (2023).
Prediction and classification models for hashimoto. In
Endocrine Abstracts, volume 90. Bioscientifica.
Corlan, A.-S., Diogen, B., Flavia, C., Vlad, M., Balas, M.,
Golu, I., Amzar, D.-G., and Darian, O. (2024). An ar-
tificial intelligence system for estimating the improve-
ment of clinical and paraclinical parameters after ther-
apy in pituitary tumors. In Endocrine Abstracts, vol-
ume 99. Bioscientifica.
Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., and
Scholkopf, B. (1998). Support vector machines. IEEE
Intelligent Systems and their applications, 13(4):18–
28.
Hussain, E., Mahanta, L. B., Das, C. R., and Talukdar, R. K.
(2020). A comprehensive study on the multi-class
cervical cancer diagnostic prediction on pap smear
images using a fusion-based decision from ensemble
deep convolutional neural network. Tissue and Cell,
65:101347.
Kanavati, F., Hirose, N., Ishii, T., Fukuda, A., Ichihara,
S., and Tsuneki, M. (2022). A deep learning model
for cervical cancer screening on liquid-based cytol-
ogy specimens in whole slide images. Cancers,
14(5):1159.
Khamparia, A., Gupta, D., Rodrigues, J. J., and de Al-
buquerque, V. H. C. (2021). Dcavn: Cervical can-
cer prediction and classification using deep convolu-
tional and variational autoencoder network. Multime-
dia Tools and Applications, 80:30399–30415.
Kudva, V., Prasad, K., and Guruvare, S. (2020). Hybrid
transfer learning for classification of uterine cervix
images for cervical cancer screening. Journal of digi-
tal imaging, 33:619–631.
Lehman, C. D., Wellman, R. D., Buist, D. S., Kerlikowske,
K., Tosteson, A. N., Miglioretti, D. L., Consor-
tium, B. C. S., et al. (2015). Diagnostic accuracy
of digital screening mammography with and without
computer-aided detection. JAMA internal medicine,
175(11):1828–1837.
Li, P., Liang, J., and Zhang, M. (2021). A degradation
model for simultaneous brightness and sharpness en-
hancement of low-light image. Signal Processing,
189:108298.
Nandy, A., Sathish, R., and Sheet, D. (2020). Identifica-
tion of cervical pathology using adversarial neural net-
works. arXiv preprint arXiv:2004.13406.
Pal, A., Xue, Z., Befano, B., Rodriguez, A. C., Long, L. R.,
Schiffman, M., and Antani, S. (2021). Deep metric
learning for cervical image classification. IEEE Ac-
cess, 9:53266–53275.
Park, Y. R., Kim, Y. J., Ju, W., Nam, K., Kim, S., and
Kim, K. G. (2021). Comparison of machine and
deep learning for the classification of cervical cancer
based on cervicography images. Scientific Reports,
11(1):16143.
Rahaman, M. M., Li, C., Yao, Y., Kulwa, F., Wu, X., Li, X.,
and Wang, Q. (2021). Deepcervix: A deep learning-
based framework for the classification of cervical cells
using hybrid deep feature fusion techniques. Comput-
ers in Biology and Medicine, 136:104649.
Ribeiro, E., Uhl, A., Wimmer, G., Häfner, M., et al. (2016). Exploring deep learning and transfer learning for colonic polyp classification. Computational and Mathematical Methods in Medicine, 2016.
Sabottke, C. F. and Spieler, B. M. (2020). The effect of
image resolution on deep learning in radiography. Ra-
diology: Artificial Intelligence, 2(1):e190015.
Sellamuthu Palanisamy, V., Athiappan, R. K., and Na-
galingam, T. (2022). Pap smear based cervical cancer
detection using residual neural networks deep learning
architecture. Concurrency and Computation: Practice
and Experience, 34(4):e6608.
Sompawong, N., Mopan, J., Pooprasert, P., Himakhun, W.,
Suwannarurk, K., Ngamvirojcharoen, J., Vachiramon,
T., and Tantibundhit, C. (2019). Automated pap smear
cervical cancer screening using deep learning. In 2019
41st Annual International Conference of the IEEE En-
gineering in Medicine and Biology Society (EMBC),
pages 7044–7048. IEEE.
Tekchandani, H., Verma, S., Londhe, N. D., Jain, R. R.,
and Tiwari, A. (2022). Computer aided diagnosis sys-
tem for cervical lymph nodes in ct images using deep
learning. Biomedical Signal Processing and Control,
71:103158.
Tsikouras, P., Zervoudis, S., Manav, B., Tomara, E., Ia-
trakis, G., Romanidis, C., Bothou, A., and Galazios,
G. (2016). Cervical cancer: screening, diagnosis and
staging. J BUON, 21(2):320–325.
Wang, Z., Chen, J., and Hoi, S. C. (2020). Deep learning
for image super-resolution: A survey. IEEE trans-
actions on pattern analysis and machine intelligence,
43(10):3365–3387.
William, W., Ware, A., Basaza-Ejiri, A. H., and Obun-
goloch, J. (2018). A review of image analysis and
machine learning techniques for automated cervical
cancer screening from pap-smear images. Computer
methods and programs in biomedicine, 164:15–22.
Xue, Z., Guo, P., Desai, K. T., Pal, A., Ajenifuja, K. O.,
Adepiti, C. A., Long, L. R., Schiffman, M., and
Antani, S. (2021). A deep clustering method for
analyzing uterine cervix images across imaging de-
vices. In 2021 IEEE 34th International Symposium
on Computer-Based Medical Systems (CBMS), pages
527–532. IEEE.
Zhang, L., Wang, X., Yang, D., Sanford, T., Harmon, S.,
Turkbey, B., Wood, B. J., Roth, H., Myronenko, A.,
Xu, D., et al. (2020). Generalizing deep learning for
medical image segmentation to unseen domains via
deep stacked transformation. IEEE transactions on
medical imaging, 39(7):2531–2540.