Bioimages Synthesis and Detection Through Generative Adversarial Network: A Multi-Case Study
Valeria Sorgente 1, Ilenia Verrillo 1, Mario Cesarelli 2, Antonella Santone 1, Fabio Martinelli 3 and Francesco Mercaldo 1
1 Department of Medicine and Health Sciences “Vincenzo Tiberio”, University of Molise, Campobasso, Italy
2 Department of Engineering, University of Sannio, Benevento, Italy
3 Institute for High Performance Computing and Networking, National Research Council of Italy (CNR), Rende (CS), Italy
{valeria.sorgente, antonella.santone, francesco.mercaldo}@unimol.it, i.verrillo@studenti.unimol.it
Keywords: GAN, Generative Adversarial Networks, Bioimage, Deep Learning, Classification.
Abstract: The rapid advancement of Generative Adversarial Network technology raises ethical and security concerns, emphasizing the need for guidelines and measures to prevent misuse. Strengthening systems that differentiate real from synthetic images, while ensuring responsible application in clinical settings, could also help address data scarcity in the biomedical field. For these reasons, and considering the increasing popularity of generating synthetic images by exploiting artificial intelligence, we investigate the application of Generative Adversarial Networks to generate realistic synthetic bioimages for common pathology representations. We propose a method consisting of two steps: the first is the training of a Deep Convolutional Generative Adversarial Network, while the second is the evaluation of bioimage quality using classification-based metrics that compare synthetic and real images. The model demonstrated promising results, achieving visually realistic images for datasets such as PathMNIST and RetinaMNIST, with image quality improving over the training epochs and classification accuracy decreasing accordingly. However, challenges arose with datasets such as ChestMNIST and OCTMNIST, where image quality was limited, resulting in poor detail and easy distinguishability from real samples.
1 INTRODUCTION AND RELATED WORK
In recent years, deep learning (Mercaldo et al., 2022;
Zhou et al., 2023) has become one of the most popu-
lar techniques in the field of medical image analysis
(He et al., 2024; Huang et al., 2024), especially for
generative learning.
Among the different models, the Generative Adversarial Network (GAN) is prominent for synthesizing realistic medical images. This model stands out for its capabilities in tasks such as generating images from textual descriptions, upscaling visual quality, and converting between different image styles. Because of this versatility, GANs have found application in several areas of medical imaging, including digital pathology, radiology and clinical neuroscience.
As a matter of fact, GANs can be used in the biomedical field (Mercaldo et al., 2023) for applications such as medical image generation or clinical data simulation. One of the major challenges in clinical practice is the limited availability of high-quality labeled biomedical images needed to train deep learning models for diagnostic applications. GANs offer a solution to this problem by generating realistic synthetic medical images, thereby expanding the available datasets and improving the robustness of models. However, these technologies carry risks of fraudulent use. In particular, GANs can be used to falsify diagnostic images, such as X-rays or MRI scans, in order to manipulate diagnoses or research outcomes. Furthermore, the generation of altered synthetic data could compromise the validity of clinical or epidemiological studies. These possible scenarios raise ethical issues and require control strategies to ensure the integrity and veracity of biomedical data. The use of GANs is also complicated by some issues that limit their potential, including difficulties in training and phenomena such as mode collapse, which reduce the model's ability to generate diverse and accurate data.
Numerous studies have explored the application
of GANs in biomedical fields for various purposes
(Huang et al., 2022; Zhou et al., 2021; Huang et al.,
2023; Huang et al., 2021). For example, Orlando et
al. (Orlando et al., 2018) developed a method for
generating retinal fundus images with simulated le-
sions, aiming to enhance diagnostic models, while Fu
et al. (Fu et al., 2018) used GANs to augment reti-
nal fundus image data. While GANs have been ap-
plied in biomedical domains for applications like reti-
nal vessel segmentation and liver lesion classification,
this paper is focused on creating synthetic images that
closely resemble authentic ones and can evade detec-
tion by trained classifiers. We expect that, as training epochs increase, the quality of the synthetic images improves, rendering them progressively more realistic and increasingly challenging for classifiers to differentiate from real samples. This trend
should underscore the potential implications of GANs
in applications where realistic image synthesis is crit-
ical, as well as the challenges they may pose for cur-
rent diagnostic and classification systems.
As a matter of fact, this study aims to evaluate the capability of GANs in producing two-dimensional medical images from six different datasets representing common pathologies. In a nutshell, we introduce an approach designed to assess the potential impact of GAN-generated bioimages on classification tasks. Specifically, we employ a Deep Convolutional GAN (DCGAN) to generate synthetic images based on existing datasets of real bioimages. For this purpose, we exploit a set of machine learning algorithms that, through image filters, evaluate the quality of the images generated by the GAN, providing performance evaluation metrics, i.e., Precision, Recall, Accuracy and F-Measure.
The paper proceeds as follows: in the next section we present the proposed method for bioimage synthesis and detection; Section 3 presents the results obtained from the experimental analysis; finally, in the last section, conclusions and future research lines are drawn.
2 THE METHOD
In this section we present the proposed method, aimed at understanding whether it is possible, by exploiting machine learning, to discriminate between real-world bioimages and GAN-generated ones.
The proposed method is composed of two main steps: the first is related to bioimage generation by means of a GAN (shown in Figure 1), while the second is related to the discrimination between real and fake bioimages (shown in Figure 2).
GAN is a machine learning model used to gener-
ate realistic data from random inputs. It consists of
two main components, a generator and a discrimina-
tor, that compete with each other, as shown in Figure
1. The idea is to train the generator to create data that
are indistinguishable from real data for the discrimi-
nator.
The generator is considered fully trained when the discriminator assigns a value of 0.5 to all images, denoting its inability to distinguish the inputs.
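For completeness, this competition can be summarized by the standard adversarial minimax objective:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

where x is a real sample, z is the random input vector, G(z) is a generated sample, and D(.) is the probability that the discriminator assigns to a sample being real; the equilibrium in which D outputs 0.5 everywhere corresponds to the condition described above.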
There are several variants of GAN; in this paper we experiment with the DCGAN.
The developed DCGAN takes as input 28x28 pixel images, converted to grayscale and normalized in the interval [-1,1] to align with the generator's tanh activation function. The generator receives a random vector of size 100, which is transformed through a series of layers: Dense, used to reorganize the noise vector; BatchNormalization, to stabilize the training process; and Conv2DTranspose, to enlarge the image to a format of 28x28x1.
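The paper describes the architecture only at this level of detail; as an illustration, a minimal TensorFlow/Keras sketch of a generator consistent with this description could look as follows, where the intermediate feature-map sizes, strides and filter counts are our assumptions:

# Minimal DCGAN generator sketch (TensorFlow/Keras).
# Only the 100-dim noise input, the layer types and the 28x28x1 tanh
# output are taken from the text; the remaining sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(latent_dim: int = 100) -> tf.keras.Model:
    return tf.keras.Sequential([
        # Dense layer reorganizes the noise vector into a small feature map.
        layers.Dense(7 * 7 * 128, use_bias=False, input_shape=(latent_dim,)),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((7, 7, 128)),
        # Transposed convolutions enlarge the image: 7x7 -> 14x14 -> 28x28.
        layers.Conv2DTranspose(64, 5, strides=2, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # tanh output in [-1, 1], matching the normalization of the real images.
        layers.Conv2DTranspose(1, 5, strides=2, padding="same",
                               use_bias=False, activation="tanh"),
    ])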
The discriminator architecture includes, instead, two sequences of Conv2D, BatchNormalization and LeakyReLU layers to perform the downsampling, and a sigmoid activation function in the last layer to obtain a value suitable for binary classification.
The training of the DCGAN is based on the competition between generator and discriminator. This dynamic is managed using a loss function that differs between the two networks. The loss of the discriminator is computed as the arithmetic mean between the loss on the real images, labeled with class 1, and the loss on the generated images, labeled with class 0. The loss of the generator, instead, is the loss computed on the generated images alone: its purpose is to maximize the discriminator's predictions on them, so that they are classified with label 1.
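Again as an illustration rather than the exact implementation, the discriminator and the two losses just described can be sketched as follows (filter counts are assumptions):

# Minimal discriminator and loss sketch matching the description above.
import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator() -> tf.keras.Model:
    return tf.keras.Sequential([
        # Two Conv2D + BatchNormalization + LeakyReLU downsampling blocks.
        layers.Conv2D(64, 5, strides=2, padding="same", input_shape=(28, 28, 1)),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2D(128, 5, strides=2, padding="same"),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Flatten(),
        # Sigmoid output: estimated probability that the input image is real.
        layers.Dense(1, activation="sigmoid"),
    ])

bce = tf.keras.losses.BinaryCrossentropy()

def discriminator_loss(real_output, fake_output):
    # Arithmetic mean of the loss on real images (label 1)
    # and on generated images (label 0).
    real_loss = bce(tf.ones_like(real_output), real_output)
    fake_loss = bce(tf.zeros_like(fake_output), fake_output)
    return 0.5 * (real_loss + fake_loss)

def generator_loss(fake_output):
    # The generator is rewarded when its images are classified as real (label 1).
    return bce(tf.ones_like(fake_output), fake_output)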
The images generated during training were saved for the next step, shown in Figure 2, where they were compared with the real ones.
The aim of the second step is to extract numerical features from the (real and fake) bioimages and to build a set of machine learning models, with the aim of understanding whether it is possible to discriminate GAN-generated bioimages from real ones.
To obtain numerical features from the real and generated bioimages, the images were subjected to a preprocessing step in which image filters were applied for feature extraction, as shown in Figure 2. This step is essential to optimize classification accuracy, since it exposes salient features regarding the structure and the chromatic variations of the images.
Figure 1: The first step of the proposed method, related to synthetic bioimage generation.
Figure 2: The second step of the proposed method, related to synthetic bioimage detection.
In particular, the filters used are the following (a simplified sketch of one of them is given after the list):
AutoColorCorrelogramFilter: measures the spatial correlation between the colors that compose the image, allowing the distribution of the colors to be characterized.
BinaryPatternsPyramidFilter: extracts intensity
patterns around the points of the image, identify-
ing texture variations.
ColorLayoutFilter: divides the image into a grid
of 64 blocks and calculates the average color for
each of them.
FCTHFilter (Fuzzy Color and Texture Histogram
Filter): combines information on the color and on
the texture of the images in a single histogram, in
order to represent the main visual characteristics
of the image.
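These filters are applied through Weka (introduced in Section 3.2). As an illustration of the kind of descriptor they produce, the sketch below computes a ColorLayout-style feature, i.e., the average color over an 8x8 grid of 64 blocks; it is a simplified analogue written by us, not the exact Weka implementation:

# Simplified ColorLayout-style descriptor: average color per block of an
# 8x8 grid (64 blocks), analogous in spirit to ColorLayoutFilter.
import numpy as np

def color_layout_features(image: np.ndarray, grid: int = 8) -> np.ndarray:
    # image: HxWxC array; returns a flat vector of grid*grid*C block means.
    h, w, c = image.shape
    rows = np.array_split(np.arange(h), grid)
    cols = np.array_split(np.arange(w), grid)
    feats = np.empty((grid, grid, c), dtype=np.float64)
    for i, r in enumerate(rows):
        for j, cl in enumerate(cols):
            block = image[r[0]:r[-1] + 1, cl[0]:cl[-1] + 1]
            feats[i, j] = block.mean(axis=(0, 1))  # average color of the block
    return feats.ravel()

# Example: a 28x28 RGB bioimage yields a 64 * 3 = 192-dimensional vector.
rng = np.random.default_rng(0)
assert color_layout_features(rng.random((28, 28, 3))).shape == (192,)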
Once a set of numerical features has been obtained from the real and fake bioimages, we employ machine learning to perform classification, as shown in Figure 2. We consider four classification algorithms: J48, Random Forest, Random Tree and REPTree.
In the classification of medical images, algorithms based on decision trees are particularly effective. Among the most widely used, J48 and Random Tree (Dou and Meng, 2023) are relevant. The latter considers a different random subset of features at each node of the tree and does not perform pruning operations, that is, unnecessary branches are not removed. Among the models based on classification trees, one of the most used is Random Forest, an algorithm that combines several trees through bagging, i.e., the training of different models on subsets of the dataset (Frank et al., 2016; Dou and Meng, 2023). This algorithm returns more stable models and reduces variance; its aim is to improve accuracy and reduce classification error. It has proven particularly effective in several medical contexts, in particular in the classification of images depicting pathologies such as cancer (Dou and Meng, 2023).
Another classification algorithm is REPTree which, unlike the tree-based models mentioned above, prunes unnecessary branches. Furthermore, it is designed to sort numerical values only once, which makes it faster (Frank et al., 2016).
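Our experiments rely on the Weka implementations of these algorithms; purely as an illustration of the classification step, the sketch below uses scikit-learn tree-based models as rough analogues (an assumption on our part: scikit-learn's CART trees are not reimplementations of J48, Random Tree or REPTree) to train and evaluate a real-vs-generated classifier on extracted features:

# Illustrative real-vs-fake classification with tree-based models.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def evaluate_real_vs_fake(real_feats: np.ndarray, fake_feats: np.ndarray) -> dict:
    X = np.vstack([real_feats, fake_feats])
    # Label 1 = real bioimage, label 0 = GAN-generated bioimage.
    y = np.concatenate([np.ones(len(real_feats)), np.zeros(len(fake_feats))])
    models = {
        "decision_tree (J48-like)": DecisionTreeClassifier(random_state=0),
        "random_tree-like": DecisionTreeClassifier(max_features="sqrt", random_state=0),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    # 10-fold cross-validated accuracy for each model.
    return {name: cross_val_score(m, X, y, cv=10, scoring="accuracy").mean()
            for name, m in models.items()}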
3 EXPERIMENTAL ANALYSIS
In this section we present the experimental analysis
we performed aimed to demonstrate the effectiveness
of the proposed method.
3.1 Dataset
In order to collect real-world bioimages belonging to
different biomedical domains, we resort to MedM-
NIST (Yang et al., 2024), a collection of datasets
for biomedical image classification. The MedMNIST
collection includes 28x28 pixel bioimages, obtained
from real-world medical data and freely available for
research purposes.
In detail, the following datasets are exploited in the experimental analysis:
PathMNIST, consisting of histological images of colorectal cancer, stained with hematoxylin and eosin;
ChestMNIST, including chest X-rays of fourteen different diseases;
DermaMNIST, based on dermatoscopic images of skin lesions, divided into seven diseases;
OCTMNIST, containing optical coherence tomography images for the diagnosis of retinal diseases, divided into four categories;
PneumoniaMNIST, including pediatric chest radiographs for the classification of pneumonia;
RetinaMNIST, containing retinal fundus images for assessing the severity of diabetic retinopathy.
For each dataset 1000 bioimages are considered.
3.2 Experimental Settings
To understand whether the DCGAN is able to generate synthetic bioimages that are close to the real ones, the following metrics are computed.
Precision is the proportion of instances classified as positive that are actually positive. It is defined as:

Precision = TP / (TP + FP)    (1)

where TP (true positives) is the number of instances correctly identified as positive and FP (false positives) is the number of negative instances incorrectly classified as positive.
Recall instead measures the ability of a model to correctly identify all positive instances and is defined as follows:

Recall = TP / (TP + FN)    (2)

where FN (false negatives) is the number of positive instances incorrectly classified as negative.
Accuracy represents the proportion of correctly classified instances among all those present in the dataset. This measure is not very reliable when the classes within the dataset are not well balanced. Accuracy is defined as:

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (3)

where TN (true negatives) is the number of instances correctly classified as negative.
F-Measure is another metric that is particularly useful for measuring the model's ability to recognize patterns correctly. It is calculated as the harmonic mean of Precision and Recall:

F-Measure = 2 * (Precision * Recall) / (Precision + Recall)    (4)
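For illustration, all four metrics can be computed directly from the entries of the binary confusion matrix; a minimal sketch, assuming labels with 1 for the positive class and 0 for the negative class:

# Compute Precision, Recall, Accuracy and F-Measure (Eqs. 1-4)
# from binary ground-truth labels and predictions.
import numpy as np

def binary_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f_measure": f_measure}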
The classification of bioimages generated by DC-
GAN was carried out by comparing 1000 real images
with the same number of generated images, selected
at five different training epochs (i.e., 0, 24, 49, 74 and
99). The idea is to evaluate whether, as the epochs in-
crease, more and more realistic images are generated
by the DCGAN for all the analyzed datasets.
For feature extraction and model building we resort to Weka, one of the most widespread data mining tool suites, which provides implementations of several machine learning algorithms.
3.2.1 Experimental Results
In this section we show the results obtained from the
experimental analysis. In particular, we exploited:
six different datasets (i.e., PathMNIST, ChestM-
NIST, DermaMNIST, OCTMNIST, Pneumoni-
aMNIST and RetinaMNIST);
four different image filters for feature extrac-
tion (i.e., AutoColorCorrelogramFilter, Binary-
PatternsPyramidFilter, ColorLayoutFilter and FC-
THFilter);
four different classification algorithms (i.e., J48,
Random Forest, Random Tree and REPTree);
we trained a DCGAN for each dataset and generated 1000 fake images for each epoch, over 100 epochs. Thus, for each dataset, we generated 1000 images x 100 epochs = 100,000 total images.
The aim of the experimental analysis is to explore whether the images produced by the DCGAN become more similar to the real bioimages as the epochs increase. For this reason, we train a series of binary classifiers, where each classifier is trained on 1000 images generated at a given epoch and 1000 real images. We then evaluate the trend of the accuracy to draw conclusions.
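As an illustration, this per-epoch evaluation can be organized as a simple loop over the saved image batches; the sketch below reuses the illustrative helpers introduced earlier (color_layout_features and evaluate_real_vs_fake, both our own assumptions rather than the paper's code) and tracks the mean accuracy per epoch:

# Per-epoch evaluation: one binary classification task per epoch,
# 1000 real + 1000 generated images, tracking accuracy over epochs.
import numpy as np

def accuracy_trend(real_images, images_by_epoch):
    # real_images: list of HxWxC arrays; images_by_epoch: dict epoch -> list of arrays.
    real_feats = np.array([color_layout_features(img) for img in real_images])
    trend = {}
    for epoch, fakes in sorted(images_by_epoch.items()):
        fake_feats = np.array([color_layout_features(img) for img in fakes])
        scores = evaluate_real_vs_fake(real_feats, fake_feats)
        # Falling accuracy over epochs means the DCGAN output is becoming
        # harder to distinguish from the real bioimages.
        trend[epoch] = float(np.mean(list(scores.values())))
    return trend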
As previously stated, we use different feature extraction techniques and different classification algorithms for the sake of the generalizability of the proposed experimental analysis, and we present the results obtained for five epochs: the first epoch, epoch 25, epoch 50, epoch 75 and epoch 100 (i.e., the last one; reported as epochs 0, 24, 49, 74 and 99 in the tables).
Thus, for each dataset we build the following models: 4 machine learning algorithms x 4 feature extraction techniques x 100 epochs = 1600 models. Considering that we use 6 different datasets to understand the trend of the images generated by the DCGAN, in total we considered 1600 x 6 = 9600 models.
Table 1: PathMNIST Experimental analysis results.
Epoch Precision Recall F-Measure Acc.
0 0.90 0.90 0.89 0.90
24 0.81 0.81 0.81 0.81
49 0.79 0.79 0.79 0.79
74 0.81 0.81 0.81 0.81
99 0.81 0.81 0.80 0.80
Table 2: ChestMNIST Experimental analysis results.
Epoch Precision Recall F-Measure Acc.
0 0.99 0.99 0.99 0.99
24 0.99 0.99 0.99 0.99
49 0.99 0.99 0.99 0.99
74 1.00 1.00 0.99 0.99
99 0.99 0.99 0.99 0.99
We would expect that, as the epochs increase, the generated images become increasingly similar to the real bioimages; therefore a classifier trained, for example, with real images and images generated at epoch 100 should have worse performance than a classifier trained with real images and images generated at epoch 25. This is because, as the generated images become increasingly similar to the originals, the classification algorithms should find them harder and harder to discern.
Tables 1 to 6 show the average Precision, Recall, F-Measure and Accuracy at five different epochs.
For the PathMNIST dataset, whose results are shown in Table 1, the Accuracy trend was in line with expectations. In the early training phases the images were of poor quality, while, as the epochs progressed, the generator's ability to produce realistic images improved. The AutoColorCorrelogram image filter was the most effective in distinguishing between the two types of images, unlike BinaryPatternsPyramidFilter, which led to a reduction in classification performance. The Random Forest algorithm proved to be the most suitable for classifying this type of histological image. Other classifiers, such as REPTree and Random Tree, showed lower performance in the last epochs, suggesting a greater difficulty in distinguishing between real and synthetic images.
In the case of the ChestMNIST dataset, whose results are shown in Table 2, the Accuracy trend was not in line with the predictions, showing a slope contrary to expectations. The generated images were poorly characterized, especially in the last epochs, suggesting a greater difficulty of the generator with this dataset.
Table 3: DermaMNIST Experimental analysis results.
Epoch Precision Recall F-Measure Acc.
0 0.99 0.99 0.99 0.99
24 0.96 0.96 0.95 0.95
49 0.94 0.94 0.94 0.94
74 0.94 0.94 0.94 0.94
99 0.94 0.94 0.94 0.94
Table 4: OCTMNIST Experimental analysis results.
Epoch Precision Recall F-Measure Acc.
0 0.99 0.99 0.99 0.99
24 0.99 0.99 0.99 0.99
49 0.99 0.99 0.99 0.99
74 0.99 0.99 0.99 0.99
99 0.99 0.99 0.99 0.99
In the classification, the combination of ColorLayoutFilter and the Random Tree algorithm proved to be the most effective in distinguishing between the two types of images, although with worse results than for the previous dataset. With other filters, such as FCTHFilter, significant limitations were noted, due to the lack of sufficient information in the images. Among the classification algorithms, Random Forest was the one with the best performance, while REPTree showed greater effectiveness in the most advanced epochs.
For the DermaMNIST dataset, whose results are shown in Table 3, the generator showed constant progress, with a significant improvement in the images produced. In the first epochs the images lacked detail, but they improved significantly in the subsequent epochs. In this context, AutoColorCorrelogramFilter and FCTHFilter yielded an evolution of the classification metrics in line with expectations. On the contrary, BinaryPatternsPyramidFilter, although useful for texture analysis, showed a reduction in overall performance compared with the other image filters used. Random Forest proved to be the most effective classifier, while Random Tree showed inferior results. The Accuracy trend confirmed the expectations, with a progressive decrease as the epochs advanced, suggesting a good ability of the generator to produce increasingly realistic images.
As with the ChestMNIST dataset, the results for the OCTMNIST dataset, shown in Table 4, did not meet expectations. During training, less complex images than expected emerged, resulting in an almost ascending curve in the epoch-Accuracy graph.
Table 5: PneumoniaMNIST Experimental analysis results.
Epoch Precision Recall F-Measure Acc.
0 0.99 0.99 0.99 0.99
24 0.99 0.99 0.99 0.99
49 0.99 0.99 0.99 0.99
74 0.99 0.99 0.99 0.99
99 0.99 0.99 0.99 0.99
Table 6: RetinaMNIST Experimental analysis results.
Epoch Precision Recall F-Measure Acc.
0 0.99 0.99 0.99 0.99
24 0.99 0.99 0.99 0.99
49 0.99 0.99 0.99 0.99
74 0.99 0.99 0.99 0.99
99 0.99 0.99 0.99 0.99
Among the image filters used, ColorLayoutFilter allowed a greater distinction between real and generated images, in particular in combination with J48 and REPTree. BinaryPatternsPyramidFilter showed better results in terms of Accuracy, although not in line with the predictions. The Accuracy trend was not in line with expectations, with this value increasing in the last epochs. This indicates a greater difficulty in generating images, which were not sufficiently realistic.
The PneumoniaMNIST dataset, whose results are shown in Table 5, produced peculiar results. In the first epochs, the generator produced images with little detail, which progressively improved in quality until epoch 49. Subsequently, a slight worsening in the quality of the generated images was observed, with a consequent increase in classification Accuracy. BinaryPatternsPyramidFilter proved to be the best filter for the classification of pneumonia images, unlike the others, which showed mixed results. Indeed, the Accuracy values remain very high throughout the training epochs, with the exception of epoch 49 or 74, depending on the image filters and classification algorithms used. Among the latter, Random Forest was the one that provided the best results. Overall, Accuracy followed a trend similar to that expected, despite the slight increase in the last training epochs.
For the RetinaMNIST dataset, whose results are shown in Table 6, results similar to those of the previous dataset were found. As a matter of fact, in the first epochs the generated images lacked detail, while in the following ones an increase in their quality was noted, with a slight change in trend at epoch 74. The AutoColorCorrelogramFilter and BinaryPatternsPyramidFilter filters showed good results. Despite this, the combination of ColorLayoutFilter with the Random Tree algorithm was the one that produced results most consistent with expectations. The Accuracy trend is in line with expectations, showing a general improvement in the generator's capabilities, despite a slight drop in performance in the second half of training.
With the aim of providing a full overview of the experimental results, Figure 3 shows the average accuracy for each epoch (from 0 to 99) for the datasets involved in the experimental analysis.
In each plot shown in Figure 3 (one plot per dataset), the x-axis indicates the epochs, while the y-axis reports the average accuracy of the models trained with the original bioimage dataset and the fake images generated at a given epoch. We note that for some datasets, such as ChestMNIST and OCTMNIST, there is a consistent increase in accuracy when images obtained from higher epochs are considered. On the other hand, the images generated for PathMNIST, DermaMNIST, PneumoniaMNIST and RetinaMNIST exhibit a decline or fluctuation in accuracy over the epochs, indicating that in these cases the DCGAN is able to generate images more similar to the real ones as the number of epochs increases.
4 CONCLUSION AND FUTURE WORK
In this paper we explored the possibility of exploiting GANs to generate realistic synthetic images, particularly in contexts such as the representation of common pathologies. In several cases, such as PathMNIST and RetinaMNIST, the results confirmed that the model is able to generate visually realistic images: in the classification process, the Accuracy trend showed a progressive decrease over the training epochs, indicating an increasing verisimilitude of the synthetic images. However, for some datasets, including ChestMNIST and OCTMNIST, the generated images did not reach the expected quality, resulting in poor detail and remaining easily distinguishable from real images. The limitations found, such as the non-uniform quality of the images generated for some datasets, indicate that there is still room for improvement in the use of GANs for biomedical image generation. In parallel, the evolution of GANs raises ethical and security issues. It will be necessary to implement measures to ensure the responsible use of these technologies, ensuring that synthetic images are not misused.
Strengthening systems that can accurately distinguish
between real and generated images, along with the
definition of guidelines for the use of GANs in the
medical field, will be crucial to ensure transparency in the use of these technologies.
Figure 3: Average Accuracy as the epochs increase, for the multi-case study.
ACKNOWLEDGEMENTS
This work has been partially supported by EU DUCA,
EU CyberSecPro, SYNAPSE, PTR 22-24 P2.01 (Cy-
bersecurity) and SERICS (PE00000014) under the
MUR National Recovery and Resilience Plan funded
by the EU - NextGenerationEU projects, by MUR -
REASONING: foRmal mEthods for computAtional
analySis for diagnOsis and progNosis in imagING -
PRIN, e-DAI (Digital ecosystem for integrated anal-
ysis of heterogeneous health data related to high-
impact diseases: innovative model of care and re-
search), Health Operational Plan, FSC 2014-2020,
PRIN-MUR-Ministry of Health, the National Plan for
NRRP Complementary Investments D3 4 Health:
Digital Driven Diagnostics, prognostics and therapeu-
tics for sustainable Health care, Progetto MolisCTe,
Ministero delle Imprese e del Made in Italy, Italy,
CUP: D33B22000060001, FORESEEN: FORmal
mEthodS for attack dEtEction in autonomous driv-
iNg systems CUP N.P2022WYAEW and ALOHA: a
framework for monitoring the physical and psycho-
logical health status of the Worker through Object de-
tection and federated machine learning, Call for Col-
laborative Research BRiC -2024, INAIL.
REFERENCES
Dou, Y. and Meng, W. (2023). Comparative analysis of Weka-based classification algorithms on medical diagnosis datasets. Technology and Health Care, 31(S1):397-408.
Frank, E., Hall, M. A., and Witten, I. H. (2016). The WEKA
Workbench. Morgan Kaufmann, 4th edition. On-
line Appendix for ”Data Mining: Practical Machine
Learning Tools and Techniques”.
Fu, H., Cheng, J., Xu, Y., Wong, D. W. K., Liu, J., and Cao,
X. (2018). Joint optic disc and cup segmentation based
on multi-label deep network and polar transformation.
IEEE transactions on medical imaging, 37(7):1597–
1605.
He, H., Yang, H., Mercaldo, F., Santone, A., and Huang, P.
(2024). Isolation forest-voting fusion-multioutput: A
stroke risk classification method based on the multidi-
mensional output of abnormal sample detection. Com-
puter Methods and Programs in Biomedicine, page
108255.
Huang, P., He, P., Tian, S., Ma, M., Feng, P., Xiao, H.,
Mercaldo, F., Santone, A., and Qin, J. (2022). A vit-
amc network with adaptive model fusion and multiob-
jective optimization for interpretable laryngeal tumor
grading from histopathological images. IEEE Trans-
actions on Medical Imaging, 42(1):15–28.
Huang, P., Li, C., He, P., Xiao, H., Ping, Y., Feng, P., Tian,
S., Chen, H., Mercaldo, F., Santone, A., et al. (2024).
Mamlformer: Priori-experience guiding transformer
network via manifold adversarial multi-modal learn-
ing for laryngeal histopathological grading. Informa-
tion Fusion, 108:102333.
Huang, P., Tan, X., Zhou, X., Liu, S., Mercaldo, F., and
Santone, A. (2021). Fabnet: fusion attention block
and transfer learning for laryngeal cancer tumor grad-
ing in p63 ihc histopathology images. IEEE Journal
of Biomedical and Health Informatics, 26(4):1696–
1707.
Huang, P., Zhou, X., He, P., Feng, P., Tian, S., Sun, Y., Mer-
caldo, F., Santone, A., Qin, J., and Xiao, H. (2023).
Interpretable laryngeal tumor grading of histopatho-
logical images via depth domain adaptive network
with integration gradient cam and priori experience-
guided attention. Computers in Biology and Medicine,
154:106447.
Mercaldo, F., Brunese, L., Martinelli, F., Santone, A., and
Cesarelli, M. (2023). Generative adversarial net-
works in retinal image classification. Applied Sci-
ences, 13(18):10433.
Mercaldo, F., Zhou, X., Huang, P., Martinelli, F., and San-
tone, A. (2022). Machine learning for uterine cervix
screening. In 2022 IEEE 22nd International Confer-
ence on Bioinformatics and Bioengineering (BIBE),
pages 71–74. IEEE.
Orlando, J. I., Barbosa Breda, J., Van Keer, K., Blaschko,
M. B., Blanco, P. J., and Bulant, C. A. (2018).
Towards a glaucoma risk index based on sim-
ulated hemodynamics from fundus images. In
Medical Image Computing and Computer Assisted
Intervention–MICCAI 2018: 21st International Con-
ference, Granada, Spain, September 16-20, 2018,
Proceedings, Part II 11, pages 65–73. Springer.
Yang, J., Shi, R., Wei, D., Liu, Z., Zhao, L., Ke, B., Pfister, H., and Ni, B. (2024). MedMNIST+: 18x standardized datasets for 2D and 3D biomedical image classification with multiple size options: 28 (MNIST-like), 64, 128, and 224. Dataset, version 3.0, published on January 16, 2024. Accessed: 26 September 2024.
Zhou, X., Tang, C., Huang, P., Mercaldo, F., Santone, A.,
and Shao, Y. (2021). Lpcanet: classification of laryn-
geal cancer histopathological images using a cnn with
position attention and channel attention mechanisms.
Interdisciplinary Sciences: Computational Life Sci-
ences, 13(4):666–682.
Zhou, X., Tang, C., Huang, P., Tian, S., Mercaldo, F., and
Santone, A. (2023). Asi-dbnet: an adaptive sparse
interactive resnet-vision transformer dual-branch net-
work for the grading of brain cancer histopathological
images. Interdisciplinary Sciences: Computational
Life Sciences, 15(1):15–31.