U-Net in Medical Imaging: A Practical Pathway for AI Integration in
Healthcare
Martin Kryl, Pavel Košan, Petr Včelák and Jana Klečková
Department of Computer Science and Engineering, University of West Bohemia, Univerzitni 8, Plzen, Czech Republic
{kryl, vcelak, kleckova}@kiv.zcu.cz
Keywords:
Medical Imaging, Deep Learning, U-Net, Clinical AI, Image Segmentation, Healthcare Technology.
Abstract:
As AI transforms medical imaging, this paper positions U-Net as a practical and enduring choice for segmenta-
tion tasks in constrained clinical environments. Despite rapid advancements in architectures like transformers
and hybrid models, U-Net remains highly relevant due to its simplicity, efficiency, and interpretability, particu-
larly in settings with limited computational resources and data availability. By exploring modifications such as
residual connections and the Tversky loss function, we argue that incremental refinements to U-Net can bridge
the gap between current clinical needs and the potential of more advanced AI tools. This paper advocates for
a balanced approach, combining accessible enhancements with hybrid strategies, such as radiologist-informed
labeling and advanced preprocessing, to ensure immediate impact while building a foundation for future in-
novation. U-Net’s adaptability positions it as both a cornerstone of today’s AI integration in healthcare and a
stepping stone toward adopting next-generation models.
1 INTRODUCTION
In recent years, deep learning has revolutionized med-
ical imaging, offering advanced tools that enhance di-
agnostic support and improve the accuracy of medi-
cal data analysis. The healthcare sector, which gener-
ates vast volumes of data through modalities like CT,
MRI, and X-ray, presents an ideal opportunity for AI
applications to improve diagnostic efficiency and re-
liability. However, practical integration within clini-
cal environments remains challenging, often requiring
a balance between advanced model capabilities and
healthcare’s data and infrastructure limitations (Ron-
neberger et al., 2015).
Among the widely adopted models in medical
imaging, U-Net has become foundational for im-
age segmentation, especially due to its efficient ar-
chitecture and success even with limited data. Ini-
tially designed for biomedical tasks, U-Net has been
adapted to various medical imaging applications, con-
sistently demonstrating reliable segmentation results
(Azad et al., 2024). Despite its strengths, the rapid de-
velopment of alternative architectures—such as trans-
formers, GANs, and hybrid models—has raised ques-
tions about U-Net’s continued relevance. Nonethe-
less, U-Net and its variants are still favored in settings
constrained by limited data, computational power,
and interpretability needs, making it a practical choice
in many clinical contexts (Ronneberger et al., 2015;
Azad et al., 2024).
This paper assesses a modified U-Net model tai-
lored for brain CT scan segmentation, focusing on
its efficacy and clinical viability. This model utilizes
the Tversky loss function to address class imbalance
(Salehi et al., 2017). The achieved results indicate general effectiveness but reveal limitations in boundary precision. These findings suggest that U-Net's ar-
chitecture, even with its limitations, offers a balanced
and pragmatic approach for clinical use. This is par-
ticularly relevant as healthcare facilities continue to
face significant barriers in adopting more complex ar-
chitectures, underlining the ongoing relevance of U-
Net in real-world medical imaging.
As the field evolves, architectures like Vision
Transformers and advanced CNNs hold promise for
greater accuracy and flexibility (Shamshad et al.,
2023). However, their requirements for extensive
computational resources and large datasets may hin-
der clinical feasibility (Shamshad et al., 2023). Con-
sequently, this paper advocates for continuous refine-
ment of U-Net-based models, emphasizing an ap-
proach that prioritizes clinical accessibility. By en-
hancing U-Net’s robustness and adaptability, health-
care providers can leverage AI advancements within
current infrastructural constraints while building a
foundation for future, more sophisticated integra-
tions. By progressively enhancing foundational mod-
els, healthcare systems can lay the groundwork for
incorporating more complex models, facilitating AI-
driven improvements in medical imaging.
2 BACKGROUND AND
RELEVANCE
The evolution of deep learning has significantly
shaped medical imaging, enabling precise analy-
sis and insights through models trained on exten-
sive datasets. Early convolutional neural networks
(CNNs), such as U-Net, were specifically designed
to address the complexities of biomedical image seg-
mentation. U-Net’s encoder-decoder structure, com-
plemented by skip connections, allows the model to
capture both high-level features and fine-grained de-
tails, making it highly effective for various medical
segmentation tasks. (Ronneberger et al., 2015)
While deep learning continues to progress, and
newer models are emerging with potential improve-
ments in accuracy and generalization, the accessibil-
ity of these models remains limited. Transformers, for
instance, introduce self-attention mechanisms that en-
able the model to dynamically assess the importance
of different image regions. Generative Adversarial
Networks (GANs) offer potential for generating high-
fidelity images, useful for data augmentation and en-
hancement. Hybrid models that combine CNNs with
transformer-based layers have also been explored to
leverage the strengths of both architectures. (Pu et al.,
2024)
However, these advanced models often require
significant memory and computational resources, de-
manding high-performance hardware that may be un-
available in many clinical settings. Additionally, their
reliance on large, diverse datasets poses a challenge
in medical imaging, where data access is often con-
strained due to privacy considerations and limited
variability in available datasets. This makes complex
architectures less feasible in many clinical settings,
where interpretability and accountability are also crit-
ical for diagnostic decision-making. (Ronneberger
et al., 2015)
Despite these recent advances, U-Net and its
derivatives continue to hold relevance, particularly in
constrained environments. U-Net’s simplicity makes
it feasible to implement on accessible hardware, yet it
still produces reliable segmentation results. By focus-
ing on incremental improvements, such as the Tver-
sky loss function or selective attention mechanisms,
healthcare providers can leverage U-Net’s capabilities
as a bridge toward integrating more advanced archi-
tectures over time. (Ronneberger et al., 2015)
This paper advocates for a balanced approach that
prioritizes the refinement and application of U-Net-
based models in real-world clinical contexts. By fo-
cusing on incremental improvements to the U-Net ar-
chitecture, such as enhanced loss functions and in-
creased robustness, healthcare providers can leverage
deep learning’s benefits within existing infrastructural
limits, paving the way for gradual adoption of cutting-
edge models as technology and data accessibility im-
prove.
3 METHODOLOGY AND MODEL
ARCHITECTURE
To address the specific needs of brain CT scan seg-
mentation in a clinical setting, we utilized a modi-
fied U-Net model designed to handle challenges re-
lated to class imbalance and constrained data reso-
lution. U-Net’s encoder-decoder structure, with skip
connections that preserve spatial information across
layers, provides a strong foundation for medical im-
age segmentation tasks where capturing both detailed
and high-level features is critical. This architectural
choice is particularly advantageous in settings with
limited computational resources and data, making it
an accessible yet effective option for clinical applica-
tions.
3.1 Modified U-Net Architecture
The U-Net model was adapted in several ways to im-
prove its performance on the task of brain CT segmen-
tation. One modification was the inclusion of resid-
ual connections (He et al., 2016) within the encoder
and decoder blocks. These residual connections al-
low the network to add activations from earlier lay-
ers directly to the outputs of deeper layers, enhancing
the model’s ability to retain and propagate contextual
information. By summing the activations, this modi-
fication allows the U-Net model to capture both local
(edges and small structures) and global features (over-
all context of the brain scan) more effectively, making
it suitable for identifying subtle structures in medical
images, such as lesions or infarctions.
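To make this modification concrete, the sketch below shows one way such a residual block can be written (a minimal PyTorch illustration under assumed layer sizes and naming, not the authors' exact implementation):

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Illustrative residual convolutional block for a U-Net encoder/decoder stage."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        # A 1x1 convolution matches channel counts so the shortcut can be summed.
        self.shortcut = (nn.Identity() if in_channels == out_channels
                         else nn.Conv2d(in_channels, out_channels, kernel_size=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        # Residual connection: earlier activations are added to the block output.
        return self.relu(out + self.shortcut(x))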
Another modification was the use of the Tversky
loss function instead of the standard cross-entropy
loss. The Tversky loss addresses class imbalance by
allowing for fine-tuning of false positives and false
negatives, which is especially beneficial in medical
segmentation tasks where certain regions may be less
prominent. By adjusting this balance, the model be-
comes more effective in capturing smaller regions that
may otherwise be overlooked in traditional loss func-
tion setups (Sudre et al., 2017; Abraham and Khan,
2019).
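As a concrete illustration, a soft Tversky loss of this kind can be written as follows (a minimal PyTorch sketch; the alpha/beta weights and the smoothing constant are illustrative assumptions rather than the configuration used in this work):

import torch

def tversky_loss(pred: torch.Tensor, target: torch.Tensor,
                 alpha: float = 0.3, beta: float = 0.7,
                 smooth: float = 1e-6) -> torch.Tensor:
    # pred: predicted foreground probabilities in [0, 1]; target: binary reference mask.
    pred = pred.reshape(-1)
    target = target.reshape(-1)
    tp = (pred * target).sum()              # true positives
    fp = (pred * (1.0 - target)).sum()      # false positives, weighted by alpha
    fn = ((1.0 - pred) * target).sum()      # false negatives, weighted by beta
    tversky_index = (tp + smooth) / (tp + alpha * fp + beta * fn + smooth)
    return 1.0 - tversky_index

Setting beta above alpha penalizes false negatives more heavily, which is the behaviour that helps keep small, less prominent regions from being ignored.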
In addition to architectural modifications, data
augmentation techniques inspired by (Shorten and
Khoshgoftaar, 2019; Nemoto et al., 2021) were ap-
plied to improve model robustness given the limited
dataset size. This preprocessing approach aligns with
the model’s goal of achieving high segmentation ac-
curacy without requiring an extensive dataset, which
is often impractical in clinical environments due to
limited data access, privacy concerns, or the effort needed to prepare large, high-quality datasets for model
learning.
3.2 Rationale for U-Net Selection
The decision to utilize a modified U-Net over more re-
cent architectures, such as Vision Transformers or hy-
brid models, was driven by several practical consider-
ations. Unlike more complex models, U-Net’s archi-
tecture is relatively lightweight and can be deployed
on standard hardware configurations commonly avail-
able in healthcare facilities. This simplicity, com-
bined with U-Net’s demonstrated effectiveness in seg-
mentation tasks, offers a feasible approach to intro-
ducing AI-driven diagnostics in clinical settings with-
out extensive infrastructure upgrades.
Furthermore, U-Net’s interpretability provides an
additional advantage over newer architectures. In
a clinical setting, where transparency is crucial, U-
Net’s straightforward encoder-decoder structure al-
lows for greater model interpretability, making it eas-
ier for clinicians to understand and trust the segmen-
tation outputs. Given that interpretability and ac-
countability are critical for clinical adoption as noted
in (Siddique et al., 2021), U-Net’s design strikes a
balance between accuracy and comprehensibility that
newer, more complex architectures may not offer as
readily.
3.3 Dataset Preparation and Training
The model was trained on a curated local dataset
of 50 brain CT image series. Each series included
core and penumbra segmentation masks derived from
automated and semi-automated techniques. Specif-
ically, penumbra regions were automatically labeled
using a custom script that analyzed cerebral blood
flow (CBF) and cerebral blood volume (CBV) maps,
leveraging standard clinical thresholds to identify is-
chemic but salvageable tissue. These initial masks
were subsequently reviewed and validated by an ex-
perienced radiologist to ensure that the segmentations
aligned with clinical expectations. Series with dis-
puted or ambiguous regions were excluded from the
dataset, ensuring high-quality annotations.
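A hypothetical outline of such threshold-based labeling is sketched below; the decision rule (reduced CBF with relatively preserved CBV) and the threshold parameters are placeholders, since the custom script and the clinical thresholds it applies are not detailed in this paper:

import numpy as np

def label_penumbra(cbf: np.ndarray, cbv: np.ndarray,
                   cbf_threshold: float, cbv_threshold: float) -> np.ndarray:
    # Mark voxels that are hypoperfused (low CBF) but whose blood volume is
    # still preserved (CBV above threshold), i.e. ischemic yet salvageable tissue.
    low_flow = cbf < cbf_threshold
    preserved_volume = cbv >= cbv_threshold
    return (low_flow & preserved_volume).astype(np.uint8)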
Images in each series were downsampled to a
resolution of 256x256 pixels to optimize process-
ing efficiency while retaining the essential features
for segmentation. This resolution was selected to
align with realistic data limitations in clinical environ-
ments, where high-resolution images may not always
be feasible to handle due to storage and processing
constraints. Furthermore, the chosen resolution and
model design ensure that segmentation tasks can be
performed swiftly, an important consideration in clin-
ical workflows where timely results are crucial.
Data preprocessing included standard normal-
ization to ensure consistent intensity ranges across
images, improving model stability during training.
Given the relatively small size of the dataset, aug-
mentation techniques including random rotations, im-
age translations and offsets, horizontal and vertical
flips, and brightness adjustments were applied to en-
hance model robustness in accordance with findings in
(Shorten and Khoshgoftaar, 2019; Siddique et al.,
2021). Augmentations were only applied to slices
containing regions of interest to maximize the rel-
evance of the augmented data while avoiding un-
necessary transformations of non-informative slices.
These augmentations expanded the effective size of
the dataset and mitigated the risk of overfitting.
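The augmentation step can be sketched as follows, using the albumentations library so that each image and its mask receive the same geometric transform; the parameter ranges are illustrative assumptions, not the values used in training:

import albumentations as A

augment = A.Compose([
    A.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.0, rotate_limit=15, p=0.7),  # rotations, translations and offsets
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.0, p=0.5),   # brightness adjustment
])

# Applied only to slices containing a region of interest, as described above;
# ct_slice and label_mask are placeholder names for a NumPy image/mask pair.
# augmented = augment(image=ct_slice, mask=label_mask)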
The model was trained over 35 epochs, with the
stopping point determined empirically based on the
progression of the loss function on the validation set.
This early stopping criterion was chosen to prevent
overfitting while ensuring adequate convergence.
4 RESULTS AND
PERFORMANCE EVALUATION
The modified U-Net model’s performance was eval-
uated using key metrics standard in medical image
segmentation: the Dice coefficient and the Tversky
coefficient. These metrics assess the overlap accu-
racy between the predicted segmentation and the ref-
erence labels, providing insights into general segmen-
tation accuracy and the model’s handling of class im-
balances.
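For reference, with P denoting the predicted mask and G the reference mask, the standard definitions of the two metrics are

\[ \mathrm{Dice}(P, G) = \frac{2\,|P \cap G|}{|P| + |G|}, \qquad \mathrm{TI}_{\alpha,\beta}(P, G) = \frac{|P \cap G|}{|P \cap G| + \alpha\,|P \setminus G| + \beta\,|G \setminus P|}, \]

where \(\alpha\) and \(\beta\) weight false positives and false negatives, respectively; with \(\alpha = \beta = 0.5\) the Tversky index reduces to the Dice coefficient.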
4.1 Performance Metrics and Outcomes
On the validation set, the model achieved a Dice coef-
ficient of 0.61 and a Tversky coefficient of 0.67. The
Dice coefficient reflects the model’s overall perfor-
mance in matching target segmentation regions, while
the Tversky coefficient indicates its effectiveness in
managing class imbalances. The application of the
Tversky loss function during training enhanced the
model’s sensitivity to less prominent regions in the
CT images, such as smaller lesions, which might oth-
erwise have been underrepresented.
These metrics suggest that the model can approx-
imate the general area of the segmentation target ac-
curately, though challenges remain in achieving pre-
cise boundary alignment. While these metrics may
appear modest, they reflect the inherent difficulty of
the task: segmenting small, subtle structures like is-
chemic penumbra regions from noisy CT images.
These challenges are further compounded by the con-
strained dataset size (50 series) and the necessity of
downsampling images to 256x256 resolution for prac-
tical deployment. This trade-off aligns with known
limitations of U-Net architectures in high-precision
medical applications, where complex structures may
require more refined model adjustments or larger,
higher-resolution datasets.
Additionally, manual annotations of medical im-
ages often exhibit variability between radiologists,
with inter-annotator Dice scores sometimes falling
within similar ranges in comparable tasks. This
model’s performance aligns with the general accu-
racy achieved by a domain expert annotating a new
dataset, though precise quantitative comparisons were
unavailable. Nonetheless, the results indicate that the
modified U-Net, even with its relatively simple struc-
ture, can deliver meaningful outcomes in scenarios
with limited data and computational resources.
4.2 Comparative Analysis with Local
Delineation Tool
To evaluate the modified U-Net model’s effectiveness,
a comparative analysis was performed using segmen-
tation outputs from a locally developed tool designed
for infarct core delineation in brain CT imaging.
The tool is a successor to the Delineator published in
(Maule et al., 2013). Although the tool has not be-
come widely adopted outside its original setting, it
provides a baseline segmentation that facilitates data
labeling. This allowed us to generate more labeled
training data than would have been feasible with man-
ual radiologist labeling alone.
The use of the tool enabled approximate delin-
eation of the infarct core regions in the dataset, pro-
viding a reference standard against which the U-Net
model could be evaluated. However, it is important
to acknowledge that both the software tool and man-
ual radiologist annotations may contain inaccuracies.
Without a rigorous, multi-annotator labeling process, ob-
jectively determining the segmentation accuracy of
either approach remains challenging.
Despite these limitations, the U-Net model’s out-
puts closely aligned with the broad areas identified
by the tool, capturing the primary regions of interest
with reasonable accuracy. This consistency suggests
that the U-Net model, even with its simpler archi-
tecture, is suitable for approximate segmentation in
cases where precise boundary conformity may be sec-
ondary to general region identification. In scenarios
where exact segmentation is not strictly required, U-
Net offers a viable alternative to more complex soft-
ware solutions, especially in settings where computa-
tional and resource constraints are significant consid-
erations.
4.3 Interpretation of Results and
Trade-Offs
The performance of the modified U-Net model re-
flects both the strengths and trade-offs of using a
U-Net-based approach in medical imaging. The
model succeeded in identifying the regions of inter-
est broadly, providing a valuable tool for clinicians
seeking approximate segmentation. However, its lim-
itations in fine-grained boundary alignment indicate
that, while U-Net can approximate the segmentation
task, it may not be able to fully replace specialized
software without further enhancements.
These results highlight a pragmatic pathway for
using U-Net in real-world clinical settings: the model
can offer reliable, interpretable segmentation without
extensive infrastructure requirements, but as noted in
(Isensee et al., 2021), additional adjustments or hy-
brid approaches may be necessary for applications re-
quiring high precision. In constrained environments,
where data access, computational power, and inter-
pretability are significant considerations, this U-Net-
based model demonstrates that effective segmentation
is achievable with thoughtful modifications, even as
the field of medical imaging continues to evolve.
5 POSITION STATEMENT
As AI integrates more deeply into healthcare, U-Net’s
practical architecture and strong performance make it
highly suited for clinical applications, especially un-
der the infrastructural limitations that constrain many
medical settings. While novel architectures—such
as transformers and hybrid networks—offer higher
precision and richer context through attention mech-
anisms, they demand substantial computational re-
sources and interpretability solutions, challenging im-
mediate adoption (Henry et al., 2022). This work does
not argue that U-Net supersedes more advanced ar-
chitectures; rather, it highlights U-Net’s enduring rel-
evance and adaptability in settings where simplicity,
efficiency, and interpretability are critical.
This paper advocates for incrementally enhancing
U-Net-based models, emphasizing simplicity, clin-
ical interpretability, and targeted modifications like
the Tversky loss function to manage class imbal-
ances. By refining U-Net within these limits, health-
care providers can implement AI-based segmentation
today without the extensive resources newer models
often require.
Moreover, as a foundational model, U-Net can
be further developed to incorporate aspects of ad-
vanced architectures, such as the attention mechanisms surveyed in
(Pu et al., 2024). This hybridization pathway en-
ables healthcare facilities to integrate transformer-
based attention or other advanced techniques selec-
tively, building a bridge to complex, data-intensive
models while meeting current clinical needs. This
progressive approach enables real-world impact today
while paving the way for advanced model integration
as data, computational resources, and clinical AI fa-
miliarity expand.
6 FUTURE WORK
We acknowledge the potential for further strength-
ening our findings through additional experiments
and comparisons. An ablation study examining the
contributions of the proposed modifications, such
as residual connections and the Tversky loss func-
tion, could provide deeper insights into their indi-
vidual and combined impacts on segmentation per-
formance. Similarly, an empirical comparison be-
tween U-Net and modern architectures, such as Vi-
sion Transformers or hybrid models, would help to
better demonstrate the trade-offs between computa-
tional efficiency, data requirements, and segmentation
accuracy. Finally, detailed experiments incorporating
other state-of-the-art approaches, particularly those
leveraging hybrid strategies or advanced preprocess-
ing techniques, could contextualize U-Net’s perfor-
mance within a broader framework. These considerations lead to the specific research directions highlighted below.
6.1 Hybrid Labeling with Radiologist
Input
Combining radiologist oversight with automated seg-
mentation creates a hybrid labeling approach that en-
hances data quality and enables valuable model re-
finements. Allowing experts to adjust AI outputs dur-
ing the labeling process improves segmentation re-
liability and provides feedback loops that contribute
to ongoing model improvements. This collaborative
strategy leverages the strengths of both human exper-
tise and machine efficiency, which could lead to more
accurate and clinically relevant outcomes.
6.2 Advanced Preprocessing with
Complex Models
Leveraging sophisticated architectures, such as trans-
formers, for preprocessing can enrich datasets for
simpler models like U-Net. This tiered approach pro-
vides high-quality features that simpler models can
efficiently utilize, allowing computationally feasible
models to benefit from the strengths of cutting-edge
feature extraction. By integrating advanced prepro-
cessing techniques, the performance of established
models can be significantly enhanced without neces-
sitating substantial computational resources.
6.3 Navigating Diagnostic Uncertainty
Acknowledging the absence of absolute truth in med-
ical imaging, future work should address inherent in-
accuracies in both human and software assessments.
Developing confidence metrics and quality-assurance
feedback loops, particularly with radiologist input,
can enhance reliability, helping to mitigate biases
across human and AI judgments. Implementing these
measures ensures that diagnostic processes are trans-
parent and that uncertainties are systematically man-
aged, leading to more trustworthy clinical decisions.
6.4 Integration of Hybrid Models and
Long-Term Deployment Studies
Research should explore the gradual integration of
hybrid U-Net models into clinical workflows, in-
corporating components from advanced architectures
like transformers without overwhelming clinical re-
sources. By deploying these hybrid models in clini-
cal settings for extended studies, researchers can ad-
dress practical deployment challenges, contributing
insights that prepare healthcare facilities for eventual
transitions to fully advanced models. This phased ap-
proach allows for the assessment of real-world per-
formance and the identification of necessary adjust-
ments, facilitating a smoother adoption of AI tech-
nologies in healthcare.
7 CONCLUSION
This paper has emphasized the enduring relevance
and adaptability of U-Net-based architectures in med-
ical imaging, highlighting their effectiveness and
practicality in clinical environments often constrained
by limited resources and data. U-Net’s simplicity,
interpretability, and robustness make it particularly
well-suited to meet healthcare’s immediate needs, of-
fering reliable segmentation with manageable compu-
tational demands. By integrating targeted enhance-
ments, U-Net-based models serve as a bridge between
traditional diagnostic tools and the transformative po-
tential of deep learning.
Incremental enhancements, such as attention
mechanisms and refined loss functions, allow U-Net
to improve without requiring significant infrastruc-
ture upgrades. These modifications provide a prac-
tical pathway for increasing segmentation accuracy
while preparing for the eventual integration of more
advanced architectures.
Additionally, recognizing the inherent lack of an
objective truth in medical imaging, this paper advo-
cates for hybrid approaches that incorporate radiol-
ogist feedback and advanced preprocessing methods
to enhance data quality and model accuracy. These
pragmatic strategies facilitate AI adoption in clinical
workflows while supporting the development of ro-
bust, quality-assurance frameworks to reduce biases
in both AI outputs and clinician interpretations.
Ultimately, this paper supports a balanced, pro-
gressive approach to AI integration in healthcare. U-
Net serves as a practical bridge between traditional
tools and next-generation AI, enabling real-world im-
pact today while laying the groundwork for sophisti-
cated, data-intensive models in the future.
REFERENCES
Abraham, N. and Khan, N. M. (2019). A novel focal tver-
sky loss function with improved attention u-net for le-
sion segmentation. In 2019 IEEE 16th international
symposium on biomedical imaging (ISBI 2019), pages
683–687. IEEE.
Azad, R., Aghdam, E. K., Rauland, A., Jia, Y., Avval, A. H.,
Bozorgpour, A., Karimijafarbigloo, S., Cohen, J. P.,
Adeli, E., and Merhof, D. (2024). Medical image seg-
mentation review: The success of u-net. IEEE Trans-
actions on Pattern Analysis and Machine Intelligence.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-
ual learning for image recognition. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 770–778.
Henry, E. U., Emebob, O., and Omonhinmin, C. A. (2022).
Vision transformers in medical imaging: A review.
arXiv preprint arXiv:2211.10043.
Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., and
Maier-Hein, K. H. (2021). nnu-net: a self-configuring
method for deep learning-based biomedical image
segmentation. Nature methods, 18(2):203–211.
Maule, P., Klečková, J., Rohan, V., and Tupý, R. (2013).
Automated infarction core delineation using cere-
bral and perfused blood volume maps. International
journal of computer assisted radiology and surgery,
8:787–797.
Nemoto, T., Futakami, N., Kunieda, E., Yagi, M., Takeda,
A., Akiba, T., Mutu, E., and Shigematsu, N. (2021).
Effects of sample size and data augmentation on u-
net-based automatic segmentation of various organs.
Radiological Physics and Technology, 14:318–327.
Pu, Q., Xi, Z., Yin, S., Zhao, Z., and Zhao, L. (2024). Ad-
vantages of transformer and its application for medical
image segmentation: a survey. BioMedical Engineer-
ing OnLine, 23(1):14.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-
net: Convolutional networks for biomedical image
segmentation. In Medical image computing and
computer-assisted intervention–MICCAI 2015: 18th
international conference, Munich, Germany, October
5-9, 2015, proceedings, part III 18, pages 234–241.
Springer.
Salehi, S. S. M., Erdogmus, D., and Gholipour, A. (2017).
Tversky loss function for image segmentation using
3d fully convolutional deep networks. In International
workshop on machine learning in medical imaging,
pages 379–387. Springer.
Shamshad, F., Khan, S., Zamir, S. W., Khan, M. H., Hayat,
M., Khan, F. S., and Fu, H. (2023). Transformers in
medical imaging: A survey. Medical Image Analysis,
88:102802.
Shorten, C. and Khoshgoftaar, T. M. (2019). A survey on
image data augmentation for deep learning. Journal
of big data, 6(1):1–48.
Siddique, N., Paheding, S., Elkin, C. P., and Devabhaktuni,
V. (2021). U-net and its variants for medical image
segmentation: A review of theory and applications.
IEEE access, 9:82031–82057.
Sudre, C. H., Li, W., Vercauteren, T., Ourselin, S., and
Jorge Cardoso, M. (2017). Generalised dice over-
lap as a deep learning loss function for highly unbal-
anced segmentations. In Deep Learning in Medical
Image Analysis and Multimodal Learning for Clini-
cal Decision Support: Third International Workshop,
DLMIA 2017, and 7th International Workshop, ML-
CDS 2017, Held in Conjunction with MICCAI 2017,
Québec City, QC, Canada, September 14, Proceed-
ings 3, pages 240–248. Springer.