SBC-UNet3+: Classification of Nuclei in Histology Imaging Based on
Multi Branch UNET3+ Segmentation Model
Roua Jaafar 1,2,3 a, Hedi Yazid 3 b, Wissem Farhat 1 and Najoua Essoukri Ben Amara 1 c
1 Université de Sousse, Ecole Nationale d'Ingénieurs de Sousse, LATIS - Laboratory of Advanced Technology and Intelligent Systems, 4023, Sousse, Tunisia
2 Université de Sousse, Institut Supérieur d'Informatique et des Technologies de Communication de Sousse, 4011, Sousse, Tunisia
3 Institut Supérieur d'Electronique de Paris (ISEP), 10 rue de Vanves, Issy-les-Moulineaux, 92130, France
{jaafar.roua, wissemfarhat07}@gmail.com, hedi.yazid@isep.fr, najoua.benamara@eniso.rnu.tn
Keywords:
Histology, Multi-Class Segmentation, Boundary Detection, Classification, UNET3+, Multi-Branch.
Abstract:
Histological images are crucial for cancer diagnosis and treatment, providing valuable information about cel-
lular structures and abnormalities. Deep learning has emerged as a promising tool to automate the analysis of
histological images, especially for tasks like cell segmentation and classification, which aim to improve cancer
detection efficiency and accuracy. Existing methods show promising results in segmentation and classification but are limited in handling overlapping nuclei and boundary delineation. We propose a cell segmentation
and classification approach applied to histological images, part of a Content-Based Histopathological Im-
age Retrieval (CBHIR) project. By integrating boundary detection and classification-guided modules, our
approach overcomes the limitations of existing methods, enhancing segmentation precision and robustness.
Our approach leverages deep learning models and the UNET3+ architecture, comparing its performance with
state-of-the-art methods on the PanNuke Dataset (Gamper et al., 2020). Our multitask approach outperforms
current models in F1-score and recall, demonstrating its potential for accurate and efficient cancer diagnosis.
1 INTRODUCTION
Segmentation of the key components of whole-slide images (WSI) is essential for
cancer diagnosis and other pathology-related analyses. It involves identifying and isolating structures
such as nuclei or cells, which is necessary for accurate diagnostics. Accurate segmentation significantly affects the
diagnostic process, as it allows differentiating between tissue and nucleus types, which is
essential for patient assessment (Chen et al., 2019).
Challenges in WSI segmentation include variability
in nuclei sizes, shapes, clustering, and overlapping,
contributing to under-segmentation. In addition, the
structure of tissues, such as glands, can be highly de-
generate, making discrimination difficult. Traditional
methods like morphological processing and cluster-
ing struggle with complex variations. Deep learning
algorithms, such as U-NET, DeepLabV3, and GANs,
a: https://orcid.org/0000-0003-4210-2665
b: https://orcid.org/0000-0001-8188-3797
c: https://orcid.org/0000-0001-7914-0644
PanNuke dataset: https://warwick.ac.uk/fac/cross_fac/tia/data/pannuke
address these challenges by learning complex pat-
terns and structures. State-of-the-art models, such as
Cellpose and OmniPose, improve generalization but
struggle with boundary detection and overlapping nu-
clei in dense images.
Histological segmentation enables applications
like tissue and cell analysis. Our proposed SBC-
UNet3+ integrates segmentation, boundary detection,
and classification-guided modules, enhancing seg-
mentation precision for overlapping and irregular nu-
clei.
To summarize, histological image segmentation
uses traditional and deep learning methods to address
challenges in WSI. Our proposed multi-branch SBC-
UNet3+ improves cell segmentation, boundary de-
tection, and classification accuracy, facilitating better
cancer diagnosis and research. The remainder of the paper is organized as follows. Section 2 briefly intro-
duces previous approaches related to cell segmenta-
tion and classification for histological images. The
proposed approach for segmentation and classifica-
tion of Nuclei in Histology Imaging based on Multi-
Branch UNET3+ is presented in Section 3. In Section
4, experiments are carried out and the results obtained on the PanNuke dataset are presented. The conclusion and future work are drawn in Section 5.
2 RELATED WORKS
Several studies have investigated deep learning-based
approaches for cell segmentation in histological im-
ages (Feng et al., 2021), (Khuriwal and Mishra,
2018), (Zhao et al., 2020). Traditional convolu-
tional neural networks (CNNs), such as U-Net (Ron-
neberger et al., 2015), have been widely used for this
task, demonstrating promising results in various ap-
plications (Ibtehaz and Rahman, 2020). However,
to improve segmentation accuracy, especially for
complex cellular structures, integrating self-attention
mechanisms can be beneficial for capturing long-
range dependencies in images.
Despite the success of these methods, prominent
models like Cellpose (Stringer et al., 2020) and Om-
niPose (Cutler et al., 2022) have introduced advanced
architectures specifically designed for cell segmenta-
tion. Cellpose emphasizes generalizability through
pre-trained models and dynamic scale adaptation but
lacks explicit mechanisms for handling overlapping
nuclei. OmniPose extends Cellpose by incorporating
scale-invariant features and improving instance seg-
mentation; however, both models face challenges in
precise boundary delineation and struggle with dense
histological data. These limitations underscore the
need for a multi-branch approach that integrates seg-
mentation and boundary detection.
In the following, we will focus on the state-of-
the-art of multi-branch approaches for histological
image segmentation and classification. These tech-
niques involve combining multiple branches or path-
ways within the neural network architecture to exploit
different types of information, such as cell segmenta-
tion and boundary detection. In doing so, we aim to
improve the accuracy and robustness of our model in
cell segmentation and classification from histopatho-
logical images.
Many techniques using a multi-branch decoder strategy have evolved, such as HoVer-Net (Graham et al., 2019), which classifies nucleus types in different organs and overcomes class imbalance through a ResNet50 encoder and three independent decoders based on densely connected networks: a nuclear pixel branch, a horizontal-vertical (HoVer) map branch, and a nuclear classification branch (i.e., instance segmentation). Also, the authors
in HookNet (van Rijthoven et al., 2020) have pro-
posed a model for multi-class tissue segmentation for
breast cancer, using context and target branches to im-
prove detail extraction using a U-Net backbone and a
decoder involving 2x2 nearest neighbor scaling fol-
lowed by convolutional layers. This architecture al-
lows the integration of context branch information
into the target branch, facilitating multi-resolution
representations at different depths, achieved by con-
catenating relevant features between branches in the
decoder sections through a hooking mechanism. An architecture with a single encoder and three parallel decoders for mask prediction, contour prediction, and distance-map estimation, used to refine mask boundaries, has also been proposed. An approach that applies multi-resolution deconvolution filters to segment various object types (nuclei, cells, and glands) was detailed in (Raza et al., 2019). This approach was applied with
different image staining modalities: fluorescence and
also Hematoxylin and Eosin (H&E) while adjusting
the input parameters to create a unified framework for
segmentation of various object types. PointNu-Net
(Yao et al., 2024) is a multi-branch model that detects,
segments, and classifies nuclei from histopathology
data by predicting keypoints at the center of each nu-
cleus. It utilizes dynamic convolution for instance
segmentation, combining outputs from feature and
kernel branches. The Joint Pyramid Fusion Mod-
ule improves feature aggregation for multi-scale data,
eliminating the need for post-processing and non-
maximum suppression (NMS). Although these meth-
ods demonstrate substantial advancements, they of-
ten lack a unified approach to simultaneously ad-
dress segmentation, boundary detection, and classi-
fication. By comparing our proposed SBC-UNet3+
with these models, we aim to showcase how integrat-
ing boundary-sensitive features enhances segmenta-
tion robustness, particularly in dense histological im-
ages. Table 1 presents a comparative study of the different works in the literature that have proposed a deep multi-branch architecture for the segmentation and classification of histological images. By carrying out a comprehensive evaluation of their performance, we aim to identify the most suitable model for accurate and efficient cancer detection.
3 PROPOSED APPROACH
Our proposed approach is part of a larger project
aimed at advancing automated cancer detection
through segmentation of cell instances in histological
images. The main objective is to identify the most ef-
fective deep learning model for accurate cell segmen-
tation and classification, thereby enhancing the accu-
racy and efficiency of cancer diagnosis. Our approach
involves training and evaluating the UNET3+ deep
Table 1: Summary of Nucleus Segmentation and Classification Related Works and Their Performance.

CPP-Net (Chen et al., 2023)
  Description: Nucleus segmentation using multi-point sampling to enhance contextual information and predictions, incorporating a Confidence-Based Weighting Module to fuse predictions and introducing a novel Shape-Aware Perceptual loss.
  Databases: DSB2018; BBBC06; PanNuke dataset.
  Performance: mPQ: 0.48; bPQ: 0.68.

HDA-Net (Im et al., 2024)
  Description: Dual-encoder architecture, incorporating H&E-and-residual (HER) and RGB streams, to extract and combine color-invariant and discriminative features using a Hierarchical Dual Attention (HDA) module and a Single-Source Attention Module (SAM) to enhance feature representation.
  Databases: CoNSeP; PanNuke dataset; Kumar.
  Performance: AJI: 69.60; Dice: 82.03; Recall: 82.84; Precision: 81.31.

CellViT (Hörst et al., 2024)
  Description: Cell segmentation using a U-Net-like architecture with a Vision Transformer (ViT) backbone and multi-branch decoder. The forward method performs the predictions for tissue types, binary cell segmentation, horizontal and vertical distance maps, and nuclei type predictions for nuclei classification.
  Databases: PanNuke dataset.
  Performance: mPQ: 0.51; F1-score: 0.83.

TSFD-Net (Ilyas et al., 2022)
  Description: Utilizes Tissue-Specific Feature Distillation (TSFD) to optimize nuclei features, with multiple decoders enabling context-aware predictions.
  Databases: PanNuke dataset.
  Performance: mPQ: 0.504; bPQ: 0.6377.

HoVer-Net (Graham et al., 2019)
  Description: U-Net-like architecture with a ResNet50 backbone and 3 independent decoders based on densely connected networks.
  Databases: Kumar; ConSeP; CPM-15; CPM-17; TNBC datasets.
  Performance: mPQ: 0.4629; bPQ: 0.6596.

Micro-Net (Raza et al., 2019)
  Description: Encoder with 4 branches of convolution, max-pooling, resizing, and concatenation layers. The network consists of five groups and thirteen branches, processing input from membrane and nuclear marker images.
  Databases: Multiplexed Fluorescence Imaging Data; CPM Data.
  Performance: Dice: 82.43%; F1-Score: 71.79%; OD: 74.12%; Acc: 83.53%; OH: 27.53.

HookNet (van Rijthoven et al., 2020)
  Description: Uses a U-Net backbone encoder and a modified decoder to extract contextual features (context branch) and fine details (target branch) from input patches, aligning feature maps via concatenation, with a hooking mechanism integrating context into the target branch for enhanced segmentation.
  Databases: TCGA.
  Performance: not reported.

PointNu-Net (Yao et al., 2024)
  Description: Dynamic convolution for instance segmentation, while feature and kernel branches collaboratively generate instance segmentations, enhanced by a Joint Pyramid Fusion Module.
  Databases: PanNuke dataset.
  Performance: mPQ: 0.4957; bPQ: 0.6808.

Abbreviations: Aggregated Jaccard Index (AJI); Multi-class Panoptic Quality (mPQ); Binary Panoptic Quality (bPQ); Dice coefficient (Dice); Intersection over Union (IoU); Object Dice (OD); Pixel Accuracy (Acc); Object Hausdorff (OH).
learning architecture on the PanNuke dataset, known
for its diversity and size in nucleus segmentation and
classification. In order to prepare the dataset, we
implement preprocessing techniques to improve im-
age quality and ensure compatibility with model input
requirements, including checking neighboring pixels
and creating binary boundary images to extract rele-
vant features for segmentation and classification. The
model is trained using a supervised learning frame-
work, optimized for accurate cell segmentation and
classification. We also integrate boundary detection
and classification branches to enhance the visual dis-
tinction and accuracy of segmented cells. By lever-
aging the capabilities of the full-scale connected UNet (UNET3+), which incorporates full-scale skip connections, we aim to efficiently capture fine details and coarse-grained semantics, addressing issues such as over-segmentation and false positives (Kumar et al., 2020), (Rizzo et al., 2022), (Feng et al., 2021). A block diagram (Figure 1) summarizes the different stages: data preprocessing, model training, segmentation and classification, and visualization and analysis.
By integrating advanced preprocessing and bound-
ary detection techniques, our approach aims to sig-
nificantly contribute to the advancement of automated
cancer detection, thereby improving patient outcomes
and healthcare delivery.
3.1 The Main Contributions
Our methodology represents a multi-task learning ap-
proach for cell segmentation, boundary detection, and
classification using a deep learning architecture. The
pipeline starts with an input histological image, preprocessed by applying hand-crafted techniques to extract low-level features and prepare our dataset, which is then processed through an encoder-decoder-like
network. The encoder extracts features at multi-
ple scales, followed by full-scale skip connections to
transfer information across the network. Two par-
allel decoders, one for boundary detection and one
for segmentation, generate respective outputs. Both
branches feed into a classification-guided module
SBC-UNet3+: Classification of Nuclei in Histology Imaging Based on Multi Branch UNET3+ Segmentation Model
603
(CGM) that improves the final cell classification out-
put. We can summarize our contribution as follows:
3.1.1 Histological Images Preprocessing
Histological images contain various features that need
extraction for analysis. Our approach applies prepro-
cessing to focus on cell regions of interest, avoiding
the time-consuming task of processing the entire im-
age. A preprocessing algorithm generates binary cell
contours from a multi-channel mask by identifying
boundary pixels. It checks neighboring pixels for dif-
ferent labels to detect boundaries, labeling the corre-
sponding pixel in the binary output. This highlights
cell boundaries, aiding the detection and analysis of
cell shapes in subsequent steps.
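As an illustration, a minimal sketch of this neighbor-comparison step is given below (assuming the multi-channel PanNuke mask has already been collapsed into a single instance label map; the function name and layout are hypothetical, not the exact implementation used in our pipeline):

```python
import numpy as np

def extract_boundaries(label_map: np.ndarray) -> np.ndarray:
    """Mark a pixel as boundary when any 4-connected neighbor has a different label."""
    boundary = np.zeros(label_map.shape, dtype=bool)
    # Differences between horizontally and vertically adjacent pixels.
    diff_right = label_map[:, :-1] != label_map[:, 1:]
    diff_down = label_map[:-1, :] != label_map[1:, :]
    # Both pixels on either side of a label change are treated as boundary pixels.
    boundary[:, :-1] |= diff_right
    boundary[:, 1:] |= diff_right
    boundary[:-1, :] |= diff_down
    boundary[1:, :] |= diff_down
    return boundary.astype(np.uint8)  # binary boundary image: 1 on edges, 0 elsewhere
```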
3.1.2 Segmentation Branches
The segmentation part of our approach aims to use
the pre-processed boundaries and ground truth gener-
ated from the PanNuke Dataset to extract high-level
features and facilitate feature extraction and cancer
cell detection. For this, our segmentation model has
a multi-branch structure for multi-class segmentation
and boundary detection as detailed below:
- Multi-Class Cells Segmentation Branch. This branch aims to predict whether each pixel belongs to a cell or to the background. This is a fundamental process for the classification task of the next step.
- Binary Boundaries Detection Branch. The boundary detection module is responsible for identifying edges or boundaries in cell image masks. It identifies edges in multi-channel masks by comparing each pixel to its neighboring pixels. This process creates a binary boundary image that highlights the edges, allowing further analysis and visualization of cell boundaries. The key technique used is neighbor comparison, which ensures that boundaries are accurately detected based on differences in pixel labels. A post-processing step is applied to the boundary detection branch by integrating Gaussian smoothing to refine the boundary predictions. This architecture also integrates the Convolutional Block Attention Module (CBAM) into the decoders of our model to improve feature representation. CBAM uses both channel-wise and spatial attention mechanisms: channel-wise attention recalibrates feature maps by emphasizing important channels, while spatial attention focuses on relevant spatial regions. By incorporating CBAM, the decoders can more effectively highlight important features and suppress less important ones, which improves performance in segmentation tasks. This adaptive attention mechanism helps the model capture intricate details and improves overall accuracy.
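For illustration, a minimal CBAM block of the kind described here is sketched below, following the standard channel-then-spatial attention formulation; the layer sizes and names are ours and not necessarily those used in SBC-UNet3+:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed by
    spatial attention, applied to a decoder feature map."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: convolution over channel-wise average and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        channel_att = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        x = x * channel_att                                  # recalibrate channels
        spatial_in = torch.cat([x.mean(dim=1, keepdim=True),
                                x.amax(dim=1, keepdim=True)], dim=1)
        spatial_att = torch.sigmoid(self.spatial(spatial_in))
        return x * spatial_att                               # emphasize spatial regions
```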
For the evaluation of our multi-branch model, a custom loss function has been implemented. The proposed network design includes two distinct sets of weights: λ_seg for the segmentation loss and λ_bound for the boundary loss, which refer, respectively, to the weights of the segmentation branch decoder and the boundary detection branch decoder. These weights are jointly optimized using the total loss L defined as:

L_total = λ_seg · L_seg + λ_bound · L_bound

where L_seg is the loss for the segmentation branch, L_bound is the loss for the boundary detection branch, and λ_seg and λ_bound are the weighting factors for each branch.
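As a sketch, this weighted combination could be implemented as follows (assuming the segmentation head outputs 6-class logits and the boundary head a single-channel logit map; the class and variable names are ours, and the default weights correspond to the empirically selected values reported below):

```python
import torch
import torch.nn as nn

class MultiBranchLoss(nn.Module):
    """Weighted sum of a multi-class segmentation loss (Cross-Entropy) and a
    binary boundary loss (Binary Cross-Entropy)."""

    def __init__(self, lambda_seg: float = 0.7, lambda_bound: float = 2.0):
        super().__init__()
        self.lambda_seg = lambda_seg
        self.lambda_bound = lambda_bound
        self.seg_loss = nn.CrossEntropyLoss()        # L_seg
        self.bound_loss = nn.BCEWithLogitsLoss()     # L_bound

    def forward(self, seg_logits, seg_target, bound_logits, bound_target):
        # seg_logits: (B, 6, H, W); seg_target: (B, H, W) class indices
        # bound_logits / bound_target: (B, 1, H, W) boundary maps
        l_seg = self.seg_loss(seg_logits, seg_target)
        l_bound = self.bound_loss(bound_logits, bound_target.float())
        return self.lambda_seg * l_seg + self.lambda_bound * l_bound
```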
We have chosen to use two different loss functions at the output of each branch for superior overall performance, and we applied custom parameters to this loss function. Specifically, we set the boundary detection loss coefficient λ_bound to 2 and the segmentation branch scalar λ_seg to 0.7, based on empirical selection, as presented in Algorithm 1 below. The algorithm performs a grid search to select the best weights for the segmentation and boundary detection branches of our multi-branch cell segmentation model based on the full-scale UNET3+. The process begins by defining a grid of possible weights for the segmentation loss (λ_seg) and the boundary loss (λ_bound).
These weights are combined to form a parameter grid
P, which includes all possible pairs of these weights.
For each pair of weights in the grid, the algorithm
calculates the total validation loss by first comput-
ing the individual segmentation and boundary losses
in the validation dataset. The total validation loss
is then computed as a weighted sum of these two
losses, using the current weights pair. The algorithm
tracks the smallest total validation loss encountered
during the grid search. If a new weight combina-
tion results in a lower total validation loss than pre-
viously encountered, the new value is stored as the
best weight pair. To compute the final loss, we adopt
a weighted sum method which combine the individ-
ual losses from each branch; multiclass segmentation
(Cross-Entropy) and binary boundary detection (Bi-
nary Cross- Entropy). The choice of weighted sum
allows us to manually control the relative contribu-
tion of each task to the overall model optimization,
ensuring a balanced learning process. This flexibility
is crucial for multitask learning, where tasks such as
segmentation and boundary detection have different
Data:
    P: grid of parameters for the segmentation (λ_seg) and boundary (λ_bound) loss weights.
    S: predicted segmentation outputs.
    F: predicted boundary outputs.
    M: ground-truth segmentation masks.
    B: ground-truth boundaries.
Result:
    λ_best = (λ_seg_best, λ_bound_best): best weights for the segmentation and boundary losses.
Initialization:
    λ_best ← None; L_best ← ∞
    Define the weight grid P as follows:
        λ_seg_values = [0.3, 0.5, 0.7, 1.0, 1.5, 2.0]
        λ_bound_values = [0.3, 0.5, 0.7, 1.0, 1.5, 2.0]
        Construct P by combining all pairs (λ_seg, λ_bound) from λ_seg_values and λ_bound_values.
Begin Algorithm:
for each p in P do
    λ_seg ← p['weight segmentation loss']
    λ_bound ← p['weight boundary loss']
    Calculate the segmentation loss across all samples:
        L_seg ← (1 / len(S)) Σ_{i=1}^{len(S)} L_seg(S[i], M[i])
    Calculate the boundary loss across all samples:
        L_bound ← (1 / len(F)) Σ_{i=1}^{len(F)} L_bound(F[i], B[i])
    Calculate the total weighted loss:
        L_total ← λ_seg × L_seg + λ_bound × L_bound
    if L_total < L_best then
        L_best ← L_total
        λ_best ← (λ_seg, λ_bound)
    end
end
return λ_best
End Algorithm
Algorithm 1: Weight Selection using Grid Search.
scales and difficulty levels. By adjusting the weights,
we can prioritize one task over another, aligning the
training process with the specific needs of our prob-
lem. Other alternatives to the weighted sum approach
include using dynamic weighting strategies, such as
uncertainty weighting or loss normalization, where
the weights evolve during training based on the un-
certainty of the task or the magnitude of the loss.
While these methods can be effective in automatically
balancing tasks, they introduce additional complex-
ity to the model and may be less interpretable, which
can complicate tuning and analysis. Additionally, dy-
namic weighting does not always guarantee the de-
sired balance between tasks, which can lead to sub-
optimal performance in one or more branches if task-
specific losses are highly imbalanced. We opt for
a manually controlled weighted sum for its simplic-
ity, interpretability, and the direct control it provides
over the training dynamics. This personalized loss is
well suited to our application, where both tasks are
of equal importance and require fine-tuned inputs to
ensure optimal overall performance.
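A compact Python sketch of the grid search in Algorithm 1 is given below (the helper names are hypothetical; seg_loss_fn and bound_loss_fn are assumed to return per-sample validation losses):

```python
from itertools import product

def select_loss_weights(seg_outputs, seg_masks, bound_outputs, bound_targets,
                        seg_loss_fn, bound_loss_fn):
    """Grid search over (lambda_seg, lambda_bound), following Algorithm 1."""
    weight_values = [0.3, 0.5, 0.7, 1.0, 1.5, 2.0]
    best_weights, best_total = None, float("inf")

    for lam_seg, lam_bound in product(weight_values, weight_values):
        # Average the per-sample losses over the validation set.
        l_seg = sum(seg_loss_fn(s, m) for s, m in zip(seg_outputs, seg_masks)) / len(seg_outputs)
        l_bound = sum(bound_loss_fn(f, b) for f, b in zip(bound_outputs, bound_targets)) / len(bound_outputs)
        total = lam_seg * l_seg + lam_bound * l_bound
        if total < best_total:
            best_total, best_weights = total, (lam_seg, lam_bound)
    return best_weights
```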
3.1.3 Cells-Classification Branch
The segmentation and boundary detection maps are
used as input features for the classification branch as
they provide detailed spatial information about differ-
ent regions of the image. The Classification Guided
Module (CGM) takes the segmentation maps and pro-
cesses them to create a fused feature representation.
This module can include operations such as convolu-
tion, attention mechanisms, and concatenation to effi-
ciently combine the segmentation features and bound-
ary detection features as presented in Figure 1.
The classification branch of the model is designed
to exploit the rich spatial information provided by the
segmentation and boundary detection results from the
previous segmentation step. Thus, the input features
provide detailed spatial information about different
regions of the image, which is essential for accurate
classification.
The CGM takes these segmentation maps and pro-
cesses them to create a fused feature representation.
Indeed, this module uses several operations: Convolutional Layers (Conv2d), Batch Normalization (BatchNorm2d), ReLU Activation and concatenation layers, through which the CGM combines the features from the segmentation and boundary detection maps, creating a rich and informative feature set for the final classification task. Finally, once the fusion has been performed, the features are passed to additional layers that refine them further. The result is then used to classify the cells into different categories, thus providing the final classification result.
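A minimal sketch of such a fusion module is given below (the layer counts and channel sizes are hypothetical; the actual CGM may stack a different number of layers):

```python
import torch
import torch.nn as nn

class ClassificationGuidedModule(nn.Module):
    """Fuses segmentation and boundary feature maps (concatenation followed by
    Conv2d + BatchNorm2d + ReLU) and predicts per-pixel cell categories."""

    def __init__(self, seg_channels: int = 6, bound_channels: int = 1,
                 hidden: int = 64, num_classes: int = 6):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(seg_channels + bound_channels, hidden, kernel_size=3, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Conv2d(hidden, num_classes, kernel_size=1)

    def forward(self, seg_map: torch.Tensor, bound_map: torch.Tensor) -> torch.Tensor:
        # Concatenate the two branch outputs along the channel dimension.
        fused = self.fuse(torch.cat([seg_map, bound_map], dim=1))
        return self.classifier(fused)
```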
In summary, the proposed architecture is defined as a multi-task learning model for cell segmentation and boundary detection built on the UNET3+ backbone, combined with a CNN-based classification-guided module (CGM). The complete architecture is presented in the diagram below (Figure 1).
4 DATA AND EXPERIMENTAL
RESULTS
In this section, we evaluate the performance of our proposed approach on the PanNuke dataset.
Models were implemented using PyTorch 1.8.1
and trained on an NVIDIA TITAN XP GPU. The loss
functions used were Cross Entropy (CE) for the seg-
mentation branch and Binary Cross Entropy (BCE)
for the boundary detection branch. Training was per-
formed with a batch size of 32 over 150 epochs, us-
ing the Adam optimizer with a learning rate of 0.001.
The models were designed to handle 6 classes for seg-
mentation and 1 class for boundary detection, with
input images resized to 256x256 pixels. Large-scale
images used in advanced analysis can present unique
challenges due to their complex structure and large
Figure 1: Representation of the multi-task architecture for multi-class segmentation, boundary detection, and classification of
cells in histological images.
data volume. The composition of these images of-
ten spans across six distinct dimensions, incorporat-
ing layers that contain complex information. In some
cases, these large images contain multi-scene data,
where each scene comprises a series of images repre-
senting different time points. This multidimensional-
ity increases the computational load, leading to longer
training times and slower model inference. To effi-
ciently process such complex images, deep learning
models must be carefully designed and optimized to
handle the large data requirements and complexities
inherent in compositing images in six dimensions.
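For concreteness, the training configuration reported at the beginning of this section can be sketched as follows (the model and data-loader objects are hypothetical; only the stated hyperparameters are taken from the text):

```python
import torch
import torch.nn as nn

# Hyperparameters reported above (batch size and image size are used when
# building the DataLoader and resizing the inputs).
NUM_SEG_CLASSES = 6
IMAGE_SIZE = 256
BATCH_SIZE = 32
EPOCHS = 150
LEARNING_RATE = 1e-3

def train(model, train_loader, lambda_seg=0.7, lambda_bound=2.0, device="cuda"):
    """Training-loop sketch: Adam optimizer, CE for the segmentation branch and
    BCE for the boundary branch, combined with the weights from Section 3.1."""
    seg_criterion = nn.CrossEntropyLoss()
    bound_criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

    model.to(device).train()
    for epoch in range(EPOCHS):
        for images, seg_masks, bound_masks in train_loader:
            images = images.to(device)
            seg_masks = seg_masks.to(device)
            bound_masks = bound_masks.to(device)
            seg_logits, bound_logits = model(images)  # assumed two-headed output
            loss = (lambda_seg * seg_criterion(seg_logits, seg_masks)
                    + lambda_bound * bound_criterion(bound_logits, bound_masks.float()))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```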
4.1 The Used Database
To effectively evaluate and compare our model, we
used the PanNuke Dataset, a semi-automatically cu-
rated dataset specifically designed for nucleus in-
stance segmentation and classification. This dataset
offers comprehensive nuclei annotations across 19 tis-
sue types, including adrenal gland, bile-duct, bladder,
breast, cervix, colon, esophagus, headneck, kidney,
liver, lung, ovarian, pancreatic, prostate, skin, stom-
ach, testis, thyroid, and uterus. By leveraging this
dataset, we can ensure that our models are exposed to
a wide variety of tissue types and nuclei appearances,
enabling robust training and validation.
Figure 2: Illustration of various histological images from
PanNuke dataset (256x256 shaped samples).
4.2 Quantitative Evaluation
During evaluation, we assess the model’s perfor-
mance using metrics like pixel accuracy, IoU, mPQ,
and bPQ to evaluate its effectiveness in cell segmen-
tation. Segmentation results are visualized for quali-
tative analysis to identify potential issues. The choice
of metrics depends on the task’s requirements and
dataset characteristics.
Metrics should handle multiple classes and pro-
vide insights both individually and collectively. mPQ
evaluates multi-class segmentation performance by
averaging individual class bPQ values, while bPQ is
used for binary segmentation tasks (e.g., object vs.
background).
While mPQ and bPQ are useful for instance seg-
mentation, metrics like IoU, Dice coefficient, preci-
sion, recall, and mIoU can be adapted for multi-class
semantic segmentation. IoU measures model accu-
racy by calculating the ratio of intersection and union
of the predicted and ground truth masks.
mPQ = (1 / |C|) Σ_{c ∈ C} TP_c / (TP_c + 0.5 × (FP_c + FN_c));   bPQ = TP / (TP + 0.5 × (FP + FN))
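A direct sketch of these formulas as stated above is given below (how true positives, false positives, and false negatives are matched between predicted and ground-truth instances is assumed to be handled separately):

```python
def bpq(tp: int, fp: int, fn: int) -> float:
    """Binary score following the bPQ formula stated above."""
    denom = tp + 0.5 * (fp + fn)
    return tp / denom if denom > 0 else 0.0

def mpq(per_class_counts: dict) -> float:
    """Mean over classes of the per-class score (mPQ formula stated above).

    per_class_counts maps a class name to its (TP, FP, FN) counts."""
    scores = [bpq(tp, fp, fn) for tp, fp, fn in per_class_counts.values()]
    return sum(scores) / len(scores) if scores else 0.0
```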
Table 2 presents our model’s performance on 19
tissue types for cell segmentation and boundary de-
tection. It achieved a high mean accuracy of 89.61%
and F1-Score of 89.19%. Pancreatic tissue showed
the highest accuracy (95.18%), while Prostate tissue
had the lowest (80.31%). The mean IoU (83.16%)
and Recall (85.94%) demonstrate the model’s effec-
tiveness in segmenting nuclei. With a low average
segmentation loss (0.29) and an mPQ score of 42.62,
the method performs well across various tissue types.
The classification branch (Table 2) shows robust
performance with an average accuracy of 85.86%.
The highest accuracy is in pancreatic tissue (95.06%),
while Cervix tissue has the lowest (73.51%), suggest-
ing challenges in handling its specific features.
Table 2: Cells Segmentation, Boundary detection and classification results.
Cell types (columns, left to right): Adrenal-gland, Bile-duct, Bladder, Breast, Cervix, Colon, Esophagus, HeadNeck, Kidney, Liver, Lung, Ovarian, Pancreatic, Prostate, Skin, Stomach, Testis, Thyroid; the last value in each row is the mean over all tissue types.
Multi-branch Accuracy 94.25 93.20 85.08 90.87 88.63 83.04 92.21 87.76 92.10 92.88 89.00 88.97 95.18 80.31 90.79 91.30 93.19 88.78 89.61
Multi-branch Loss 0.16 0.21 0.67 0.22 0.35 0.40 0.23 0.38 0.23 0.22 0.29 0.29 0.14 0.27 0.50 0.25 0.24 0.20 0.29
Mean IoU 90.40 88.19 74.65 84.91 80.83 73.49 85.56 80.97 86.25 87.89 81.16 80.84 91.91 84.67 68.65 84.82 85.77 88.04 83.16
F1-Score 93.99 92.18 82.94 90.12 86.71 83.40 92.21 86.54 91.24 92.56 88.26 88.59 95.18 91.08 78.57 90.77 92.55 87.25 89.29
Recall 94.25 93.20 85.08 90.87 88.63 83.04 92.21 87.76 92.10 92.88 88.99 88.98 95.18 91.01 80.31 90.79 91.30 93.19 89.99
mPQ 44.7 42.49 32.74 48.78 31.49 53.19 44.68 35.99 38.8 46.30 41.91 57.05 37.23 35.16 31.24 49.49 54.86 44.92 42.62
bPQ 57.30 55.75 57.61 41.31 50.64 33.14 50.59 48.72 50.39 57.23 54.77 50.15 57.48 44.47 28.89 59.22 55.17 52.08 56.33
Classification Accuracy 94.12 91.86 79.49 85.39 73.51 74.99 87.01 85.53 88.32 90.73 87.64 82.18 95.06 86.46 76.00 86.74 87.05 92.22 85.86
F1-Score 93.04 89.54 71.89 80.77 74.68 66.24 84.50 81.27 85.33 88.28 85.69 79.59 94.37 86.20 70.21 83.06 83.42 89.95 82.74
Recall 94.12 91.87 79.49 85.39 73.52 74.99 87.01 85.53 88.32 90.73 87.64 82.18 95.06 86.46 76.00 86.74 87.05 92.22 81.96
The F1-Score and recall measures align with ac-
curacy, confirming the model’s reliability in identi-
fying and classifying cells. However, lower perfor-
mance in tissues like skin indicates areas for model
improvement. Overall, the classification branch inte-
grates segmentation and boundary detection features
for accurate, context-aware cell classification across
most tissue types.
4.3 Performance Comparison with
State-of-Art Approaches
We compare our approach to state-of-the-art models
(PSI-Net, HoVer-Net, and TSFD-Net) based on preci-
sion, F1-score, and recall, as shown in Tables 3 and 4.
Our model demonstrates superior generalization with
high median precision and a balanced distribution,
avoiding the overfitting seen in PSI-Net. It achieves
tighter and more consistent F1-scores than TSFD-Net
and higher recall than both PSI-Net and TSFD-Net,
while matching HoVer-Net’s performance. These re-
sults highlight the model’s robust and well-balanced
performance across key metrics.
Table 4 shows box plots visualizing the perfor-
mance metrics of each model across accuracy, F1-
score, and recall. Each box represents the interquar-
tile range (IQR), containing the middle 50% of the
data, with the median score indicated by the line in-
side the box. Whiskers extend up to 1.5 times the
IQR, with points outside representing outliers. The
median accuracy and F1-score are around 0.92, with
most values between 0.90 and 0.94, while recall has
a slightly lower median of 0.91, with scores ranging
from 0.89 to 0.93. A few outliers suggest variability
across cases. Overall, the model performs well across
metrics, with consistent results and potential for re-
finement in specific cases.
4.4 Qualitative Evaluation
Qualitative evaluation, shown in Figure 3, demon-
strates our approach’s effectiveness in segmenting
and classifying cell types in histological images.
The original images with ground truth and predicted
boundaries highlight the model’s ability to accurately
detect and delineate cell boundaries. The colored
boundaries indicate successful identification and clas-
sification of cell types, including neoplastic, non-
neoplastic epithelial, and inflammatory cells. The
model handles overlapping and irregularly shaped
cells with precision. The qualitative results validate
the model’s strength in providing accurate segmen-
tation for histological analysis and cancer diagnosis. In this figure, we present a qualitative compari-
son of our approach with several related works on
the PanNuke dataset. Subfigure (a) shows the origi-
nal histological images, while subfigure (b) illustrates
the pre-segmented ground truth for different tissue
types (e.g., Bile-duct, Prostate, Kidney). Subfigure
(c) displays the multi-class segmentation results pro-
duced by our approach, accurately differentiating cell
types and structures. Subfigure (d) represents the pre-
processed boundary masks we generated, followed by
subfigure (e), which shows the binary boundary seg-
mentation prediction for the respective tissues. Fi-
nally, subfigure (f) illustrates the final results after
applying the Classification Guided Module (CGM),
which fuses the segmentation and boundary detection
results and overlays colored contours on the classified
cells for visualization purposes. This comprehensive
illustration provides insights into how our method
outperforms traditional approaches, improving both
segmentation and boundary detection outcomes.
4.5 Results and Discussion
Our proposed approach SBC-UNet3+ has achieved
outstanding performance in key metrics compared
Table 3: Comparison with cutting-edge approaches on PanNuke dataset.
Ref. Accuracy mPQ bPQ Dice coefficient Mean IoU Recall F1-Score
(Ilyas et al., 2022) TSFD-Net (2022) 84.76 52.67 44.0 70.0 34.08 84.76 78.19
(Murugesan et al., 2019) Psi-Net (2019) 85.05 34.76 83.56 54.23 42.15 47.71 49.55
(Graham et al., 2019) HoVer-Net (2019) 87.78 32.71 43.60 61.03 47.53 58.81 61.03
SBC-UNet3+ 89.61 42.62 50.91 99.08 83.16 85.94 89.19
Table 4: Box Plots of Model Performance Metrics: Accuracy, F1 Score, and Recall.
[Box-plot panels, left to right: PsiNet, HoVerNet, TSFD-Net, Our SBC-UNET3+]
Figure 3: Evaluation of our approach on the PanNuke Dataset compared to related works. (a) Input images, (b) Ground truth,
(c) Segmentation results, (d) preprocessed Boundary masks, (e) Binary boundary segmentation, (f) CGM results with colored
contours, (g) HoverNet results (Graham et al., 2019), (h) Psi-Net results (Murugesan et al., 2019), (i) and (j) TSFD-Net
segmentation and boundary results (Ilyas et al., 2022).
to existing state-of-the-art approaches. Indeed, it achieves a high recall of 85.94% and an F1-score of 89.19%, which are superior to those of
HoVer-Net, a leading approach in the field. Moreover,
the accuracy of our model reaches 89.61%, proving
its robust ability to correctly identify and segment
cells. Notably, the dice coefficient of 99.08% and
IoU of 83.16% highlight the accuracy of our boundary
delineation and segmentation processes. However,
our model produces lower mPQ and bPQ values of
42.62% and 50.91%, respectively, compared to other
metrics, indicating a limitation in using these panop-
tic metrics for evaluation in this specific domain.
Indeed, histological cell segmentation involves
unique challenges, such as the need for precise bound-
ary delineation, handling overlapping cells, and dis-
tinguishing subtle class differences, making panoptic
quality less suitable as an evaluation metric. Met-
rics such as dice coefficient, boundary-specific IoU,
or other boundary-specific measures are often more
appropriate to capture the nuances needed for histo-
logical analysis. In histological cell segmentation, in-
dividual cells can be very small, irregularly shaped,
and often overlap, creating complex and ambiguous
boundaries. This complexity challenges the assump-
tion of distinct and well-separated objects inherent in
the PQ metric.
The fact that PQ relies on IoU rather than direct
boundary accuracy makes it less sensitive to the types
of errors that are critical in histology. For instance, a
small error in boundary detection could cause a sub-
stantial drop in IoU and, consequently, PQ scores.
More importantly, such errors can lead to incorrect
biological interpretations, undermining the reliability
of the analysis. Therefore, while PQ is a robust met-
ric for certain applications, its limitations in handling
overlapping and closely packed instances with subtle
class differences make it less suitable for evaluating
histological cell segmentation models. Our approach
addresses these challenges effectively, as evidenced
by the superior performance in boundary-sensitive
metrics, demonstrating its suitability and robustness
for this specialized task.
5 CONCLUSION AND FUTURE
WORK
In conclusion, we introduced SBC-UNet3+, a cell
segmentation and classification model for histologi-
cal images. Our model surpasses existing methods in
segmentation, boundary detection, and classification
by utilizing full-scale skip connections and Convolu-
tional Block Attention Module (CBAM) mechanisms,
ensuring accurate segmentation and enhanced bound-
ary delineation. This is crucial for capturing morpho-
logical details and differentiating overlapping cells,
which is vital for histopathological diagnosis. Future
work will explore integrating graph-based techniques
to improve tissue analysis, using probabilistic models
to refine graph accuracy and feature representation,
which could offer deeper insights into tissue structure
and phenotypic relationships, advancing medical im-
age analysis and cancer diagnosis.
REFERENCES
Chen, K., Zhang, N., Powers, L., and Roveda, J. (2019).
Cell nuclei detection and segmentation for computa-
tional pathology using deep learning. In 2019 Spring
Simulation Conference (SpringSim), pages 1–6.
Chen, S., Ding, C., Liu, M., Cheng, J., and Tao, D. (2023).
Cpp-net: Context-aware polygon proposal network
for nucleus segmentation. IEEE Transactions on Im-
age Processing, 32:980–994.
Cutler, K. J., Stringer, C., Wiggins, P. A., and
Mougous, J. D. (2022). Omnipose: a high-precision
morphology-independent solution for bacterial cell
segmentation. bioRxiv.
Feng, Z., Wang, Z., Wang, X., Mao, Y., Li, T., Lei, J., Wang,
Y., and Song, M. (2021). Mutual-complementing
framework for nuclei detection and segmentation in
pathology image. In 2021 IEEE/CVF International
Conference on Computer Vision (ICCV), pages 4016–
4025.
Gamper, J., Koohbanani, N. A., Benes, K., Graham, S., Ja-
hanifar, M., Khurram, S. A., Azam, A., Hewitt, K.,
and Rajpoot, N. (2020). Pannuke dataset extension,
insights and baselines.
Graham, S., Vu, Q. D., Raza, S. E. A., Azam, A., Tsang,
Y. W., Kwak, J. T., and Rajpoot, N. (2019). Hover-
net: Simultaneous segmentation and classification of
nuclei in multi-tissue histology images. Medical Im-
age Analysis, 58:101563.
Hörst, F., Rempe, M., Heine, L., Seibold, C., Keyl, J., Baldini, G., Ugurel, S., Siveke, J., Grünwald, B., Egger, J., and Kleesiek, J. (2024). Cellvit: Vision transformers for precise cell segmentation and classification. Medical Image Analysis, 94:103143.
Ibtehaz, N. and Rahman, M. S. (2020). Multiresunet: Rethinking the u-net architecture for multimodal biomedical image segmentation. Neural Networks, 121:74–87.
Ilyas, T., Mannan, Z. I., Khan, A., Azam, S., Kim, H., and
De Boer, F. (2022). Tsfd-net: Tissue specific feature
distillation network for nuclei segmentation and clas-
sification. Neural Networks, 151:1–15.
Im, Y.-H., Park, S.-H., and Lee, S.-C. (2024). Hda-net: H&e
and rgb dual attention network for nuclei instance seg-
mentation. IEEE Access, 12:56622–56632.
Khuriwal, N. and Mishra, N. (2018). Breast cancer detec-
tion from histopathological images using deep learn-
ing. In 2018 3rd International Conference and Work-
shops on Recent Advances and Innovations in Engi-
neering (ICRAIE), pages 1–4.
Kumar, N., Gupta, R., and Gupta, S. (2020). Whole slide
imaging (wsi) in pathology: Current perspectives and
future directions. Journal of Digital Imaging, 33.
Murugesan, B., Sarveswaran, K., Shankaranarayana, S. M.,
Ram, K., and Sivaprakasam, M. (2019). Psi-net:
Shape and boundary aware joint multi-task deep
network for medical image segmentation. CoRR,
abs/1902.04099.
Raza, S. E. A., Cheung, L., Shaban, M., Graham, S., Ep-
stein, D., Pelengaris, S., Khan, M., and Rajpoot, N. M.
(2019). Micro-net: A unified model for segmentation
of various objects in microscopy images. Medical Im-
age Analysis, 52:160–173.
Rizzo, P., Girolami, I., Marletta, S., Pantanowitz, L., Antonini, P., Brunelli, M., Santonicco, N., Vacca, P., Tumino, N., Moretta, L., Parwani, A., Satturwar, S., Eccher, A., and Munari, E. (2022). Technical and diagnostic issues in whole slide imaging published validation studies. Frontiers in Oncology, 12.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), volume 9351 of LNCS, pages 234–241.
Stringer, C., Michaelos, M., and Pachitariu, M. (2020).
Cellpose: a generalist algorithm for cellular segmen-
tation. bioRxiv.
van Rijthoven, M., Balkenhol, M., Siliņa, K., van der Laak,
J., and Ciompi, F. (2020). Hooknet: multi-resolution
convolutional neural networks for semantic segmenta-
tion in histopathology whole-slide images.
Yao, K., Huang, K., Sun, J., and Hussain, A. (2024).
Pointnu-net: Keypoint-assisted convolutional neural
network for simultaneous multi-tissue histology nu-
clei segmentation and classification. IEEE Transac-
tions on Emerging Topics in Computational Intelli-
gence, 8(1):802–813.
Zhao, Y., Yang, F., Fang, Y., Liu, H., Zhou, N., Zhang,
J., Sun, J., Yang, S., Menze, B., Fan, X., and Yao,
J. (2020). Predicting lymph node metastasis us-
ing histopathological images based on multiple in-
stance learning with deep graph convolution. In 2020
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 4836–4845.