metric object, 1 for the roughness, and 3 for a fraction
of incident light reflected from the surface. They used
StyleGAN 2 as the baseline architecture.
Specialized and modified GAN architectures can
have enough learning capacity to generate histological
data. PathologyGAN (Quiros et al., 2019) focuses
on generating realistic histological images.
The variability of the data is introduced by combining
different training datasets: H&E colorectal cancer tissue
from the National Cancer Center (NCT, Germany), and
H&E breast cancer tissue from the Netherlands Cancer
Institute (NKI, Netherlands) and Vancouver General
Hospital (VGH, Canada). In total, the training data
contain 86 whole-slide images and 576 tissue microarrays.
They used BigGAN as the underlying architecture, which
they augmented with a mapping network from StyleGAN,
style mixing regularization, and a relativistic average
loss function for the discriminator.
StyleGAN has also been used for prostate cancer (PCa)
data synthesis (Daroach et al., 2022). There, the main
focus is on the trained latent space of the StyleGAN,
used to label PCa regions according to pathologist
annotations. A pathologist attached a label to each
realistic-looking patch generated by the model. These
labels then defined regions in the latent space such
that noise sampled from a given region always generated
histology images of that region's class. The
StyleGAN-based solution is therefore able to synthesize
sample patches of specified prostate cancer classes.
However, it still required a pathologist to annotate
the generated patches without further medical
information about the sample, which may have
introduced errors.
This paper presents a StyleGAN-based solution
for selectively synthesizing epithelial cells, lympho-
cytes, macrophages, and neutrophils in the lungs,
prostate, kidney, and breast. The output of our model
is an RGB image together with a segmentation map
encoding the pixel positions and classes of the cells.
3 METHOD
The main goal of our method is to generate high-quality
histology images with associated multi-class cell
segmentation masks. GANs are currently a widely
applied deep learning framework well suited to this
problem. According to the related work, GANs can
generate non-stationary textures, medically valid
histological data, related maps, and annotated
segmentation masks. We base our generator
architecture on StyleGAN.
Initial data are necessary to train the generator, so
we chose the MoNuSAC dataset (Verma et al., 2020).
It contains TMA images with their annotated
segmentation masks. The dataset covers four cell types
relevant to diagnosing the stage and severity of lung,
prostate, breast, and kidney cancer. Each segmentation
mask carries information about the cell classes and
the organ. We use the dataset's original color coding
of the cells: red, yellow, green, and blue for
epithelial cells, lymphocytes, macrophages, and
neutrophils, respectively.
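As a concrete illustration of this color convention, the sketch below converts a color-coded mask into integer class ids for training. The function name and the exact RGB triplets are our illustrative assumptions; the actual MoNuSAC palette values may differ.

```python
import numpy as np

# Hypothetical color-to-class mapping following the convention above
# (red, yellow, green, blue); exact RGB values are assumptions.
CLASS_COLORS = {
    "epithelial": (255, 0, 0),    # red
    "lymphocyte": (255, 255, 0),  # yellow
    "macrophage": (0, 255, 0),    # green
    "neutrophil": (0, 0, 255),    # blue
}

def mask_to_class_ids(mask_rgb: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 color segmentation mask to HxW integer ids.

    Background (any unlisted color) maps to 0; the four cell classes
    map to 1..4 in the order listed in CLASS_COLORS.
    """
    ids = np.zeros(mask_rgb.shape[:2], dtype=np.int64)
    for i, color in enumerate(CLASS_COLORS.values(), start=1):
        ids[np.all(mask_rgb == color, axis=-1)] = i
    return ids
```

Integer ids in this form are what a cross-entropy segmentation loss typically expects.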
To validate the results and investigate the influence
of the synthetic data on segmentation training, we
employed the standard segmentation network U-Net.
3.1 Generative Model
The visual appearance of tissue depends on the organ,
so the synthesis method must preserve organ-specific
tissue characteristics. The standard input for a GAN
is sampled Gaussian noise, which carries no class
information. We therefore extend the StyleGAN
architecture with an idea from the Auxiliary Classifier
GAN (ACGAN) (Odena et al., 2017). The organ class is
global information that we represent by a one-hot
encoded vector, which conditions the generator on the
intended organ appearance. The cell classes, in
contrast, are specific to locations in the tissue. We
do not pre-set a segmentation mask defining the
locations of cells; we use only a one-hot encoded
vector specifying the expected classes the model
should generate. Forcing the generator to synthesize
only the specified classes is the job of the
discriminator. To preserve the input class
information, we modify the ACGAN approach and add the
cell and organ information to every 2n-th layer of the
StyleGAN mapping network, as shown in Figure 1. The
output of the mapping network is the style vector used
in adaptive instance normalization in the generator
layers. The generator architecture, random noise
vector, constant vector of ones, and blending alpha
values for progressive growing are the same as in the
original StyleGAN paper.
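The conditioning scheme can be sketched as follows: the organ and cell one-hot vectors are concatenated to the activations at every second (2n-th) layer of the mapping network. This is a minimal numpy sketch under assumed layer sizes; the function name, dimensions, and weight format are illustrative, not the authors' implementation.

```python
import numpy as np

def conditioned_mapping(z, organ_onehot, cell_onehot, weights, n_layers=8):
    """Sketch of a StyleGAN-like mapping network with ACGAN-style
    conditioning. `weights` is a list of (W, b) pairs, one per fully
    connected layer; the class vectors are injected at every 2n-th layer."""
    h = z
    for i in range(n_layers):
        if i % 2 == 0:  # inject organ and cell class information
            h = np.concatenate([h, organ_onehot, cell_onehot])
        W, b = weights[i]
        a = W @ h + b
        h = np.maximum(a, 0.2 * a)  # leaky ReLU activation
    return h  # style vector w, consumed by AdaIN in the synthesis layers
```

The returned style vector plays the same role as StyleGAN's w: it parameterizes the adaptive instance normalization in each generator block.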
The generator's output and the discriminator's input
is an image with six channels (2x RGB). The first
three channels contain the generated histological
image, and the last three contain the generated
segmentation mask of that image, using the dataset's
predefined color per class. We modify the
discriminator to force the generator to train
according to the input class information. The standard
discriminator task is to distinguish real from fake
images; we apply the improved Wasserstein loss to
reduce the chance of mode collapse. The discriminator
additionally requires two classifiers. One classifier
predicts the organ type, i.e., the class of the whole
tissue segment; its output layer is activated by
softmax and trained with multi-class categorical
cross-entropy.
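The combined discriminator objective described above can be sketched as a Wasserstein critic term plus the organ classifier's cross-entropy. The gradient penalty of the improved Wasserstein loss is omitted for brevity, and all names and weightings are our illustrative assumptions rather than the exact implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def discriminator_losses(critic_real, critic_fake, organ_logits, organ_target):
    """Sketch of the discriminator objective: a Wasserstein critic term
    (gradient penalty omitted) plus multi-class categorical cross-entropy
    for the auxiliary organ classifier head."""
    wasserstein = critic_fake.mean() - critic_real.mean()
    probs = softmax(organ_logits)
    organ_ce = -np.log(probs[organ_target] + 1e-12)
    return wasserstein + organ_ce
```

With a well-trained critic, the Wasserstein term is strongly negative (real patches score above fakes), while a confident organ head keeps the cross-entropy term near zero.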
Synthesis for Dataset Augmentation of HE Stained Images with Semantic Segmentation Masks