A fundamental problem in developing a screening
system is the vast area that needs to be analysed. A
regular PAP-smear covers an area of at least 25 x 50
mm. The resolution needed for determining the
malignancy of a cell leads to a pixel size of around 0.2
microns. This translates to about 31 billion pixels per
specimen. Handling this amount of data in a few minutes
poses serious challenges both on the initial scanning
side and on the subsequent data analysis side. One
way of improving the situation is to use a modified
technique for depositing the specimen on the slide.
So-called Liquid Based Preparations (LBP) typically
deposit the material in a circle with a diameter of 20
mm. This reduces the number of pixels to around 8
billion, still a substantial number, at the cost of a sub-
stantially more complex and costly slide preparation
procedure. For the final analysis of a cell to be reliable,
the cell has to be in perfect focus, and the algorithms that
extract the relevant features are typically quite elaborate
and thus time-consuming. Autofocus and complex
analysis algorithms thus make the automated screen-
ing problem even more challenging.
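The pixel counts quoted above follow directly from the stated areas and the 0.2 micron pixel size. A minimal sketch of the arithmetic (the function names are illustrative, not from the paper):

```python
import math

PIXEL_SIZE_UM = 0.2  # pixel size quoted in the text, in microns


def pixels_in_rectangle(width_mm, height_mm, pixel_um):
    """Number of square pixels needed to tile a width x height rectangle."""
    return (width_mm * 1000.0 / pixel_um) * (height_mm * 1000.0 / pixel_um)


def pixels_in_circle(diameter_mm, pixel_um):
    """Number of square pixels covering a circular deposit."""
    radius_um = diameter_mm * 1000.0 / 2.0
    return math.pi * radius_um ** 2 / pixel_um ** 2


smear_pixels = pixels_in_rectangle(25, 50, PIXEL_SIZE_UM)  # conventional smear
lbp_pixels = pixels_in_circle(20, PIXEL_SIZE_UM)           # LBP circle

print(f"smear: {smear_pixels / 1e9:.1f} billion pixels")
print(f"LBP:   {lbp_pixels / 1e9:.1f} billion pixels")
```

The rectangle gives roughly 31 billion pixels and the 20 mm circle roughly 8 billion, matching the figures in the text.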
One way to attack this difficulty is a two-stage
approach: an initial search phase for areas or
cells of interest followed by a detailed analysis of
the interesting regions. This approach was first sug-
gested and analyzed by Poulsen (Poulsen, 1973) and
later implemented in the Diascan system (Nordin,
1989). There have been huge improvements in scanning
and computer technology since the 1970s and 1980s,
when these projects were conducted, but the fundamental
problem remains. We thus need to find efficient
ways of determining where on the slide we should focus
our attention to reach a reliable decision about
whether the specimen is normal or possibly shows
some abnormalities.
The initial analysis can be conducted on fields of
view of lower resolution and with less stringent
requirements on focus. Whether these fields are
obtained by merging pixels or subsampling images
scanned at full resolution, or by a separate scan of
the specimen with different optics, is an issue that
requires a detailed technical and economic
analysis to find the best solution for a particular
setting. We will not discuss those issues further in this
paper. For the study in this paper we have worked
with images with a pixel size of 0.5 microns and with
a single rough focus setting. This represents between
one and two orders of magnitude less data than the
perfectly focused, high resolution images needed for
the final analysis.
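As a rough check on the data reduction, the change in pixel size from 0.2 to 0.5 microns alone gives a factor of about six. One plausible reading, which is an assumption here and not stated in the text, is that the remaining saving comes from using a single rough focus setting instead of several focus levels per field. The `focal_planes` parameter below is hypothetical:

```python
def data_reduction(fine_um, coarse_um, focal_planes=1):
    """Factor by which the pixel count shrinks from fine to coarse pixels.

    focal_planes is a hypothetical number of focus levels that a
    high-resolution pass would acquire per field (not given in the text).
    """
    return (coarse_um / fine_um) ** 2 * focal_planes


resolution_only = data_reduction(0.2, 0.5)                   # about 6x
with_focus_stack = data_reduction(0.2, 0.5, focal_planes=5)  # about 31x

print(f"resolution only:       {resolution_only:.2f}x")
print(f"with 5 focal planes:   {with_focus_stack:.2f}x")
```

Under this assumption, any focus stack of two or more planes puts the saving above one order of magnitude.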
The task of this low resolution analysis is to find
areas that should be analysed in more detail. Trivially,
this means discarding completely empty areas
and areas where the cells are so densely packed that
they cannot be resolved. We will be looking for areas
with a suitable density of cells of potential diagnostic
interest. This could be refined to look only for cells
that are larger than normal, since malignant cells are
usually larger than normal ones. But stretching this
criterion too far risks missing specimens
where the malignant cells are of normal size
(such malignancies exist). We will therefore count
cells that are of a relevant size for further analysis,
not only cells significantly bigger than normal.
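The triage rule described above can be sketched on a binary object mask of one field. This is an illustration only, not the authors' implementation: the mask is assumed to come from some earlier thresholding step, and `min_area` and `max_coverage` are hypothetical thresholds.

```python
from collections import deque


def component_areas(mask):
    """Return the pixel area of each 4-connected foreground component."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def triage_field(mask, min_area=4, max_coverage=0.6):
    """Classify a field and count objects of relevant size.

    min_area and max_coverage are hypothetical thresholds.
    """
    areas = component_areas(mask)
    coverage = sum(areas) / (len(mask) * len(mask[0]))
    if coverage > max_coverage:
        return ("too_dense", 0)
    relevant = sum(1 for a in areas if a >= min_area)
    if relevant == 0:
        return ("empty", 0)
    return ("candidate", relevant)


demo = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
]
print(triage_field(demo))  # two objects reach min_area; the single pixel is ignored
```

Note that the size criterion is a lower bound only, in line with the text: objects of normal size are counted, not just unusually large ones.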
Another important task is to look for clusters. It is
known that malignant cells tend to cluster more than
normal ones, so when a human screener sees a cluster
of cells they take an extra look. We should thus
note and flag the appearance of clusters in the
analysed fields.
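One simple way to flag clusters, given cell centroids in a field, is to link cells whose centroids lie within a fixed distance and report groups above a minimum size. The sketch below uses single-linkage grouping with union-find; it is a stand-in for illustration, not the method of this paper or of Chandran et al., and `max_gap_um` and `min_size` are hypothetical parameters.

```python
import math


def find_clusters(centroids, max_gap_um=15.0, min_size=3):
    """Group cells whose centroids chain together within max_gap_um.

    Returns a sorted list of clusters, each a list of centroid indices.
    Both thresholds are hypothetical.
    """
    parent = list(range(len(centroids)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if math.dist(centroids[i], centroids[j]) <= max_gap_um:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(centroids)):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) >= min_size)


# Centroids in microns: three cells chained together, two isolated cells.
cells = [(10, 10), (18, 12), (25, 15), (100, 100), (200, 40)]
print(find_clusters(cells))
```

Cells 0 and 2 are farther apart than the gap but are joined through cell 1, which is the chaining behaviour that single linkage gives. The pairwise loop is quadratic in the number of cells, which is acceptable for the few hundred objects in one field.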
To summarize, in this paper we present
a study of image fields of moderate resolution from
standard PAP smears and LBP specimens, generating
data that can be used to prioritize which areas should
be used for the subsequent, more expensive
high-resolution analysis, thus optimizing the overall
throughput of a system without sacrificing detection
quality. The methods described in this paper can also
work towards the overall classification task by locat-
ing diagnostically important structures that are often
overlooked in conventional cell-by-cell classification
schemes. We have not found any studies in the recent
literature with this goal; most papers on PAP-smear
analysis deal with segmentation or classification
problems of images at a single resolution level.
However, Raymond et al. (Raymond et al., 1993)
made use of graphs and mathematical morphology
to analyse neighbourhood relationships between cells
in the study of germinal centers. Also, in a recent
publication, Chandran et al. (Chandran et al., 2012)
presented a method for detecting clusters in cervical
smears that is of interest and that is used for
comparison in this paper.
2 MATERIALS AND METHODS
2.1 Microscope Setup
The images were acquired using an Olympus BX51
optical microscope with a 20X, 0.75 NA objective and
a Hamamatsu ORCA 05G monochrome digital camera
providing images of 1344 x 1024 pixels with an effective
square pixel size of 0.5 microns. The illumination was
filtered through a narrow green filter centered at 570
nm in order to optimize nuclear contrast.
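From the image size and pixel size given above, the physical field of view and the number of fields needed to tile a slide can be estimated. The sketch below assumes non-overlapping fields and that the long slide axis is aligned with the wide image axis; both are assumptions, not statements from the paper.

```python
import math

IMAGE_SIZE_PX = (1344, 1024)  # image width and height from the text
PIXEL_SIZE_UM = 0.5           # effective pixel size from the text


def field_of_view_um(size_px=IMAGE_SIZE_PX, pixel_um=PIXEL_SIZE_UM):
    """Physical width and height of one image, in microns."""
    return (size_px[0] * pixel_um, size_px[1] * pixel_um)


def fields_to_cover(width_mm, height_mm, fov_um):
    """Fields needed to tile a rectangle, assuming no overlap between fields."""
    cols = math.ceil(width_mm * 1000.0 / fov_um[0])
    rows = math.ceil(height_mm * 1000.0 / fov_um[1])
    return cols * rows


fov = field_of_view_um()
n_fields = fields_to_cover(50, 25, fov)  # 25 x 50 mm smear area

print(f"field of view: {fov[0]:.0f} x {fov[1]:.0f} microns")
print(f"fields for a 25 x 50 mm area: {n_fields}")
```

Each field covers 672 x 512 microns, so a full conventional smear requires several thousand fields at this resolution.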
ICPRAM 2013 - International Conference on Pattern Recognition Applications and Methods