variability, both inter- and intra-observer (A. Young and Kerr, 2011). Hence, there are growing efforts
towards the development of computer-aided diagnos-
tic techniques, with two major directions: (i) automated segmentation, aimed at partitioning heterogeneous colorectal samples into homogeneous (i.e. containing only one type of tissue) regions of interest; (ii) automated classification, aimed at categorising the homogeneous tissue regions into a number of classes, either normal or malignant, based upon quantitative features extracted from the image. In both tasks, the main challenge is the extreme intra-class and inter-dataset variability that is an inherent characteristic of histological imaging. In this work, we focus on the automated classification task, and specifically on the three histological categories that are most relevant for CRC diagnosis: (i) healthy tissue, (ii) adenocarcinoma, (iii) tubulovillous adenoma.
In the last few years, the literature on automated
classification of histological images has been exten-
sive, with applications covering different anatomical
parts other than the colon, such as the brain, breast, prostate and lungs. Most of the proposed approaches rely on
automated texture analysis, where a limited set of
local descriptors are computed from patches of the
original input images and then fed into a classifier.
Among the most frequently used are statistical features based on the grey level co-occurrence matrix (GLCM), local binary patterns (LBP), and Gabor and wavelet transforms. The texture descriptors, possibly encoded into a compact dictionary of visual words, are used as input to machine learning techniques such as Support Vector Machines (SVM), Random Forests or Logistic Regression classifiers (Di Cataldo and Ficarra, 2017). In spite of the good level of accuracy
obtained by some of these works, the dependence on
a fixed set of handcrafted features is a major limita-
tion to the robustness of the classical texture analysis
approaches. First, because it requires deep knowledge of which image characteristics are best suited for classification, which is far from obvious. Second, because it puts severe constraints on the generalisation and transfer capabilities of the proposed classifiers, especially in the presence of inter-dataset variability.
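As a purely illustrative sketch of such a classical pipeline (not the method of any specific work cited above), the snippet below computes GLCM statistics and an LBP histogram with scikit-image and feeds them to an SVM from scikit-learn; the descriptor parameters, SVM settings and the assumption that patches arrive as grey-level uint8 arrays with labels are our own choices for brevity.

import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def texture_features(patch):
    # GLCM statistics at a single offset (distance 1, angle 0), kept small for brevity.
    glcm = graycomatrix(patch, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    stats = [graycoprops(glcm, prop)[0, 0]
             for prop in ("contrast", "homogeneity", "energy", "correlation")]
    # Uniform LBP histogram (P=8 neighbours gives 10 possible codes).
    lbp = local_binary_pattern(patch, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([stats, hist])

def train_texture_svm(patches, labels):
    # patches: iterable of 2-D uint8 arrays; labels: tissue classes (H/TV/AC).
    X = np.stack([texture_features(p) for p in patches])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    return clf.fit(X, labels)

Every design decision in this sketch (which GLCM statistics, which LBP configuration, which classifier) must be fixed by hand, which is precisely the limitation discussed above.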
As an answer to such limitations, in recent years the use of deep learning (DL) architectures,
and more specifically Convolutional Neural Networks
(CNNs), has become a major trend (Janowczyk and
Madabhushi, 2016; Korbar et al., 2017). In CNNs, a cascade of convolutional and pooling layers learns by backpropagation the set of features that is best suited for classification, thus avoiding the extraction of handcrafted texture descriptors. Nonetheless, the necessity
of training the networks with a huge number of in-
dependent histological samples is still an open issue,
which limits the usability of the approach in the ev-
eryday clinical setting. Transfer learning (i.e. applying CNNs pre-trained on a different type of images, for which large datasets are available) seems a promising solution to this problem (Weiss et al., 2016), but it has not been fully investigated for CRC classification.
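For illustration only, a toy PyTorch network for the three tissue classes is sketched below; it shows the convolution/pooling feature hierarchy discussed above and is not the architecture evaluated in this work.

import torch
import torch.nn as nn

class TinyTissueCNN(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        # Stacked convolution + pooling stages learn the feature hierarchy.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Both the filters and the classifier weights are learnt end-to-end
# by backpropagation of a standard cross-entropy loss.
model = TinyTissueCNN()
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)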
In this work, we evaluate a CNN-based ap-
proach to automatically differentiate healthy tissues
and tubulovillous adenomas from cancerous samples,
which is a challenging task in histological image anal-
ysis. For this purpose, we fully train a CNN on a large
set of colorectal samples, and assess its accuracy on
an independent test set. This technique is experimen-
tally compared with two different transfer learning
approaches, both leveraging a CNN pre-trained on a completely different image dataset. The first approach uses the pre-trained CNN to extract a set of discriminative features that are then fed into a separate Support Vector Machine (SVM) classifier. The second approach fine-tunes only the last stages of the pre-trained CNN on CRC histological images. By doing so,
we investigate and discuss the transfer learning capa-
bilities of CNNs in the domain of colorectal tissue classification.
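The two transfer-learning strategies can be sketched as follows; an ImageNet pre-trained ResNet-18 from torchvision is used here as a placeholder backbone, and the variable names train_batch and train_labels are hypothetical, since the actual pre-trained network and training details of our experiments are not those shown in this sketch.

import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import SVC

# (i) Feature extraction: freeze the pre-trained backbone, drop its last
# fully connected layer and use the pooled activations as descriptors
# for a separate SVM classifier.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
feature_extractor = nn.Sequential(*list(backbone.children())[:-1]).eval()
for p in feature_extractor.parameters():
    p.requires_grad = False

def extract_features(batch):  # batch: (N, 3, H, W) float tensor
    with torch.no_grad():
        return feature_extractor(batch).flatten(1).numpy()

# svm = SVC(kernel="rbf").fit(extract_features(train_batch), train_labels)
# (train_batch and train_labels are placeholders for the CRC patches.)

# (ii) Fine-tuning: keep the pre-trained weights, replace the final layer
# with a three-class head and train only that last stage on CRC patches.
finetuned = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in finetuned.parameters():
    p.requires_grad = False
finetuned.fc = nn.Linear(finetuned.fc.in_features, 3)  # new trainable head
optimizer = torch.optim.SGD(finetuned.fc.parameters(), lr=1e-3, momentum=0.9)

In both strategies the pre-trained convolutional filters are reused, so only a small number of parameters (the SVM, or the final CNN stage) must be learnt from the histological data.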
2 MATERIALS AND METHODS
2.1 Colorectal Cancer Image Dataset
The dataset used in this study was extracted from
a public repository of H&E stained whole-slide im-
ages (WSIs) of colorectal tissues, available online at http://www.virtualpathology.leeds.ac.uk/. All the
slides are freely available for research purposes, to-
gether with their anonymised clinical information.
In order to obtain a statistically significant dataset
in terms of inter-subject and inter-class variability, 27 WSIs were selected from distinct subjects (i.e.
one WSI per patient). Note that different types of tis-
sues (e.g. healthy and cancerous portions) coexist in a
single WSI. With the supervision of a skilled pathol-
ogist, we identified large regions of interest (ROIs)
on the WSIs as in the example of Figure 3, so that
each ROI is unambiguously associated with one of the three tissue subtypes: (i) adenocarcinoma (AC); (ii) tubulovillous adenoma (TV); and (iii) healthy tissue (H).
Then, the ROIs were cropped into a total of 13,500 patches of 1089x1089 pixels (500 per patient), at a magnification level of 40x.
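A minimal sketch of this cropping step is given below, assuming each ROI is already available as a NumPy array at the desired magnification; whether patches are obtained by regular non-overlapping tiling, as shown here, or by another sampling scheme is an assumption of the sketch.

import numpy as np

PATCH_SIZE = 1089  # patch side in pixels, as used in this work

def tile_roi(roi, patch_size=PATCH_SIZE):
    # Yield non-overlapping patch_size x patch_size crops from a ROI array.
    h, w = roi.shape[:2]
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            yield roi[y:y + patch_size, x:x + patch_size]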
For training and testing purposes, the original
image cohort was randomly split into two disjoint
subsets, comprising 18 subjects for training (9000 patches) and the remaining 9 subjects for testing (4500 patches).