haviours, please consider data presented in Figures 1
and 2, where some images, from the MICCAI2016
dataset, collected by different scanners are reported
for all the imaging modalities, both before (Figure
1) and after preprocessing (Figure 2). For the same
images, a horizontal line of data (red line) is also
plotted below (Figure 1b and Figure 2b). As can
be noted, unpreprocessed data show relevant differences
between scanners: though the data belong to different
patients, it is clearly visible that the ratio between
the amplitudes of different tissues in the same image
differs between the two scanners, as is also confirmed
by comparing the images corresponding to the same
imaging sequences. These differences, which
distinguish MRI from CT (where images from different
scanners are scalable in amplitude and easily
compared), are due to the different optimization of
imaging parameters by different manufacturers, even
when the same imaging sequences are used.
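The two quantities discussed above, the intensity profile along a horizontal line and the amplitude ratio between tissues, can be illustrated with a minimal sketch. It is not the procedure used to produce the figures: the toy image, the pixel lists `tissue_a` and `tissue_b`, and the function names are all hypothetical. It only shows that a pure amplitude rescaling (the CT-like case) leaves the tissue ratio unchanged, which is why a ratio that differs between scanners cannot be removed by rescaling alone.

```python
from statistics import mean


def row_profile(image, row):
    """Return the intensities along one horizontal line of a 2D image."""
    return list(image[row])


def tissue_ratio(image, pixels_a, pixels_b):
    """Ratio between the mean intensities of two tissue regions.

    pixels_a and pixels_b are hypothetical lists of (row, col) coordinates.
    """
    mean_a = mean(image[r][c] for r, c in pixels_a)
    mean_b = mean(image[r][c] for r, c in pixels_b)
    return mean_a / mean_b


# Toy 3x4 image: zero background and two tissues of different brightness.
image = [
    [0, 0, 0, 0],
    [0, 40, 80, 0],
    [0, 40, 80, 0],
]
tissue_a = [(1, 1), (2, 1)]
tissue_b = [(1, 2), (2, 2)]

# A pure amplitude rescaling multiplies every pixel by the same factor.
rescaled = [[3 * value for value in row] for row in image]

print(row_profile(image, 1))
print(tissue_ratio(image, tissue_a, tissue_b))
print(tissue_ratio(rescaled, tissue_a, tissue_b))
```

In this toy case both ratios are equal, so the two images would be directly comparable after rescaling; the MRI data described above do not have this property.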
Figure 2 shows the situation after preprocessing, in
which an amplitude normalization between different
images has been applied. In fact, the images from
different scanners are more similar than they were before
preprocessing. However, from Figure 2b it can be observed
that the preprocessing step shifted the baseline of
some of the images (the signal outside the brain,
which should be zero, has a level well above zero).
Moreover, each image was normalized independently
of the others: the modification therefore differed
from one image to the next, introducing substantial
differences even among data from the same scanner.
Furthermore, the amplitude ratio between different
tissues in the same image has not been properly
corrected and, in some cases, the differences between
data coming from different scanners were increased.
This is probably the reason why
some automatic strategies, though using preprocessed
data, performed worse than those using original,
unpreprocessed data. Finally, from both Figure 1 and
Figure 2, it can be noted that the information carried by
different imaging modalities regarding MS lesions is
completely different: hyperintense regions on FLAIR
images which are also hyperintense on the corresponding
T2-w images surely indicate MS lesions (Filippi
et al., 2019). The other imaging modalities (T1-w and
PD) do not add anything more and, often, their content
is confusing and not clearly interpretable (as in
the MS lesions indicated by the green arrows, both in
Figure 1 and Figure 2).
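The baseline shift noted above (signal outside the brain that should be zero but is not) can be quantified with a simple check. The following is a minimal sketch under stated assumptions: `brain_mask` is a hypothetical boolean mask that is True inside the brain, and the "shifted" image is a toy stand-in for a preprocessed image, not data from the MICCAI2016 dataset.

```python
def background_level(image, brain_mask):
    """Mean intensity of the pixels lying outside the brain mask.

    brain_mask is a hypothetical 2D list of booleans (True inside the brain).
    """
    outside = [
        value
        for image_row, mask_row in zip(image, brain_mask)
        for value, inside in zip(image_row, mask_row)
        if not inside
    ]
    if not outside:
        raise ValueError("the mask covers the whole image")
    return sum(outside) / len(outside)


brain_mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]
raw = [
    [0, 0, 0, 0],
    [0, 90, 120, 0],
    [0, 95, 110, 0],
    [0, 0, 0, 0],
]
# Toy stand-in for a preprocessed image whose baseline has been lifted.
shifted = [[value + 20 for value in row] for row in raw]

print(background_level(raw, brain_mask))
print(background_level(shifted, brain_mask))
```

A background level clearly above zero, as in the second call, is the kind of baseline variation visible in Figure 2b.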
From the above considerations, the following
guidelines can be derived:
1. The training of the method should be done on data
from a single scanner (humans also adapt to the
scanner they normally use): when data from different
scanners need to be interpreted and, possibly,
compared, the system has to be trained separately
for each scanner (in this way, the training set
can be reduced, the procedure shortened and the
performance increased);
2. A preprocessing strategy, consisting of the rigid
registration of each modality to the FLAIR image,
is necessary to obtain images of different
modalities which are spatially aligned.
Other forms of preprocessing, especially those
consisting of amplitude corrections, have to be
performed on the whole volume and not differently
on each single slice. Moreover, preprocessing
has to become part of the automatic segmentation
method;
3. The image modalities to be used in the
identification/segmentation process have to be chosen in
advance to avoid useless or confusing information,
an unjustified increase of the training dataset,
slower convergence and reduced performance
(FLAIR and T2-w images are sufficient).
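The second guideline, amplitude correction on the whole volume rather than slice by slice, can be sketched as follows. This is an illustrative example only: min-max rescaling is an assumption chosen for simplicity (the text does not prescribe a specific amplitude correction), and the volume is a toy nested list of slices.

```python
def _min_max(values):
    """Return (minimum, maximum) of an iterable of numbers."""
    values = list(values)
    return min(values), max(values)


def _rescale(slice_2d, low, high):
    """Map the intensities of one slice linearly from [low, high] to [0, 1]."""
    span = high - low
    if span == 0:
        return [[0.0 for _ in row] for row in slice_2d]
    return [[(value - low) / span for value in row] for row in slice_2d]


def normalize_volume(volume):
    """Rescale all slices with a single range computed on the whole volume."""
    low, high = _min_max(v for s in volume for row in s for v in row)
    return [_rescale(s, low, high) for s in volume]


def normalize_per_slice(volume):
    """Rescale each slice with its own range (the approach advised against)."""
    result = []
    for s in volume:
        low, high = _min_max(v for row in s for v in row)
        result.append(_rescale(s, low, high))
    return result


# Toy volume with two 2x2 slices; the second slice is twice as bright.
volume = [
    [[0, 50], [50, 100]],
    [[0, 100], [100, 200]],
]
whole = normalize_volume(volume)
per_slice = normalize_per_slice(volume)

# Brightest pixel of each slice after the two normalizations.
print(whole[0][1][1], whole[1][1][1])
print(per_slice[0][1][1], per_slice[1][1][1])
```

With the whole-volume range the brighter slice remains twice as bright as the other, whereas per-slice rescaling maps both maxima to the same value and so alters the amplitude relation between slices of the same volume.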
In what follows, we show how, by applying
the previously defined guidelines, it is possible to
improve the performance of a lesion segmentation
method.
4 MS LESION IDENTIFICATION/
SEGMENTATION
Since it is a benchmark method, we used the supervised
CNN-based paradigm presented in (Valverde
et al., 2017), which has also been used, in a modified
version, in (Placidi et al., 2019). In particular,
following the previously defined guidelines, we made
the following choices:
1. the dataset used for training, validation and test
was the MICCAI2016 dataset, restricted to data
from a single 3T scanner (Philips manufacturer);
2. raw, unpreprocessed data were preprocessed by
performing rigid registration of each modality to
the FLAIR image, followed by brain extraction
(skull stripping) computed on the T1-w image and
applied to the other modalities;
3. only FLAIR and T2-w imaging modalities were
used for identification/segmentation. In this way,
we provided a simpler task to the system, thus
reducing the size of the labelled training
dataset. The images selected from the dataset
were distributed in three subsets: 800 for training,
200 for validation and 100 for test. A scheme