The weight of the most similar patch is one, and the weights gradually decrease with distance; the weight of the K-th patch is zero.
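The exact weighting function is not restated here, but a minimal sketch in Python, assuming a simple linear decrease between the distance of the most similar patch and that of the K-th patch (the function and variable names are our own, for illustration):

```python
import numpy as np

def knn_weights(distances):
    """Linearly decreasing weights for the K most similar patches.

    The most similar patch gets weight 1, weights decrease with distance,
    and the K-th (least similar retained) patch gets weight 0.
    `distances` is assumed to be sorted in ascending order.
    """
    d = np.asarray(distances, dtype=float)
    d_min, d_max = d[0], d[-1]
    if d_max == d_min:                    # degenerate case: all patches equally distant
        return np.ones_like(d)
    return (d_max - d) / (d_max - d_min)  # 1 for the nearest patch, 0 for the K-th

# Example with K = 5 retrieved patches:
print(knn_weights([0.2, 0.5, 0.9, 1.4, 2.0]))  # [1.0, 0.833, 0.611, 0.333, 0.0]
```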
2.3 Information Integration
In this section, we explain how local and global information are integrated. The probabilities from the local and the global information are calculated at every pixel. Since we assume that the local and global information are independent, we integrate the two probabilities by taking their product. The integrated probability is therefore defined as
$$P_i(c) = P_i^{\mathrm{local}}(c) \cdot P_i^{\mathrm{global}}(c) \tag{5}$$

where $P_i(c)$, $P_i^{\mathrm{local}}(c)$ and $P_i^{\mathrm{global}}(c)$ express the probability of the $c$-th class for the $i$-th pixel in an image. After integration, the class label $L_i$ of the $i$-th pixel is defined as

$$L_i = \operatorname*{arg\,max}_{c} P_i(c). \tag{6}$$
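A minimal sketch of Eqs. (5) and (6) in Python, assuming the per-pixel class probabilities are stored as H × W × C arrays (the array layout and the names below are assumptions for illustration, not part of the paper):

```python
import numpy as np

def integrate_and_label(p_local, p_global):
    """Integrate local and global class probabilities (Eq. 5) and pick
    the most probable class per pixel (Eq. 6).

    p_local, p_global: arrays of shape (H, W, C) holding P_i^local(c)
    and P_i^global(c) for every pixel i and class c.
    Returns an (H, W) map of class labels L_i.
    """
    p = p_local * p_global            # element-wise product (independence assumption)
    return np.argmax(p, axis=-1)      # L_i = argmax_c P_i(c)
```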
3 EXPERIMENTS
We show the experimental results of our method. In section 3.1, we describe the MSRC21 dataset (Shotton et al., 2006) and how we evaluate accuracy. In section 3.2, we show a preliminary experiment. In section 3.3, we show the accuracy of our method. In section 3.4, we compare our method with related work.
3.1 How to Evaluate Accuracy
We use the MSRC21 dataset (Shotton et al., 2006) in
the following experiments. The dataset has 591
images and contains 21 classes (building, grass, tree,
cow, sheep, sky, aeroplane, water, face, car, bicycle,
flower, sign, bird, book, chair, road, cat, dog, body,
boat). In this paper, we use 276 images for training,
59 images for validation, and 256 images for testing.
Since seven patches are extracted from each image, the number of training patches is 1932 (= 276 × 7), the number of validation patches is 413 (= 59 × 7), and the number of test patches is 1792 (= 256 × 7).
We use pixel-wise accuracy and class average accuracy for evaluation. Class average accuracy is the average, over the 21 classes, of the percentage of correctly labeled pixels in each class. Pixel-wise accuracy is the percentage of correctly labeled pixels over all pixels. Since the number of pixels differs between classes, the two measures take different values.
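A short sketch of how the two measures can be computed from flattened ground-truth and predicted label maps; treating label 255 as "void" and excluding such pixels is our assumption (common in MSRC-style evaluation), not something stated in the paper:

```python
import numpy as np

def evaluate(gt, pred, num_classes=21, void_label=255):
    """Return (pixel-wise accuracy, class average accuracy).

    gt, pred: 1-D integer arrays of ground-truth / predicted labels.
    Pixels labeled `void_label` in the ground truth are excluded
    (an assumption, see the lead-in above).
    """
    valid = gt != void_label
    gt, pred = gt[valid], pred[valid]

    pixel_wise = np.mean(gt == pred)                  # over all labeled pixels

    per_class = []
    for c in range(num_classes):
        mask = gt == c
        if mask.any():                                # skip classes absent from gt
            per_class.append(np.mean(pred[mask] == c))
    class_average = np.mean(per_class)                # average of per-class recalls

    return pixel_wise, class_average
```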
3.2 Preliminary Experiment for Global
Information
We show class average and pixel-wise accuracy after integration in Figure 4. These accuracies were obtained on the validation images. Beyond K = 40, both class average and pixel-wise accuracy are almost unchanged, while a larger K increases the computational cost. Therefore, we set K = 40 in the following experiments.
Figure 4: Class average and pixel-wise accuracy (%) after integration for different values of K in the K-NN search.
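The choice of K can be reproduced with a simple sweep over the validation images; the sketch below is only illustrative and assumes a hypothetical `predict_labels(image, K)` callable that runs the whole labeling pipeline for a given K, together with the `evaluate` sketch from section 3.1:

```python
import numpy as np

def sweep_K(validation_images, validation_gt, predict_labels,
            candidate_K=(10, 20, 30, 40, 50, 60)):
    """Compute both accuracies on the validation set for several K.

    predict_labels(image, K) -> (H, W) label map  (hypothetical helper)
    Returns {K: (pixel-wise accuracy, class average accuracy)}.
    """
    gt = np.concatenate([g.ravel() for g in validation_gt])
    results = {}
    for K in candidate_K:
        pred = np.concatenate([predict_labels(img, K).ravel()
                               for img in validation_images])
        results[K] = evaluate(gt, pred)   # see the evaluate() sketch above
    return results
```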
3.3 Results on the MSRC21 Dataset
Table 1 shows the accuracies of our method. We see that integrating local and global information is effective for image labeling. The accuracy of the cow class is 48.6% with only local information, while it is 57.9% with only global information. For the building class, the accuracy is 41.1% with only local information and 33.4% with only global information. Thus, local and global information complement each other, and the accuracy of our method is improved by using both.
We show examples of labeling results in Figure 5. As shown in Figure 5, chair, boat and bird are not labeled well. These classes have two points in common: their within-class variance is large and the number of training samples is small. The boat class in the MSRC21 dataset includes various vessels, e.g. small craft and large passenger ships. The chair class includes various kinds of chairs, e.g. plastic and wooden chairs. We consider that classes with large within-class variance cannot be characterized well by local color and texture features. The label from global information is estimated by voting with the ground-truth labels attached to the similar patches. Hence, a class with a small number of training samples receives few votes and is not easily classified. For example, the grass class, which is classified with high accuracy, has 2,574,052 pixels in
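A rough, simplified sketch of the weighted voting behind the global cue (a single vote per retrieved patch is shown here for brevity; in the paper the votes come from the ground-truth labels attached to the similar patches, and the weights follow the distance-based scheme of section 2.2). It also illustrates why classes with few training samples collect few votes:

```python
import numpy as np

def global_class_probabilities(neighbor_labels, neighbor_weights, num_classes=21):
    """Per-class probability from weighted voting over the K most similar
    training patches (simplified: one vote per retrieved patch).

    neighbor_labels:  (K,) ground-truth class labels of the retrieved patches
    neighbor_weights: (K,) distance-based weights (nearest -> 1, K-th -> 0),
                      e.g. produced by the knn_weights() sketch in section 2.2
    """
    votes = np.zeros(num_classes, dtype=float)
    for label, weight in zip(neighbor_labels, neighbor_weights):
        votes[int(label)] += weight
    total = votes.sum()
    # Classes with few training samples rarely appear among the retrieved
    # patches, so they collect few votes (e.g. chair, boat, bird).
    return votes / total if total > 0 else np.full(num_classes, 1.0 / num_classes)
```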