some space can be done by transforming the prob-
lem into a new space in such a way that the Euclidean
distance in the new space is equal to the distance mea-
sured on the data manifold in the original space.
Diffusion clustering is based on the idea of using the
coordinates introduced in Eq. (6) for clustering the
graph nodes, i.e. the image pixels (Nadler et al., 2005;
Lafon and Lee, 2006; Huang et al., 2011). The time
t sets the desired level of detail.
The k-means clustering method is often mentioned
in this context (Lafon and Lee, 2006; Huang et al.,
2011). It is believed that far fewer than n coordinates
are needed in practice.
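The clustering idea can be illustrated with a minimal sketch. Eq. (6) is not reproduced in this section, so the sketch assumes that the diffusion coordinates of node i are exp(-λ_k t) u_k(i), where (λ_k, u_k) are the eigenpairs of the graph Laplacian L = D - W. The function names `diffusion_coordinates` and `kmeans`, the toy graph, and the deterministic farthest-first initialisation are illustrative choices, not taken from the cited works.

```python
import numpy as np

def diffusion_coordinates(W, t, m):
    """Return an (n, m) array whose columns are exp(-lambda_k * t) * u_k.

    Assumption: (lambda_k, u_k) are eigenpairs of L = D - W; the constant
    eigenvector (lambda = 0) is skipped because it cannot separate nodes.
    """
    L = np.diag(W.sum(axis=1)) - W
    lam, U = np.linalg.eigh(L)  # eigenvalues in ascending order
    return U[:, 1:m + 1] * np.exp(-lam[1:m + 1] * t)

def kmeans(X, k, iters=50):
    """Plain Lloyd iterations with a deterministic farthest-first start."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        # Assign every node to its nearest centre, then update the centres.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Toy graph (hypothetical): two triangles joined by one weak edge (2 -- 3).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.01

X = diffusion_coordinates(W, t=1.0, m=1)  # m = 1: far fewer than n coordinates
labels = kmeans(X, k=2)
print(labels)
```

Here a single coordinate (m = 1) is enough to separate the two triangles, which is in line with the remark that far fewer than n coordinates are typically needed.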
3 THE PROBLEMS OF DIFFUSION DISTANCE
In this section, we show that the diffusion distance
has properties that make it difficult to use for
image segmentation. We show that the value of the
diffusion distance between two image points does not
necessarily indicate whether they belong to the same
image segment. A similar criticism has already been
published for the commute-time distance: von Luxburg
et al. (2014) concluded that the commute-time distance
in a graph does not reflect the structure of the graph
correctly if the graph is large. We continue
in this direction and show some further problems that
are relevant for image segmentation. We also show
that the problems appear not only for the commute-time
(resistance) distance but also for the diffusion
distance, i.e. they cannot be avoided by a suitable
choice of time.
Consider two points, denoted by p and q, in an image.
We study two situations (Fig. 1): (i) Both points
are placed in an image containing one rectangular
area of constant brightness; the size of the image is
w × h pixels. (ii) The size of the image is again
w × h pixels, but the image area is now split by a vertical
line into two halves (areas); the brightness is constant
inside each area; the difference in brightness between
the areas is equal to 1; one point is placed
in each area. The Euclidean distance between p and q
measured in the xy plane is denoted by a (Fig. 1). We
call these situations "without edge"
and "with edge", respectively. Clearly, from the point
of view of image segmentation, these two situations
are substantially different. In the second case, we
expect two image segments and a large distance between
p and q. In the first case, only one image segment and
a small distance between p and q are expected.
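The two test situations can be written down concretely as follows. This is a small sketch under stated assumptions: the helper name `make_case` is hypothetical, and the placement of p and q (middle row, mirrored about the vertical split) is a guess that is merely consistent with a = 15 for w = 30; the exact placement is not specified in the text.

```python
import numpy as np

def make_case(w, h, with_edge):
    """Return (image, p, q) for the two situations; p and q are (row, col).

    Assumed layout: brightness 0 everywhere, or 0 in the left half and 1 in
    the right half; p and q sit in the middle row, mirrored about the split.
    """
    img = np.zeros((h, w))
    if with_edge:
        img[:, w // 2:] = 1.0  # brightness step of 1 along a vertical line
    row = h // 2
    p = (row, w // 4)
    q = (row, w - 1 - w // 4)
    return img, p, q

img_edge, p, q = make_case(30, 11, with_edge=True)
img_flat, _, _ = make_case(30, 11, with_edge=False)

# Euclidean distance between p and q in the xy plane (the quantity a).
a = float(np.hypot(q[0] - p[0], q[1] - p[1]))
print(a, img_edge[p], img_edge[q], img_flat[p], img_flat[q])
```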
Figure 1: Two points (p,q) placed into an image containing
a single area (left image) or two areas (right image).
[Figure 2 plots: diffusion distance (vertical axis) versus area size in px (30x11, 30x21, 30x31, 30x41, 30x51); two panels, t = 25 and t = 100; curves: with edge for σ = 0.4, 0.5, 0.6, 0.7, and without edge for σ = 0.5.]
Figure 2: The dependence of diffusion distance on the
length of the edge between the areas: The distance (vertical
axis) is computed for the problem from Fig. 1 with/without
the edge, for a = 15, and for various values of t, σ, and for
the increasing value of h (the length of the edge between the
areas); the width of the areas remains constant (the value of
w). It can be seen that for one value of t and σ, the value of
distance depends on h.
A simple theoretical consideration might be useful
for obtaining a first intuitive overview. We compute
the distance d_t(p, q) using the formula
from Eq. (8) for both cases. If we consider
all possible image sizes (from small to infinitely
large) and all possible values of time (0 ≤ t < ∞), we
can easily see that the distance varies between
0 and √2 in both cases. (We note that √2
is the distance between every two distinct points for
t = 0.) It follows that there is a risk that the
value of the diffusion distance alone will not reveal
whether it was obtained for case (i) or case (ii).
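The two limiting values can be checked directly if Eq. (8) has the usual spectral form. This is an assumption, since the equation is not reproduced in this section: λ_k and u_k denote the eigenvalues and orthonormal eigenvectors of the graph Laplacian, U is the matrix with columns u_k, e_p is the indicator vector of node p, and on a connected graph λ_1 = 0 with u_1 constant.

```latex
% Assumed spectral form of Eq. (8); the indexing k = 1..n is an assumption.
\begin{align*}
d_t^2(p,q) &= \sum_{k=1}^{n} e^{-2\lambda_k t}\,\bigl(u_k(p) - u_k(q)\bigr)^2, \\
d_0^2(p,q) &= \sum_{k=1}^{n} \bigl(u_k(p) - u_k(q)\bigr)^2
            = \lVert U^{\top}(e_p - e_q) \rVert^2
            = \lVert e_p - e_q \rVert^2 = 2 \quad (p \neq q), \\
\lim_{t\to\infty} d_t^2(p,q) &= \bigl(u_1(p) - u_1(q)\bigr)^2 = 0 .
\end{align*}
```

Under this assumption every term of the sum is non-increasing in t, so the distance stays in the interval from 0 to √2 regardless of whether the image contains an edge.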
For a more detailed insight, we present a computational
simulation of the problem (Fig. 1). Various
image sizes, values of time, and values
of σ (Eq. (3)) are considered. The results in
Figs. 2, 3, and 4 show that the diffusion distance
between p and q depends on the length of the edge
between the areas (Fig. 2), on the size of the areas (Fig.
3), and on the distance between the points in the xy plane (Fig.
4). Special attention should be paid to the fact that,
for some area sizes, the diffusion
distance between points lying in one area (case
(i)) may be greater than the distance between points
lying in two areas (case (ii)). In Fig. 2, for example, we can
see that for t = 100 and σ = 0.5, the distance for
(w = 30, h = 11) in case (i) is greater than the distance
for (w = 30, h = 31) in case (ii). As can be
seen, the problem grows with increasing
σ. We note that the value of σ must be large enough
with respect to the expected noise intensity.
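A simulation of this kind might be set up as sketched below. Several ingredients are assumptions, because Eqs. (3) and (8) are not reproduced in this section: the weights are taken as exp(-Δb² / (2σ²)) on a 4-neighbour pixel grid, the distance is computed from the spectral form Σ_k exp(-2λ_k t)(u_k(p) - u_k(q))², and p and q are placed in the middle row, mirrored about the split. The function names are hypothetical, and the numbers printed by this sketch need not coincide with those in Fig. 2.

```python
import numpy as np

def laplacian(img, sigma):
    """Graph Laplacian L = D - W of the 4-neighbour pixel grid.

    Assumed weight between neighbours: exp(-(brightness diff)^2 / (2 sigma^2)).
    """
    h, w = img.shape
    idx = np.arange(h * w).reshape(h, w)
    W = np.zeros((h * w, h * w))
    # First pair: horizontal neighbours; second pair: vertical neighbours.
    for a, b in ((np.s_[:, :-1], np.s_[:, 1:]), (np.s_[:-1, :], np.s_[1:, :])):
        wt = np.exp(-(img[a] - img[b]) ** 2 / (2.0 * sigma ** 2))
        W[idx[a].ravel(), idx[b].ravel()] = wt.ravel()
    W = W + W.T
    return np.diag(W.sum(axis=1)) - W

def diffusion_distance(L, p, q, t):
    """Assumed spectral form of the diffusion distance between nodes p and q."""
    lam, U = np.linalg.eigh(L)
    diff = U[p] - U[q]  # u_k(p) - u_k(q) for all k
    return float(np.sqrt(np.sum(np.exp(-2.0 * lam * t) * diff ** 2)))

def experiment(w, h, with_edge, sigma, t):
    """Distance between p and q for the 'with edge' / 'without edge' image."""
    img = np.zeros((h, w))
    if with_edge:
        img[:, w // 2:] = 1.0
    row = h // 2
    p = row * w + w // 4             # assumed position of p
    q = row * w + (w - 1 - w // 4)   # mirrored position of q (a = 15 for w = 30)
    return diffusion_distance(laplacian(img, sigma), p, q, t)

# Sweep the edge length h for fixed w, t and sigma, with and without the edge.
for h in (11, 21, 31):
    d_edge = experiment(30, h, True, sigma=0.5, t=100.0)
    d_flat = experiment(30, h, False, sigma=0.5, t=100.0)
    print(f"30x{h}: with edge {d_edge:.4f}, without edge {d_flat:.4f}")
```

Comparing the "without edge" value for a small h with the "with edge" value for a larger h is the comparison discussed above; whether the ordering reverses in this sketch depends on the assumed weight formula.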
In image segmentation, neighbouring segments
may be of different sizes, which has not been
Normalised Diffusion Cosine Similarity and Its Use for Image Segmentation