It was shown in (Coifman and Lafon, 2006) that any
positive semi-definite kernel may be used for the di-
mensionality reduction. Rigorous analysis of families
of kernels to facilitate the derivation of an optimal ker-
nel for a given set Γ is an open problem.
The parameter η(δ) determines the dimensionality
of the diffusion space. A rigorous method for
choosing η(δ) will facilitate an automatic embedding
of the data. Naturally, η(δ) is data driven (similarly
to ε), i.e., it depends on the set Γ at hand.
Finally, various applications of the diffusion bases
scheme are currently being investigated by the authors,
namely video segmentation and the construction of
ensembles of classifiers.
The choice of ε is critical for achieving optimal
performance of the DM and DB algorithms, since it
defines the size of the local neighborhood of each point.
On the one hand, a large ε produces a coarse analysis
of the data, as the neighborhood of each point will
contain a large number of points. In this case, the
similarity weight will be close to one for most pairs
of points. On the other hand, a small ε might pro-
duce neighborhoods that contain only one point. In
this case, the similarity will be zero for most pairs of
points. Clearly, an adequate choice of ε lies between
these two extreme cases and should be derived from
the data.
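The two extreme cases above can be illustrated numerically. The following is a minimal sketch, assuming the Gaussian weight has the form w(x_i, x_j) = exp(−‖x_i − x_j‖² / ε); the exact normalization of the exponent, the random test points and the helper names `gaussian_weights` and `mean_offdiag` are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def gaussian_weights(points, eps):
    """Pairwise weights w_ij = exp(-||x_i - x_j||^2 / eps) (assumed kernel form)."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = math.exp(-sq_dist / eps)
    return W


def mean_offdiag(W):
    """Average similarity over all pairs of distinct points."""
    n = len(W)
    total = sum(W[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


# Hypothetical data set: 50 random points in the unit square.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(50)]

# Very small, intermediate and very large scale parameters.
for eps in (1e-6, 1e-2, 1e2):
    print(eps, mean_offdiag(gaussian_weights(points, eps)))
```

For the very large ε the average similarity is close to one, and for the very small ε it is close to zero, which matches the two degenerate regimes described above; informative neighborhoods only appear for intermediate values.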
In the following, we derive the range from which ε
should be chosen when a Gaussian weight function is
used and when the dataset Γ approximately lies near a
low-dimensional manifold M. We denote by d the intrinsic
dimension of M. Let L = I − P = I − D^{-1}W be the
normalized graph Laplacian (Chung, 1997) where P
was defined in Eq. (4) and I is the identity matrix.
The matrices L and P share the same eigenvectors.
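The shared eigenvectors follow directly from L = I − P: if Pv = λv, then Lv = (1 − λ)v. The sketch below checks this on a two-point toy example. It assumes that P is the row-normalized weight matrix D^{-1}W with D the diagonal matrix of row sums, and that the Gaussian weight is exp(−‖x_i − x_j‖² / ε) with ε = 1; the helper names are hypothetical.

```python
import math


def row_normalize(W):
    """Return P = D^{-1} W, where D holds the row sums of W on its diagonal."""
    return [[w / sum(row) for w in row] for row in W]


def laplacian(P):
    """Return L = I - P."""
    n = len(P)
    return [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)]
            for i in range(n)]


def matvec(A, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Two points at squared distance 1, Gaussian weight with eps = 1 (assumed form).
w = math.exp(-1.0)
W = [[1.0, w], [w, 1.0]]
P = row_normalize(W)
L = laplacian(P)

v = [1.0, -1.0]               # eigenvector of P in this symmetric 2x2 case
lam = matvec(P, v)[0] / v[0]  # corresponding eigenvalue of P
Lv = matvec(L, v)             # should equal (1 - lam) * v

print(lam, Lv)
```

The same vector v is an eigenvector of both matrices; only the eigenvalue changes, from λ for P to 1 − λ for L.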
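Furthermore, Singer (2006) proved that if the points
in Γ are independently uniformly distributed over M
then with high probability

\[
\frac{1}{\varepsilon} \sum_{j=1}^{N} L_{ij} f(x_j) = \frac{1}{2} \Delta_M f(x_i) + O\!\left( \frac{1}{N^{1/2} \varepsilon^{1/2 + d/4}},\ \varepsilon \right)
\]

where f : M → R is a smooth function and Δ_M is the
continuous Laplace-Beltrami operator of the manifold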
Diffusion Bases Dimensionality Reduction