nant women, more easily compared to conventional
allopathic techniques. Pulse based diagnosis also has
the advantage of being inexpensive, non-invasive, and
painless. The practitioners ‘feel’ for a certain pattern
in the pulse which forms the basis of their diagno-
sis. This technique requires a high degree of exper-
tise. The paucity of expert Ayurvedic practitioners
has limited the more widespread use and popularity
of the Ayurvedic technique.
The goal of this work is to develop a computer
aided diagnosis (CAD) system for the automated, re-
producible analysis of Ayurvedic pulse signals. To
the best of our knowledge, this work represents the
first attempt at CAD of Ayurvedic pulse signals.
Since the Ayurvedic pulse signal is a time series,
pattern recognition methods that have been previously
applied to analysis of other time series data (ECG,
EMG) might seem appropriate for CAD of PBD.
Pattern recognition in electrocardiogram (ECG) has
been applied to QRS/PVC recognition and classifica-
tion, the recognition of ischemic beats and episodes,
and the detection of atrial fibrillation using nonlin-
ear transformations and neural networks (Maglav-
eras et al., 1998). In (Maglaveras et al., 1998), a
model-based approach for classification of ECG stud-
ies based on previously defined signatures of normal
and diseased ECG signals was employed. Given that
this is a first CAD attempt at Ayurvedic PBD, quan-
titative signatures for normal and diseased patterns
have yet to be studied and modelled. We consequently
explore a domain independent scheme for classifica-
tion of the pulse data via dimensionality reduction.
Dimensionality reduction (DR) is a transformation
of the original high-dimensional feature space
to a space of eigenvectors which are capable of de-
scribing the data in far fewer dimensions. DR also
permits the visualization of individual data classes
and identification of possible subclasses within the
high dimensional data. The most popular method for
DR is Principal Component Analysis (PCA) which
attempts to find orthogonal eigenvectors accounting
for the greatest amount of variability in the data.
PCA assumes that the data is linear and the embed-
ded eigenvectors represent low-dimensional projec-
tions of linear relationships between data points in
high-dimensional space. However, our previous re-
search has strongly suggested that biomedical data
is highly nonlinear in nature (Lee et al., 2008) and
that nonlinear DR schemes such as Isometric Mapping
(Isomap) (Tenenbaum et al., 2000) and Locally Linear
Embedding (LLE) (Roweis and Saul, 2000) are
more appropriate for projection and subsequent clas-
sification of high-dimensional data including protein,
gene-expression, and spectroscopic data. LLE and
Isomap assume that the high dimensional data lie on a
high-order curve that is highly nonlinear and hence
object distances measured on this nonlinear manifold
should be geodesic as opposed to Euclidean. Nonlin-
ear methods attempt to map data along this nonlinear
manifold by assuming only neighboring points (deter-
mined via geodesic proximity) to be similar enough to
be mapped linearly with minimal error. The nonlinear
manifold can then be reconstructed based on these lo-
cally linear assumptions.
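The difference between geodesic and Euclidean distance on a nonlinear manifold can be illustrated with a minimal sketch. It is not the implementation used in this work: the half-circle toy data, the function name geodesic_distances, and the choice of two neighbors are all assumptions made for illustration. Geodesic distance is approximated by shortest paths over a nearest-neighbor graph, similar in spirit to the approximation used by Isomap.

```python
import numpy as np

def geodesic_distances(points, n_neighbors=2):
    """Approximate geodesic distances by shortest paths on a
    nearest-neighbor graph (illustrative helper, not from the paper)."""
    n = len(points)
    # Pairwise Euclidean distances between all points.
    diff = points[:, None, :] - points[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep only edges to each point's nearest neighbors; all other
    # pairs start at infinity (not directly connected).
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    for i in range(n):
        nearest = np.argsort(euclid[i])[1:n_neighbors + 1]
        graph[i, nearest] = euclid[i, nearest]
        graph[nearest, i] = euclid[i, nearest]
    # Floyd-Warshall: allow paths through each intermediate point k.
    for k in range(n):
        graph = np.minimum(graph, graph[:, [k]] + graph[[k], :])
    return euclid, graph

# Assumed toy data: 50 points on a half circle of radius 1.
theta = np.linspace(0.0, np.pi, 50)
points = np.column_stack([np.cos(theta), np.sin(theta)])
euclid, geo = geodesic_distances(points)

# The two end points are 2 apart in a straight line, but close to pi
# apart when the distance is measured along the curve.
print(euclid[0, -1], geo[0, -1])
```

On this toy curve the straight-line distance understates how far apart the end points are along the manifold, which is the situation the nonlinear schemes above are designed to handle.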
NLDR schemes like LLE determine the neighbor-
ing locations on the manifold (via Euclidean distance)
and map the neighborhood associated with each ob-
ject into the reduced dimensional embedding space.
The size of the local neighborhood within which LLE
assumes local linearity is however determined by a
free parameter κ. Optimal estimation of κ is still an
open problem. C-LLE, a new NLDR scheme presented in
(Tiwari et al., 2008), addresses this limitation of
LLE by avoiding the estimation of κ and focusing instead
on optimally determining pairwise object distances in
the low dimensional embedding space.
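The role of κ in standard LLE can be made concrete with a minimal sketch following the usual formulation of Roweis and Saul (2000). It illustrates plain LLE only, not C-LLE. The names lle_weights and lle_embed, the regularization constant, the toy data, and the value kappa = 5 are assumptions introduced here for illustration.

```python
import numpy as np

def lle_weights(X, kappa, reg=1e-3):
    """Weights that reconstruct each row of X from its kappa nearest
    neighbors (Euclidean). Assumes the rows of X are distinct."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        # Distances from point i to every point; index 0 of the
        # sorted order is the point itself, so it is skipped.
        dist = np.linalg.norm(X - X[i], axis=1)
        neighbors = np.argsort(dist)[1:kappa + 1]
        # Local Gram matrix of the neighbors, centered on point i.
        Z = X[neighbors] - X[i]
        G = Z @ Z.T
        # Regularize: G is singular when kappa exceeds the data dimension.
        G += reg * np.trace(G) * np.eye(kappa)
        w = np.linalg.solve(G, np.ones(kappa))
        W[i, neighbors] = w / w.sum()  # weights sum to one
    return W

def lle_embed(X, kappa, n_components=1):
    """Embed X using the bottom eigenvectors of (I - W)^T (I - W)."""
    W = lle_weights(X, kappa)
    I = np.eye(len(X))
    M = (I - W).T @ (I - W)
    vals, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    # Skip the first (near-constant) eigenvector, keep the next ones.
    return vecs[:, 1:n_components + 1]

# Assumed toy data: 50 points on a half circle, embedded in 1-D.
theta = np.linspace(0.0, np.pi, 50)
X = np.column_stack([np.cos(theta), np.sin(theta)])
W = lle_weights(X, kappa=5)
Y = lle_embed(X, kappa=5)
print(W.sum(axis=1)[:3], Y.shape)
```

Rerunning lle_embed with a different kappa generally yields a different embedding, which is why the choice of κ matters for standard LLE.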
Another limitation of LLE is that it traditionally
employs the Euclidean distance measure to deter-
mine neighbors within local patches on the manifold.
While the Euclidean distance measure is appropriate
for measuring the distance between objects character-
ized by discrete attributes, it is less appropriate for
measuring pulse signal similarity. Non-Euclidean dis-
tance measures such as mutual information (MI), en-
tropy correlation coefficient (ECC), and the relative
entropy (RE) have been shown to be more appropri-
ate for measuring the similarity between signals com-
pared to the L2 norm (Tononi et al., 1996). In this
paper, the C-LLE algorithm is employed in conjunc-
tion with such signal similarity measures as MI, ECC,
and RE to embed Ayurvedic pulse signals in a lower-
dimensional embedding space. Prior to embedding,
the pulse signals are first aligned with respect to each
other so that the pulse peaks for the different studies
are in concordance. A consensus clustering (Strehl
and Ghosh, 2002) algorithm is then employed to dis-
criminate between the normal and diseased pulse sig-
nals in the lower dimensional embedding space. The
major contributions of this work are:
• To the best of our knowledge, this is the first
CAD system for classification and analysis of tra-
ditional Indian Ayurvedic pulse medicine.
• Our CAD approach employs the C-LLE algorithm
that we have previously shown to outperform LLE
and Isomap.
• We introduce the use of non-Euclidean distance
measures (MI, ECC, RE) geared specifically
to determining signal similarity for identifying