set under analysis, together with a conjunctive characterization of each cluster. Distances for comparing distributions are considered. Experiments with real data allowing for comparison with alternative methods are planned.
The next step will consist of implementing a procedure for revising the condition that induces the cluster chosen for splitting at each step, so as to improve the obtained clustering (along the same lines as in (Chavent, 1998)). Also, the hierarchy may be indexed, so that a dendrogram is obtained. Finally, the complexity of the computation of the intra-cluster dispersion may be reduced by taking into account the order between the cut values I_jℓ. These developments will be the subject of our future research.
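As an illustration of this last point only (this is not the algorithm proposed in this paper), the following minimal Python sketch shows the generic idea of exploiting the order between candidate cut values: for ordered scalar data and a squared-deviation dispersion, prefix sums let all candidate bipartitions be scored incrementally instead of recomputing each group's dispersion from scratch. The function name split_dispersions and the use of NumPy are assumptions made for the illustration.

import numpy as np

def split_dispersions(x):
    # For a 1-D array of values, return the total within-group dispersion
    # (sum of squared deviations from the group mean) of every bipartition
    # {x[:k], x[k:]}, k = 1, ..., n-1, after sorting. Prefix sums make the
    # sweep over the ordered candidate cuts O(n) overall, rather than
    # recomputing each group's dispersion for every cut.
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    s = np.cumsum(x)        # prefix sums
    s2 = np.cumsum(x * x)   # prefix sums of squares
    k = np.arange(1, n)     # cut after the k-th smallest value
    # dispersion of a group = sum(x^2) - (sum(x))^2 / group size
    left = s2[k - 1] - s[k - 1] ** 2 / k
    right = (s2[-1] - s2[k - 1]) - (s[-1] - s[k - 1]) ** 2 / (n - k)
    return left + right

# The best cut value is the one minimizing the total dispersion.
d = split_dispersions([1.0, 1.2, 0.9, 5.1, 5.3, 4.8])
print(d.argmin() + 1, d.min())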
REFERENCES
Billard, L. and Diday, E. (2006). Symbolic Data Analysis:
Conceptual Statistics and Data Mining. Wiley.
Bock, H.-H. and Diday, E. (2000). Analysis of Symbolic
Data. Springer, Berlin-Heidelberg.
Boley, D. L. (1998). Principal direction divisive parti-
tioning. Data Mining and Knowledge Discovery,
2(4):325–344.
Breiman, L., Friedman, J. H., Olshen, R. A., and Stone,
C. J. (1984). Classification and Regression Trees.
Wadsworth, Belmont, CA.
Brito, P. (1994). Use of pyramids in symbolic data analysis.
In Diday, E. et al., editors, New Approaches in Clas-
sification and Data Analysis, pages 378–386, Berlin-
Heidelberg. Springer.
Brito, P. (1995). Symbolic objects: order structure and
pyramidal clustering. Annals of Operations Research,
55:277–297.
Brito, P. (1998). Symbolic clustering of probabilistic
data. In Rizzi, A. et al., editors, Advances in Data
Science and Classification, pages 385–389, Berlin-
Heidelberg. Springer.
Chavent, M. (1998). A monothetic clustering method. Pat-
tern Recognition Letters, 19(11):989–996.
Chavent, M. (2000). Criterion-based divisive clustering
for symbolic objects. In Bock, H.-H. and Diday, E.,
editors, Analysis of Symbolic Data, pages 299–311,
Berlin-Heidelberg. Springer.
Chavent, M., De Carvalho, F. A. T., Lechevallier, Y., and
Verde, R. (2006). New clustering methods for interval
data. Computational Statistics, 21(2):211–229.
Chavent, M., Lechevallier, Y., and Briant, O. (2007).
DIVCLUS-T: A monothetic divisive hierarchical
clustering method. Computational Statistics & Data Analysis, 52(2):687–701.
Ciampi, A. (1994). Classification and discrimination: the
RECPAM approach. In Dutter, R. and Grossmann, W.,
editors, Proc. COMPSTAT’94, pages 129–147. Phys-
ica Verlag.
De Carvalho, F. A. T., Brito, P., and Bock, H.-H. (2006).
Dynamic clustering for interval data based on L2 distance. Computational Statistics, 21(2):231–250.
De Carvalho, F. A. T., Csernel, M., and Lechevallier, Y.
(2009). Clustering constrained symbolic data. Pattern
Recognition Letters, 30(11):1037–1045.
De Carvalho, F. A. T. and De Souza, R. M. C. R. (2010).
Unsupervised pattern recognition models for mixed
feature-type symbolic data. Pattern Recognition Let-
ters, 31(5):430–443.
De Souza, R. M. C. R. and De Carvalho, F. A. T. (2004).
Clustering of interval data based on city-block dis-
tances. Pattern Recognition Letters, 25(3):353–365.
Dhillon, I. S., Mallela, S., and Kumar, R. (2003). A divisive
information-theoretic feature clustering algorithm for
text classification. Journal of Machine Learning Re-
search, 3:1265–1287.
Diday, E. and Noirhomme-Fraiture, M. (2008). Symbolic
Data Analysis and the SODAS Software. Wiley.
Fang, H. and Saad, Y. (2008). Farthest centroids divisive
clustering. In Proc. ICMLA, pages 232–238.
Gowda, K. C. and Krishna, G. (1978). Disaggregative clus-
tering using the concept of mutual nearest neighbor-
hood. IEEE Transactions on Systems, Man, and Cybernetics, 8:888–895.
Hardy, A. and Baune, J. (2007). Clustering and validation of
interval data. In Brito, P. et al., editors, Selected Con-
tributions in Data Analysis and Classification, pages
69–82, Heidelberg. Springer.
Hardy, A. and Kasaro, N. (2009). A new clustering method
for interval data. Mathématiques et Sciences Humaines, 187:79–91.
Irpino, A. and Verde, R. (2006). A new Wasserstein based
distance for the hierarchical clustering of histogram
symbolic data. In Batagelj, V. et al., editors, Proc.
IFCS 2006, pages 185–192, Heidelberg. Springer.
Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups
in Data. Wiley, New York.
Lance, G. N. and Williams, W. T. (1968). Note on a new
information statistic classification program. The Com-
puter Journal, 11:195–197.
MacNaughton-Smith, P. (1964). Dissimilarity analysis: A
new technique of hierarchical subdivision. Nature,
202:1034–1035.
Michalski, R. S., Diday, E., and Stepp, R. (1981). A
recent advance in data analysis: Clustering objects
into classes characterized by conjunctive concepts. In
Kanal, L. N. and Rosenfeld, A., editors, Progress in
Pattern Recognition, pages 33–56. Springer.
Michalski, R. S. and Stepp, R. (1983). Learning from ob-
servations: Conceptual clustering. In Michalski, R. S.
et al., editors, Machine Learning: An Artificial Intelli-
gence Approach, pages 163–190. Morgan Kaufmann.
Noirhomme-Fraiture, M. and Brito, P. (2011). Far beyond
the classical data models: Symbolic data analysis. Sta-
tistical Analysis and Data Mining, 4(2):157–170.
Quinlan, J. R. (1986). Induction of decision trees. Machine
Learning, 1:81–106.
Sneath, P. H. and Sokal, R. R. (1973). Numerical Taxonomy.
Freeman, San Francisco.
Williams, W. T. and Lambert, J. M. (1959). Multivariate
methods in plant ecology. Journal of Ecology, 47:83–101.