8 CONCLUSIONS
The work proposed and developed here aims to
contribute to a simpler, more productive and more
effective exploitation of the potential of WUM.
Practice shows that it is often more efficient to
solve a problem by starting from a tested, successful
solution to a similar previous situation than to
generate the entire solution from scratch. This is
particularly true in the DM and WUM domains, where
recurrent problems are quite common. To achieve this
aim, we implemented a system, essentially founded on
the CBR paradigm, which is intended to suggest the
most suitable mining plans for a clickstream data
analysis problem, given its high-level description.
In this paper we described the similarity
assessment approach followed within the retrieval
process in order to cope with the multi-relational
case representation. Structured representation and
similarity assessment over complex data are
important issues for a growing variety of application
domains. It is well known that there is a trade-off
between the expressiveness of the representation
language and the efficiency (complexity) of the
learning method. The strategy of extending
distance-based propositional methods with structured
and typed representations, which simplify the
modelling of the problem, and of taking the features
and their properties into account in the similarity
measures is advantageous. It is simple and makes it
possible to benefit from the research on these
methods and from their efficiency, while exploiting
the greater expressiveness of such representations.
Since this strategy suits our current demands, we
adopted it to handle the issues we faced.
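To make the strategy concrete, the following minimal sketch shows one common way of extending a distance-based propositional method with typed features: each feature is compared by a local similarity function chosen according to its type, and the results are combined by a weighted average. It is an illustration only, not the implemented system; the feature names (`n_sessions`, `goal`), the value range and the weights are hypothetical.

```python
from typing import Any, Callable, Dict


def sim_numeric(a: float, b: float, value_range: float) -> float:
    """Local similarity for numeric features: 1 minus the normalised distance."""
    if value_range <= 0:
        return 1.0 if a == b else 0.0
    return max(0.0, 1.0 - abs(a - b) / value_range)


def sim_symbolic(a: str, b: str) -> float:
    """Local similarity for unordered symbolic features: exact match only."""
    return 1.0 if a == b else 0.0


def global_similarity(
    query: Dict[str, Any],
    case: Dict[str, Any],
    local_sims: Dict[str, Callable[[Any, Any], float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of the per-feature local similarities."""
    total_weight = sum(weights[f] for f in local_sims)
    if total_weight == 0:
        return 0.0
    weighted = sum(
        weights[f] * local_sims[f](query[f], case[f]) for f in local_sims
    )
    return weighted / total_weight


# Hypothetical problem description and stored case.
query = {"n_sessions": 1000, "goal": "navigation_patterns"}
case = {"n_sessions": 1500, "goal": "navigation_patterns"}
local_sims = {
    "n_sessions": lambda a, b: sim_numeric(a, b, value_range=10000),
    "goal": sim_symbolic,
}
weights = {"n_sessions": 1.0, "goal": 3.0}

print(global_similarity(query, case, local_sims, weights))  # approximately 0.9875
```

Keeping the local measures separate from the aggregation is what lets the feature types and properties vary without changing the retrieval step itself.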
We considered specifically the issue of
measuring the similarity between sets of elements.
There are multiple proposals in the literature to deal
with this issue, but there is no ideal, general
approach that suits every purpose, in particular
with respect to the intended semantics and
properties. Consequently, we explored a number of
existing similarity measures and extended one of
them to better fit our purposes. This extension
gave rise to two measures suited to the similarity
assessment of features with different properties.
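The lack of a single ideal measure can be seen on a small example. The sketch below implements two generic, well-known ways of comparing sets, the Jaccard coefficient and a symmetric average of best matches, and applies them to the same pair of sets. These are not the two measures derived in this work; the function names and the example values are assumptions chosen for illustration.

```python
from typing import Callable, Hashable, Set


def jaccard(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Share of common elements among all distinct elements."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_match_average(
    a: Set[Hashable],
    b: Set[Hashable],
    elem_sim: Callable[[Hashable, Hashable], float],
) -> float:
    """Average, over the elements of both sets, of the similarity to the
    closest element of the other set."""
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0
    a_to_b = sum(max(elem_sim(x, y) for y in b) for x in a)
    b_to_a = sum(max(elem_sim(y, x) for x in a) for y in b)
    return (a_to_b + b_to_a) / (len(a) + len(b))


def exact(x: Hashable, y: Hashable) -> float:
    """Element similarity that only recognises identical elements."""
    return 1.0 if x == y else 0.0


# Hypothetical sets of mining methods attached to two cases.
s1 = {"clustering", "association_rules"}
s2 = {"clustering"}

print(jaccard(s1, s2))                    # 0.5
print(best_match_average(s1, s2, exact))  # 2/3, about 0.667
```

The two measures disagree on the same input, and the second one can also use a graded element similarity instead of `exact`, which is why the choice has to follow the semantics and properties of each feature.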
We are currently working on the construction of
more cases, comprising WUM processes of higher
complexity. Afterwards, a more detailed and
systematic experimental evaluation of the system will
be necessary. Moreover, one future direction of work
concerns improving the assignment of weights, based
on a comprehensive evaluation of the relevance and
discriminating power of the features.
ACKNOWLEDGEMENTS
The work of Cristina Wanzeller has been supported
by a grant from PRODEP (Acção 5.3, concurso nº 02/2003).
ICEIS 2007 - International Conference on Enterprise Information Systems