AN ONTOLOGY DRIVEN DATA MINING PROCESS
Laurent Brisson, Martine Collard
2008
Abstract
This paper addresses knowledge integration in a data mining process. We propose modeling domain knowledge during the business understanding and data understanding steps in order to build an ontology-driven information system (ODIS). We present the KEOPS methodology, which is based on this approach. In KEOPS, the ODIS is dedicated to data mining tasks. It allows expert knowledge to be used for efficient data selection, data preparation, and model interpretation. We detail each of these ontology-driven steps and define a part-way interestingness measure that integrates both objective and subjective criteria in order to evaluate model relevance according to expert knowledge.
Paper Citation
in Harvard Style
Brisson L. and Collard M. (2008). AN ONTOLOGY DRIVEN DATA MINING PROCESS. In Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8111-37-1, pages 54-61. DOI: 10.5220/0001697400540061
in Bibtex Style
@conference{iceis08,
author={Laurent Brisson and Martine Collard},
title={AN ONTOLOGY DRIVEN DATA MINING PROCESS},
booktitle={Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2008},
pages={54-61},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001697400540061},
isbn={978-989-8111-37-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Tenth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - AN ONTOLOGY DRIVEN DATA MINING PROCESS
SN - 978-989-8111-37-1
AU - Brisson L.
AU - Collard M.
PY - 2008
SP - 54
EP - 61
DO - 10.5220/0001697400540061