RULE EXTRACTION FROM MEDICAL DATA WITHOUT DISCRETIZATION OF NUMERICAL ATTRIBUTES

Juan L. Domínguez-Olmedo, Jacinto Mata, Victoria Pachón, Manuel J. Maña

Abstract

Association rule mining is a popular technique for finding associations between the attributes of a dataset. When deterministic algorithms are used and the attributes take numerical values, the usual approach is to discretize them first by defining suitable intervals. However, the discretization can notably affect the quality of the rules generated. This work presents a method based on a deterministic exploration of the interval search space, without prior discretization of the numerical attributes. The method has been applied to medical data from an atherosclerosis study. The quality of the rules obtained suggests that it is a valid alternative for this kind of rule extraction.
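To make the idea of searching intervals directly more concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not specify. It assumes a toy dataset, a rule of the form "age in [a1, a2] AND blood pressure in [b1, b2] implies high risk", and an exhaustive deterministic scan over intervals whose endpoints are observed values. The records and the names `RECORDS`, `candidate_intervals`, `rule_quality` and `best_rule` are all hypothetical.

```python
# Hypothetical toy records: (age, systolic_bp, high_risk). Not from the study.
RECORDS = [
    (38, 118, False),
    (42, 125, False),
    (47, 138, True),
    (51, 142, True),
    (55, 150, True),
    (60, 128, False),
]


def candidate_intervals(values):
    """All closed intervals [lo, hi] whose endpoints are observed values."""
    points = sorted(set(values))
    return [(lo, hi) for i, lo in enumerate(points) for hi in points[i:]]


def rule_quality(records, age_iv, bp_iv):
    """Support and confidence of: age in age_iv AND bp in bp_iv -> high_risk."""
    covered = [
        r for r in records
        if age_iv[0] <= r[0] <= age_iv[1] and bp_iv[0] <= r[1] <= bp_iv[1]
    ]
    hits = sum(1 for r in covered if r[2])
    support = hits / len(records)          # fraction of all records matching the whole rule
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


def best_rule(records, min_support):
    """Exhaustive, deterministic scan; assumed here only for illustration."""
    best = None
    for age_iv in candidate_intervals([r[0] for r in records]):
        for bp_iv in candidate_intervals([r[1] for r in records]):
            support, confidence = rule_quality(records, age_iv, bp_iv)
            if support < min_support:
                continue
            key = (confidence, support)
            # Strict comparison: the first rule reaching the best score is kept.
            if best is None or key > best[0]:
                best = (key, age_iv, bp_iv)
    return best


if __name__ == "__main__":
    print(best_rule(RECORDS, min_support=0.5))
```

No bin boundaries are fixed in advance here: the interval limits are chosen by the search itself, which is the property the abstract contrasts with prior discretization. The exhaustive scan grows quickly with the number of distinct values and attributes, so it is only suitable for illustration.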

Paper Citation


in Harvard Style

Domínguez-Olmedo J. L., Mata J., Pachón V. and Maña M. J. (2012). RULE EXTRACTION FROM MEDICAL DATA WITHOUT DISCRETIZATION OF NUMERICAL ATTRIBUTES. In Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2012), ISBN 978-989-8425-88-1, pages 397-400. DOI: 10.5220/0003784603970400


in Bibtex Style

@conference{healthinf12,
author={Juan L. Domínguez-Olmedo and Jacinto Mata and Victoria Pachón and Manuel J. Maña},
title={RULE EXTRACTION FROM MEDICAL DATA WITHOUT DISCRETIZATION OF NUMERICAL ATTRIBUTES},
booktitle={Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2012)},
year={2012},
pages={397-400},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003784603970400},
isbn={978-989-8425-88-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2012)
TI - RULE EXTRACTION FROM MEDICAL DATA WITHOUT DISCRETIZATION OF NUMERICAL ATTRIBUTES
SN - 978-989-8425-88-1
AU - Domínguez-Olmedo, Juan L.
AU - Mata, Jacinto
AU - Pachón, Victoria
AU - Maña, Manuel J.
PY - 2012
SP - 397
EP - 400
DO - 10.5220/0003784603970400