
 
Finally, the “additive” effect of several conditions
on patient death has been analysed. For this purpose,
the first of the rules shown in Table 3 has been
chosen from the rules generated previously. It
associates alcohol consumption, tobacco
consumption and systolic blood pressure with
patient death. This rule states that “65% of the
patients with an alcohol consumption in [1.12,
1.69], smoking more than 20 cigarettes/day and with
a systolic blood pressure in [140, 220] had died”.
To compare the effect of those conditions, alone
and in pairs, rules whose antecedents contain the
corresponding conditions have been selected; their
quality measures are also shown in Table 3.
An analysis of the rules indicates that, although
the condition on alcohol consumption is less
correlated with death than the other two conditions
evaluated (its lift value of 1 indicates
independence from death), adding it to the
combination of tobacco consumption and blood
pressure increases the confidence from 0.56 to
0.65.
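As an illustration of how such a comparison can be
computed, the following Python sketch evaluates
support, confidence and lift for single conditions,
a pair of conditions and the full antecedent. The
records are invented toy data, not the STULONG data,
and the function name rule_measures and the field
names are assumptions made for this example; only
the three interval conditions are taken from the
rule discussed above.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Condition = Callable[[Record], bool]

# Invented toy records (NOT the STULONG data):
# (alcohol, cigarettes/day, systolic blood pressure, died flag)
rows = [
    (1.2, 25, 150, 1),
    (1.5, 30, 160, 1),
    (1.3, 22, 145, 0),
    (0.5, 25, 150, 1),
    (0.4, 30, 170, 0),
    (0.3, 21, 180, 0),
    (1.4, 5, 120, 0),
    (1.6, 0, 130, 0),
    (0.2, 0, 120, 1),
    (0.1, 10, 125, 0),
]
records: List[Record] = [
    {"alcohol": a, "cigarettes": c, "systolic": s, "dead": d}
    for a, c, s, d in rows
]

# Conditions of the rule discussed in the text.
def alcohol(r: Record) -> bool:
    return 1.12 <= r["alcohol"] <= 1.69

def tobacco(r: Record) -> bool:
    return r["cigarettes"] > 20

def pressure(r: Record) -> bool:
    return 140 <= r["systolic"] <= 220

def dead(r: Record) -> bool:
    return r["dead"] == 1

def rule_measures(records: List[Record],
                  antecedent: List[Condition],
                  consequent: Condition) -> Dict[str, float]:
    """Support, confidence and lift of the rule antecedent -> consequent."""
    n = len(records)
    covered = [r for r in records if all(c(r) for c in antecedent)]
    n_both = sum(1 for r in covered if consequent(r))
    n_cons = sum(1 for r in records if consequent(r))
    confidence = n_both / len(covered) if covered else 0.0
    # Lift compares the rule confidence with the base rate of the consequent.
    lift = confidence / (n_cons / n) if n_cons else 0.0
    return {"support": n_both / n, "confidence": confidence, "lift": lift}

antecedents = {
    "alcohol": [alcohol],
    "tobacco": [tobacco],
    "pressure": [pressure],
    "tobacco + pressure": [tobacco, pressure],
    "alcohol + tobacco + pressure": [alcohol, tobacco, pressure],
}
for name, conditions in antecedents.items():
    m = rule_measures(records, conditions, dead)
    print(f"{name}: conf={m['confidence']:.2f}, lift={m['lift']:.2f}")
```

On this toy data the alcohol condition alone has a
lift of 1, yet adding it to the tobacco and blood
pressure conditions raises the confidence from 0.50
to about 0.67, which is the same qualitative pattern
as the one reported for the selected rule.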
5 CONCLUSIONS 
In this work, association rules have been extracted
from medical data from an atherosclerosis study.
Association rules can express previously unknown
knowledge present in data, in the form of
relationships between the values of the variables.
The method employed is based on a 
deterministic approach that generates association 
rules without a previous discretization of the 
numerical attributes. Discretization can notably
affect the quality of the rules generated, and it is
usually difficult to know which discretization
technique works best with a deterministic algorithm
on a particular dataset.
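To illustrate why the choice of discretization
matters, the following Python sketch applies two
common schemes, equal-width and equal-frequency
binning, to the same numeric attribute and compares
the confidence of a rule built on the highest
interval. The values are invented toy data, and the
function names are assumptions made for this
example; the sketch is not the method used in this
work, which avoids a previous discretization.

```python
from typing import List, Sequence

def equal_width_edges(values: Sequence[float], k: int) -> List[float]:
    """Edges of k intervals of identical width over the value range."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + i * step for i in range(k + 1)]

def equal_frequency_edges(values: Sequence[float], k: int) -> List[float]:
    """Edges of k intervals holding roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    inner = [ordered[(i * n) // k] for i in range(1, k)]
    return [ordered[0]] + inner + [ordered[-1]]

def top_bin_confidence(values: Sequence[float],
                       died: Sequence[int],
                       edges: Sequence[float]) -> float:
    """Confidence of the rule 'value in highest interval -> died'."""
    lower = edges[-2]  # lower bound of the highest interval
    covered = [d for v, d in zip(values, died) if v >= lower]
    return sum(covered) / len(covered)

# Invented toy data (NOT the STULONG data): systolic pressure and death flag.
systolic = [110, 115, 120, 125, 130, 135, 140, 150, 160, 220]
died = [0, 1, 0, 0, 0, 0, 0, 1, 1, 1]

width_edges = equal_width_edges(systolic, 2)
freq_edges = equal_frequency_edges(systolic, 2)
print("equal width    :", width_edges,
      top_bin_confidence(systolic, died, width_edges))
print("equal frequency:", freq_edges,
      top_bin_confidence(systolic, died, freq_edges))
```

With these toy values, equal-width binning yields
the highest interval [165, 220], which covers a
single patient (confidence 1.0), whereas
equal-frequency binning yields [135, 220], which
covers five patients (confidence 0.6). The same
attribute therefore leads to rules of different
quality depending on the scheme.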
A variety of rules with good values of their
quality measures has been obtained, which seems to
support the method employed as a valid way to
generate association rules without a previous
discretization of the numerical attributes.
In addition, a selected rule has been analysed in
detail. The rule associates several conditions
with the death of the patients under study.
ACKNOWLEDGEMENTS 
This work was partially funded by the Spanish 
Ministry of Science and Innovation, the Spanish 
Government Plan E and the European Union through 
ERDF (TIN2009-14057-C03-03). 
REFERENCES 
Agrawal, R., Imielinski, T., Swami, A., 1993. Mining 
Association Rules between Sets of Items in Large 
Databases. In ACM SIGMOD ICMD, pp. 207-216. 
ACM Press. 
Bodon, F., 2005. A Trie-based APRIORI Implementation 
for Mining Frequent Item Sequences. In 1st 
International Workshop on Open Source Data Mining: 
Frequent Pattern Mining Implementations, Chicago, 
Illinois, pp. 56-65. ACM Press.
Borgelt, C., 2003. Efficient Implementations of Apriori 
and Eclat. In Workshop on Frequent Itemset Mining 
Implementations. CEUR Workshop Proc. 90, Florida. 
Boudík, F., Tomečková, M., Bultas, J., 2004. STULONG 
medical project. http://euromise.vse.cz/challenge2004. 
Prague. 
Brin, S., Motwani, R., Ullman, J.D., Tsur, S., 1997. 
Dynamic Itemset Counting and Implication Rules for 
Market Basket Data. In Proc. of the ACM SIGMOD 
1997, pp. 265-276. 
Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., 1996. From 
Data Mining to Knowledge Discovery in Databases. 
AI Magazine, Vol. 17, pp. 37-54. 
Han, J., Kamber, M., 2006. Data Mining: Concepts and 
Techniques. Morgan Kaufmann, San Francisco. 
Lee, C.-H., 2007. A Hellinger-based Discretization 
Method for Numeric Attributes in Classification 
Learning. Knowledge-Based Systems, 20(4), 419-425. 
Liu, H., Hussain, F., Tan, C., Dash, M., 2002. 
Discretization: An Enabling Technique. Data Mining 
and Knowledge Discovery, 6(4), 393-423. 
Salleb, A., Turmeaux, T., Vrain, C., Nortet, C., 2004. 
Mining Quantitative Association Rules in a 
Atherosclerosis Dataset. Contribution to the PKDD 
Discovery Challenge 2004,
http://www.univ-orleans.fr/lifo/Members/salleb/Challenge2004.
Srikant, R., Agrawal, R., 1996. Mining Quantitative 
Association Rules in Large Relational Tables. In Proc. 
of the ACM SIGMOD 1996, pp. 1-12. 
Tsai, C.-J., Lee, C.-I., Yang, W.-P., 2008. A Discretization 
Algorithm Based on Class-Attribute Contingency 
Coefficient. Information Sciences, 178(3), 714-731.
HEALTHINF 2012 - International Conference on Health Informatics