association rules can find all the less complex pre-
dictive rules from a data set given a proper setting
of the parameters. These results indicate that while
GEARM and GEADT can both detect gene-gene interactions,
GEARM can do so more efficiently and has
higher power to detect two-locus interactions under
either definition of power.
In spite of the good results GEARM has yielded,
the approach is still under study to improve its perfor-
mance. More tests will be performed with different
parameter sizes. We are also assessing an approach
for rule pruning to generate better results. As such,
we aim, on the one hand, to achieve even better
prediction accuracy and more power in the detection
of epistasis and, on the other hand, to compare our results
with other successful approaches in genetic epidemiology
on simulated and real data.
4 CONCLUSIONS
In this paper we have presented a new approach that
uses Grammatical Evolution to discover a set of asso-
ciation rules. GEARM provides an efficient mecha-
nism for the classification of individuals and the de-
tection of gene-gene interactions in the presence or
absence of main effects. It has been tested on simulated
data sets with different models. Our proposal has
yielded a reduced set of association rules. Also, with
this small association rule set, we have managed to
cover all the SNPs in the dataset.
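
As an illustration of what covering the SNPs with a small rule set means, the following minimal sketch computes the standard support and confidence of one association rule and the fraction of SNPs that appear in a rule set. The rule encoding (a mapping from SNP name to genotype, plus a class label), the function names and the toy data are assumptions made for this example; they are not taken from the GEARM implementation.

```python
from typing import Dict, List, Tuple

# Assumed encoding: the antecedent maps SNP names to a required genotype
# (0, 1 or 2), and the consequent is the predicted class (1 = case, 0 = control).
Rule = Tuple[Dict[str, int], int]


def matches(antecedent: Dict[str, int], sample: Dict[str, int]) -> bool:
    """True if the sample carries every genotype the antecedent requires."""
    return all(sample.get(snp) == genotype for snp, genotype in antecedent.items())


def support_confidence(rule: Rule, samples: List[Dict[str, int]],
                       labels: List[int]) -> Tuple[float, float]:
    """Support = P(antecedent and consequent); confidence = P(consequent | antecedent)."""
    antecedent, consequent = rule
    covered = [y for x, y in zip(samples, labels) if matches(antecedent, x)]
    both = sum(1 for y in covered if y == consequent)
    support = both / len(samples) if samples else 0.0
    confidence = both / len(covered) if covered else 0.0
    return support, confidence


def snp_coverage(rules: List[Rule], all_snps: List[str]) -> float:
    """Fraction of the data set's SNPs that occur in at least one antecedent."""
    used = set()
    for antecedent, _ in rules:
        used.update(antecedent)
    return len(used & set(all_snps)) / len(all_snps)


# Toy data, invented for illustration only.
samples = [
    {"SNP1": 1, "SNP2": 2, "SNP3": 0},
    {"SNP1": 1, "SNP2": 2, "SNP3": 1},
    {"SNP1": 0, "SNP2": 2, "SNP3": 0},
    {"SNP1": 1, "SNP2": 0, "SNP3": 2},
]
labels = [1, 0, 0, 1]
rule = ({"SNP1": 1, "SNP2": 2}, 1)
rules = [rule, ({"SNP3": 0}, 0)]

print(support_confidence(rule, samples, labels))
print(snp_coverage(rules, ["SNP1", "SNP2", "SNP3"]))
```

A coverage of 1.0 under this measure only says that every SNP is mentioned by some rule; it says nothing about the quality of the individual rules, which is why support and confidence are reported separately.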
In spite of the good results we have obtained,
the approach is still under study and our work is
in progress to improve its performance. We aim to
achieve more power in the detection of epistasis, to apply
it to real data and to compare the results it yields
with other successful approaches in genetic epidemi-
ology. We expect that GEARM can do so more effi-
ciently than other techniques. We thus see GEARM
as a promising new approach for human genetics.
BIOINFORMATICS 2014 - International Conference on Bioinformatics Models, Methods and Algorithms