Authors:
Hao Wan; Gregory Barrett; Carolina Ruiz and Elizabeth F. Ryder
Affiliation:
Worcester Polytechnic Institute, United States
Keyword(s):
Gene Expression, C. elegans, Transcription Factor, Association Rule, Position Weight Matrix
Related Ontology Subjects/Areas/Topics:
Algorithms and Software Tools; Bioinformatics; Biomedical Engineering; Data Mining and Machine Learning; Genomics and Proteomics; Sequence Analysis
Abstract:
Gene expression in different cells is regulated by different sets of transcription factors. How regulatory regions of DNA encode the combinations of transcription factors required to achieve specificity of expression is a long-standing problem in biology. In the model system C. elegans, gene regulatory regions are relatively compact, and much work has been done to describe gene expression patterns in a number of cell types. In this work, we collected the promoter regions of genes with known expression patterns in a limited number of neuronal cell types and used position weight matrices to annotate any DNA motifs in the promoters that corresponded to putative binding sites of known C. elegans transcription factors. We used association rule mining to identify rules relating the presence of particular motifs to the expression of particular genes. We used metrics including confidence, support, lift, and p-value to mine and assess rules. We examined how the rules were affected by multiple versus single transcription factors and by the distance from transcription factor binding sites to the start of transcription. The mined association rules were filtered using Benjamini and Hochberg’s approach, and the most interesting rules were selected. We also validated our approach by generating association rules corresponding to gene expression patterns that have already been established in biological research. We conclude that our system allows the identification of interesting putative gene expression rules involving known transcription factors. These rules can be further validated using biological techniques.
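The rule metrics and the Benjamini and Hochberg filtering step named in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions of support, confidence, and lift and the standard step-up form of the Benjamini-Hochberg procedure; the abstract does not give the authors' exact formulas or their p-value test, so p-values are taken as given inputs here. The function names, motif labels, and gene identifiers are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def rule_metrics(
    motifs_by_gene: Dict[str, FrozenSet[str]],
    expressed: Set[str],
    antecedent: FrozenSet[str],
) -> Dict[str, float]:
    """Score the rule 'promoter contains all antecedent motifs -> gene is expressed'.

    Assumed (textbook) definitions, not taken from the paper:
      support    = P(motifs present and expressed)
      confidence = P(expressed | motifs present)
      lift       = confidence / P(expressed)
    """
    n = len(motifs_by_gene)
    has_motifs = {g for g, m in motifs_by_gene.items() if antecedent <= m}
    both = has_motifs & expressed
    support = len(both) / n
    confidence = len(both) / len(has_motifs) if has_motifs else 0.0
    base_rate = sum(1 for g in motifs_by_gene if g in expressed) / n
    lift = confidence / base_rate if base_rate else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return a keep/reject flag per rule using the step-up BH procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    keep = [False] * m
    for idx in order[:cutoff]:
        keep[idx] = True
    return keep


# Hypothetical toy data: motif labels found in each gene's promoter.
toy_promoters = {
    "g1": frozenset({"A", "B"}),
    "g2": frozenset({"A", "B"}),
    "g3": frozenset({"A"}),
    "g4": frozenset({"B"}),
    "g5": frozenset(),
}
toy_expressed = {"g1", "g2", "g4"}  # genes expressed in the cell type of interest

pair_rule = rule_metrics(toy_promoters, toy_expressed, frozenset({"A", "B"}))
single_rule = rule_metrics(toy_promoters, toy_expressed, frozenset({"A"}))
kept = benjamini_hochberg([0.001, 0.03, 0.035, 0.2], alpha=0.05)
```

In this toy data the two-motif rule has higher confidence than the single-motif rule at the same support, which is the kind of multiple-versus-single comparison the abstract mentions. The step-up loop keeps every rule ranked at or below the largest passing rank, even if an intermediate p-value misses its own threshold.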