Statistical Identification of Co-regulatory Gene Modules using Multiple ChIP-Seq Experiments

Xi Chen, Xu Shi, Ayesha N. Shajahan-Haq, Leena Hilakivi-Clarke, Robert Clarke, Jianhua Xuan


ChIP-Seq experiments provide accurate measurements of the regulatory roles of transcription factors (TFs) under specific conditions. Downstream target genes can be detected by analyzing the enriched TF binding sites (TFBSs) in genes’ promoter regions. The location and statistical information of TFBSs make it possible to evaluate the relative importance of each binding event. Based on the assumption that the TFBSs of one ChIP-Seq experiment follow the same specific location distribution, a statistical model is first proposed that uses both the location and the significance of peaks to weight target genes. We then merge the genes’ binding scores from different TFs into a weighted binding matrix. A Markov Chain Monte Carlo (MCMC) based approach is applied to the binding matrix for co-regulatory module identification. We demonstrate the efficiency of our statistical model on an ER-α ChIP-Seq dataset and further identify co-regulatory modules by using eleven breast cancer-related TFs from ENCODE ChIP-Seq datasets. The results show that the TFs in each module regulate common high-scoring target genes, that the association of TFs is biologically meaningful, and that the functional roles of TFs and target genes are consistent.
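As a rough illustration of the weighted binding matrix described above, the following Python sketch scores each gene by combining peak significance with distance to the transcription start site, then arranges the scores as a gene-by-TF matrix. The abstract does not give the form of the location distribution, so the exponential distance decay, the `decay_bp` parameter, the function names, and the peak records are all assumptions made for illustration; they are not the authors' fitted model, and the MCMC module search is not shown.

```python
import math
from collections import defaultdict

# Hypothetical peak records: (tf, gene, distance_to_tss_bp, p_value).
# These values are made up for illustration only.
PEAKS = [
    ("ER", "GREB1", 200, 1e-20),
    ("ER", "GREB1", 5000, 1e-5),
    ("ER", "TFF1", 400, 1e-12),
    ("FOXA1", "GREB1", 800, 1e-8),
    ("FOXA1", "XBP1", 1500, 1e-6),
]


def peak_weight(distance_bp, p_value, decay_bp=1000.0):
    """Illustrative weight: significance (-log10 p) damped by distance to the TSS.

    The exponential decay and decay_bp are assumptions, not the paper's model.
    """
    return -math.log10(p_value) * math.exp(-abs(distance_bp) / decay_bp)


def binding_matrix(peaks, decay_bp=1000.0):
    """Sum peak weights per (gene, TF) pair and return (genes, tfs, matrix)."""
    scores = defaultdict(float)
    for tf, gene, dist, p in peaks:
        scores[(gene, tf)] += peak_weight(dist, p, decay_bp)
    genes = sorted({gene for _, gene, _, _ in peaks})
    tfs = sorted({tf for tf, _, _, _ in peaks})
    # Rows are genes, columns are TFs; missing pairs get a score of 0.0.
    matrix = [[scores.get((g, t), 0.0) for t in tfs] for g in genes]
    return genes, tfs, matrix


genes, tfs, matrix = binding_matrix(PEAKS)
for gene, row in zip(genes, matrix):
    print(gene, [round(v, 2) for v in row])
```

In this sketch a gene bound by several TFs with strong, promoter-proximal peaks gets high scores in several columns, which is the kind of pattern a module search over the matrix would look for.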



Paper Citation

in Harvard Style

Chen X., Shi X., Shajahan-Haq A. N., Hilakivi-Clarke L., Clarke R. and Xuan J. (2014). Statistical Identification of Co-regulatory Gene Modules using Multiple ChIP-Seq Experiments. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014) ISBN 978-989-758-012-3, pages 109-116. DOI: 10.5220/0004736801090116
