A GENERALIZED HIDDEN MARKOV MODEL FOR PREDICTION OF CIS-REGULATORY MODULES IN EUKARYOTE GENOMES AND DESCRIPTION OF THEIR INTERNAL STRUCTURE
Anna A. Nikulova, Alexander V. Favorov, Vsevolod Yu. Makeev, Andrey A. Mironov
2012
Abstract
Eukaryotic regulatory regions have been studied extensively because of their importance for gene regulation in higher eukaryotes. However, our understanding of their organization remains incomplete. In particular, accurate in silico methods for their prediction are lacking. Here we present a new HMM-based method for predicting regulatory regions in eukaryotic genomes using position weight matrices of the relevant transcription factors. The method first reveals and then exploits the structure of regulatory regions (preferred arrangements of binding sites) to improve prediction quality and to provide new knowledge about regulatory region organization. We show that our method identifies regulatory regions in eukaryotic genomes with higher quality than other methods. We also demonstrate that the algorithm can reveal structural features of regulatory regions, which could help decipher the mechanisms of transcriptional regulation in higher eukaryotes.
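To make the idea of combining position weight matrices with a generalized HMM more concrete, the following is a minimal sketch and not the authors' model. It assumes a single hypothetical 4-bp PWM, a uniform background, and memoryless transitions; the names `PWM`, `BACKGROUND`, `P_SITE`, and `decode_sites` are all illustrative. A background state emits one nucleotide at a time, a site state emits a whole motif-length segment scored by the PWM, and a Viterbi-style dynamic program picks the most probable segmentation.

```python
import math

# Hypothetical toy PWM for a 4-bp motif: one dict of nucleotide
# probabilities per motif position. Not taken from the paper.
PWM = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]
BACKGROUND = 0.25  # assumed uniform background nucleotide probability
P_SITE = 0.05      # assumed probability of entering the site state


def site_log_prob(segment):
    """Log-probability of a motif-length segment under the PWM."""
    return sum(math.log(col[base]) for col, base in zip(PWM, segment))


def decode_sites(seq):
    """Return start positions of sites in the most probable segmentation.

    Assumes seq contains only the letters A, C, G, T.
    """
    w, n = len(PWM), len(seq)
    best = [0.0] + [-math.inf] * n   # best[i]: best log-prob of seq[:i]
    back = [(0, False)] * (n + 1)    # (segment length, is_site) ending at i
    for i in range(1, n + 1):
        # Option 1: the background state emits a single nucleotide.
        best[i] = best[i - 1] + math.log(1 - P_SITE) + math.log(BACKGROUND)
        back[i] = (1, False)
        # Option 2: the site state emits a whole segment of length w.
        if i >= w:
            cand = best[i - w] + math.log(P_SITE) + site_log_prob(seq[i - w:i])
            if cand > best[i]:
                best[i] = cand
                back[i] = (w, True)
    # Trace back through the chosen segments to collect site starts.
    sites, i = [], n
    while i > 0:
        length, is_site = back[i]
        if is_site:
            sites.append(i - length)
        i -= length
    return sorted(sites)


print(decode_sites("TTTTACGTTTTT"))
```

The segment-emitting site state is what makes the sketch "generalized": a state may emit more than one symbol per visit. A model that captures preferred binding site arrangements, as the abstract describes, would additionally need several site states and learned transitions between them, which this sketch omits.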
Paper Citation
in Harvard Style
Nikulova A. A., Favorov A. V., Makeev V. Yu. and Mironov A. A. (2012). A GENERALIZED HIDDEN MARKOV MODEL FOR PREDICTION OF CIS-REGULATORY MODULES IN EUKARYOTE GENOMES AND DESCRIPTION OF THEIR INTERNAL STRUCTURE. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012) ISBN 978-989-8425-90-4, pages 34-41. DOI: 10.5220/0003735800340041
in Bibtex Style
@conference{bioinformatics12,
author={Anna A. Nikulova and Alexander V. Favorov and Vsevolod Yu. Makeev and Andrey A. Mironov},
title={A GENERALIZED HIDDEN MARKOV MODEL FOR PREDICTION OF CIS-REGULATORY MODULES IN EUKARYOTE GENOMES AND DESCRIPTION OF THEIR INTERNAL STRUCTURE},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)},
year={2012},
pages={34-41},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003735800340041},
isbn={978-989-8425-90-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)
TI - A GENERALIZED HIDDEN MARKOV MODEL FOR PREDICTION OF CIS-REGULATORY MODULES IN EUKARYOTE GENOMES AND DESCRIPTION OF THEIR INTERNAL STRUCTURE
SN - 978-989-8425-90-4
AU - Nikulova, Anna A.
AU - Favorov, Alexander V.
AU - Makeev, Vsevolod Yu.
AU - Mironov, Andrey A.
PY - 2012
SP - 34
EP - 41
DO - 10.5220/0003735800340041