Kundaje, A., Middendorf, M., Gao, F., Wiggins, C., and
Leslie, C. (2005). Combining sequence and time se-
ries expression data to learn transcriptional modules.
IEEE/ACM Trans. Comput. Biol. Bioinform., 2:194–
202.
Lawrence, C. E., Altschul, S. F., Boguski, M. S., Liu, J. S.,
Neuwald, A. F., and Wootton, J. C. (1993). Detecting
subtle sequence signals: a Gibbs Sampling strategy
for multiple alignment. Science, 262:208–214.
Liu, X., Brutlag, D., and Liu, J. (2001). Bioprospector:
discovering conserved DNA motifs in upstream regu-
latory regions of co-expressed genes. In Pac. Symp.
Biocomput., pages 127–138.
Marinescu, V., Kohane, I., and Riva, A. (2005). The MAP-
PER database: a multi-genome catalog of putative
transcription factor binding sites. Nucleic Acids Res.,
33:D91–D97.
Matys, V., Kel-Margoulis, O. V., Fricke, E. et al. (2006).
TRANSFAC® and its module TRANSCompel®:
transcriptional gene regulation in eukaryotes. Nucleic
Acids Res., 34:D108–D110.
Mehldau, G. and Myers, G. (1993). A system for pat-
tern matching applications on biosequences. Comput.
Appl. Biosci., 9:299–314.
Miller, W. (2001). Comparison of genomic DNA se-
quences: solved and unsolved problems. Bioinformat-
ics, 17:391–397.
Nelson, C., Hersh, B., and Carroll, S. B. (2004). The reg-
ulatory content of intergenic DNA shapes genome ar-
chitecture. Genome Biol., 5:R25.
Papatsenko, D. (2007). ClusterDraw web server: a tool to
identify and visualize clusters of binding motifs for
transcription factors. Bioinformatics, 23:1032–1034.
Pierstorff, N., Bergman, C. M., and Wiehe, T. (2006). Iden-
tifying cis–regulatory modules by combining compar-
ative and compositional analysis of DNA. Bioinfor-
matics, 22:2858–2864.
Qin, Z., McCue, L., Thompson, W., Mayerhofer, L.,
Lawrence, C., and Liu, J. (2003). Identification of co-
regulated genes through Bayesian clustering of pre-
dicted regulatory binding sites. Nature Biotechnology,
21(4):435–439.
Rebeiz, M., Reeves, N. L., and Posakony, J. W. (2002).
SCORE: A computational approach to the identifi-
cation of cis–regulatory modules and target genes in
whole–genome sequence data. Proc. Natl. Acad. Sci.
USA, 99(15):9888–9893.
Rijnkels, M., Elnitski, L., Miller, W., and Rosen, J. M.
(2003). Multispecies comparative analysis of a
mammalian–specific genomic domain encoding se-
cretory proteins. Genomics, 82:417–432.
Schneider, T. D., Stormo, G. D., Gold, L., and Ehrenfeucht,
A. (1986). Information content of binding sites on
nucleotide sequences. J. Mol. Biol., 188:415–431.
Schones, D. E., Smith, A. D., and Zhang, M. Q. (2007). Sta-
tistical significance of cis-regulatory modules. BMC
Bioinformatics, 8:19.
Segal, E., Fondufe–Mittendorf, Y., Chen, L., Thastrom, A.,
Field, Y., Moore, I. K., Wang, J.-P. Z., and Widom, J.
(2006). A genomic code for nucleosome positioning.
Nature, 442:772–778.
Sharan, R., Ovcharenko, I., Ben-Hur, A., and Karp, R.
(2003). CREME: a framework for identifying cis–
regulatory modules in human–mouse conserved seg-
ments. Bioinformatics, 19:i283–i291.
Singh, A., Feschotte, C., and Stojanovic, N. (2007). A study
of the repetitive structure and distribution of short mo-
tifs in human genomic sequences. Int. J. Bioinformat-
ics Research and Applications, 3:523–535.
Sinha, S., Schroeder, M., Unnerstall, U., Gaul, U., and
Siggia, E. (2004). Cross–species comparison sig-
nificantly improves genome–wide prediction of cis–
regulatory modules in Drosophila. BMC Bioinformat-
ics, 5:129.
Sinha, S., van Nimwegen, E., and Siggia, E. (2003). A prob-
abilistic method to detect regulatory modules. Bioin-
formatics, 19:i292–i301.
Smit, A. (1999). Interspersed repeats and other memen-
tos of transposable elements in mammalian genomes.
Curr. Opin. Genet. Dev., 9:657–663.
Stojanovic, N. (2004). Computational methods for the anal-
ysis of differential conservation in groups of similar
DNA sequences. Genome Informatics, 15:21–30.
Stojanovic, N. and Dewar, K. (2005). A probabilistic ap-
proach to the assessment of phylogenetic conservation
in mammalian Hox gene clusters. In Proceedings of
the BIOINFO 2005, International Joint Conference of
InCoB, AASBi and KSBI, pages 118–123.
Thijs, G., Marchal, K., Lescot, M., Rombauts, S., De Moor,
B., Rouze, P., and Moreau, Y. (2002). A Gibbs sam-
pling method to detect overrepresented motifs in the
upstream regions of coexpressed genes. J. Comput.
Biol., 9(2):447–464.
Thomas, J. W., Touchman, J. W., Blakesley, R. W. et al.
(2003). Comparative analysis of multi-species se-
quences from targeted genomic regions. Nature,
424:788–793.
Tompa, M., Li, N., Bailey, T. L. et al. (2005). As-
sessing computational tools for the discovery of tran-
scription factor binding sites. Nature Biotechnology,
23(1):137–144.
van Helden, J. (2004). Metrics for comparing regulatory se-
quences on the basis of pattern counts. Bioinformatics,
20:399–406.
van Helden, J., Andre, B., and Collado-Vides, J. (1998).
Extracting regulatory sites from the upstream region
of yeast genes by computational analysis of oligonu-
cleotide frequencies. J. Mol. Biol., 281:827–842.
Vlieghe, D., Sandelin, A., De Bleser, P. J., Vleminckx,
K., Wasserman, W. W., van Roy, F., and Lenhard,
B. (2006). A new generation of JASPAR, the open–
access repository for transcription factor binding site
profiles. Nucleic Acids Res., 34:D95–D97.
Waring, M. and Britten, R. (1966). Nucleotide sequence
repetition: a rapidly reassociating fraction of mouse
DNA. Science, 154:791–794.
ON THE FUTILITY OF INTERPRETING OVER-REPRESENTATION OF MOTIFS IN GENOMIC SEQUENCES AS
FUNCTIONAL SIGNALS