tion (in the Chromosome 11) in YPS128 strain with
respect to RefSeq, in an unresolved sequence. Since
this insertion lies less than 200 bp away from
an LTR annotated in RefSeq, we consider the event
"proximal" to a mobile element. This is relevant
because several observations in the literature suggest
that transposons preferentially insert into regions
that already contain LTRs. Finally, Fig. 2(f) shows a
"pINS Ty-c" event, so called because the length of the
inserted sequence is compatible with that of a
transposon.
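The proximity rule used above (an insertion less than 200 bases
from an LTR annotated in RefSeq counts as "proximal") can be
sketched as follows. This is only an illustration: the interval
representation, the function names and the sample coordinates are
assumptions, not part of REGENDER; only the 200-base cut-off comes
from the text.

```python
from typing import List, Tuple

# Assumption: an annotation is an inclusive (start, end) pair of
# RefSeq coordinates on a single chromosome.
Interval = Tuple[int, int]

PROXIMITY_THRESHOLD = 200  # bases; the "less than 200" cut-off from the text


def distance_to_interval(position: int, interval: Interval) -> int:
    """Distance in bases from a position to an interval (0 if inside it)."""
    start, end = interval
    if position < start:
        return start - position
    if position > end:
        return position - end
    return 0


def is_proximal(position: int, annotations: List[Interval],
                threshold: int = PROXIMITY_THRESHOLD) -> bool:
    """True if the insertion site lies closer than `threshold` bases
    to at least one annotated mobile element."""
    return any(distance_to_interval(position, a) < threshold
               for a in annotations)


if __name__ == "__main__":
    ltrs = [(1000, 1330)]           # hypothetical LTR annotation
    print(is_proximal(1400, ltrs))  # site 70 bases past the LTR end
    print(is_proximal(2000, ltrs))  # site 670 bases past the LTR end
```

The strict inequality mirrors the wording "less than 200"; a
different convention would only require changing the comparison.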
These cases cover the full spectrum of situations we
found in our screening. The following subsections
aggregate the data we collected for each of these
categories of events.
Conserved Regions and Mobile Elements. More
than 95% of the genome in Y55 and 93% in YPS128
consists of conserved regions. Most of these, but not
all, belong to the resident genome. The conserved
mobilome (i.e., transposons or LTRs within conserved
regions) comprises two kinds of elements: truly
conserved transposons (only 1) or LTRs (a relatively
low number), which are annotated on RefSeq and
mapped exactly onto the screened strain; and putative
conservations of annotated transposons or LTRs,
which map onto unresolved sequences in the screened
strain, so that a direct attribution is impossible.
pCONSu regions are always present in the telomeres,
because the long repeats found there are a source of
noise for the assembly phase. In all cases but one,
telomeres do not involve sequences related to mobile
elements. As for the pCONSu outside the telomeres,
the fraction of unresolved sequences that coincide
with or lie close to mobile elements is greater than
90% for Y55 and around 70% for YPS128. This supports
the hypothesis that unresolved regions are often
located at a mobile element (annotated on RefSeq).
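The percentages quoted above are fractions of unresolved regions
that coincide with, or lie close to, an annotated mobile element.
A minimal sketch of that computation is given below. It assumes
that "proximity" uses the same 200-base cut-off as for insertions,
and that regions and annotations are inclusive (start, end)
intervals on one chromosome; the names and coordinates are
hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumption: inclusive (start, end) coordinates


def gap_between(a: Interval, b: Interval) -> int:
    """Bases separating two intervals; 0 if they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def fraction_near_mobile(unresolved: List[Interval],
                         mobile: List[Interval],
                         max_gap: int = 200) -> float:
    """Fraction of unresolved regions that overlap a mobile element
    or lie less than `max_gap` bases away from one."""
    if not unresolved:
        return 0.0
    hits = sum(
        1 for u in unresolved
        if any(gap_between(u, m) < max_gap for m in mobile)
    )
    return hits / len(unresolved)


if __name__ == "__main__":
    unresolved = [(100, 200), (5000, 5100), (9000, 9050)]  # made-up regions
    mobile = [(150, 400), (5200, 5500)]                    # made-up annotations
    print(f"{fraction_near_mobile(unresolved, mobile):.0%}")
```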
Deletions. Deletions concern the mobilome almost
exclusively. In the Y55 strain, for instance, there are
four putative deleted regions on RefSeq that do not
correspond to an annotated Ty or soloLTR (against
more than 90 pDELs that correspond to mobilome
annotations). We found that the length of the two
regions is compatible with that of a soloLTR. This
evidence strongly suggests that, in closely related
genomes, the only significant chromosomal
rearrangements (i.e., for our work, those involving
sequences at least 200 bp long) are due to the
mobilome.
Insertions. The landscape of putative insertions,
without (pINS) and with (pINSu) unresolved regions,
is rich. We may label the inserted sequences by their
proximity to an annotated Ty or soloLTR at the
insertion site on RefSeq: in 40% to 50% of cases, the
event is a putative mobilome-proximal insertion. As
for deletions, we may also distinguish events by the
length of the inserted sequence: Ty-c, LTR-c or
in-between. Here too, the large majority of events
involve the mobilome. Although this result partly
derives from our classification method, it supports
the same conclusion.
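The length-based labels (Ty-c, LTR-c, in-between) can be
illustrated with a small classifier. The length ranges below are
assumptions chosen only for the example (a soloLTR of a few
hundred bases, a full Ty element of several kilobases); the
thresholds actually used in the screening are not restated in this
section. The "unclassified" label is also hypothetical and covers
lengths outside the three categories named in the text.

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumed length ranges in bases (inclusive); illustrative values only.
LTR_RANGE: Tuple[int, int] = (250, 450)
TY_RANGE: Tuple[int, int] = (5000, 7000)


def label_by_length(length: int,
                    ltr_range: Tuple[int, int] = LTR_RANGE,
                    ty_range: Tuple[int, int] = TY_RANGE) -> str:
    """Label an inserted (or deleted) sequence by its length."""
    if ltr_range[0] <= length <= ltr_range[1]:
        return "LTR-c"        # length compatible with a soloLTR
    if ty_range[0] <= length <= ty_range[1]:
        return "Ty-c"         # length compatible with a transposon
    if ltr_range[1] < length < ty_range[0]:
        return "in-between"
    return "unclassified"     # hypothetical label, not from the text


def tally(lengths: Iterable[int]) -> Counter:
    """Count how many events fall under each label."""
    return Counter(label_by_length(n) for n in lengths)


if __name__ == "__main__":
    print(tally([330, 5900, 2000, 340, 100]))  # made-up lengths
```

Keeping the ranges as parameters makes it easy to check how much
the final counts depend on the chosen cut-offs.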
5 FUTURE DIRECTIONS
We proposed an approach for the rapid and efficient
localization of the resident genome by means of the
REGENDER algorithm. We plan to generalize this
approach to multiple comparisons, extracting all the
chromosomal rearrangements in a dataset of 39
strains of the same species, S. cerevisiae (Liti et al.,
2009). Our long-term goal is to develop a model that
describes the dynamics of the mobilome in these
strains.
ACKNOWLEDGEMENTS
We thank Emiliano Biscardi for performing tests
whose results are reported in Table 2.
REFERENCES
Cohen, J. D. (1997). Recursive hashing functions for n-
grams. ACM Trans. Inf. Syst., 15(3):291–320.
Conti, V., Aghaie, A., Cilli, M., et al. (2006). crv4,
a mouse model for human ataxia associated with
kyphoscoliosis caused by an mRNA splicing mutation
of the metabotropic glutamate receptor 1 (Grm1). Int.
J. Molec. Med., 18:593–600.
Le Rouzic, A., Boutin, T. S., and Capy, P. (2007). Long
term evolution of transposable elements. PNAS,
104:19375–19380.
Lewin, B. (2007). Genes (IX ed.). Jones and Bartlett.
Liti, G., Carter, D. M., Moses, A. M., et al. (2009).
Population genomics of domestic and wild yeast. Nature,
458:337–341.
Ohlebusch, E. and Abouelhoda, M. (2006). Chaining al-
gorithms and applications in comparative genomics.
Handbook of Computational Molecular Biology.
SGD (2010). SGD project. Saccharomyces Genome
Database. http://www.yeastgenome.org.
Venner, S., Feschotte, C., and Biemont, C. (2009). Dynam-
ics of transposable elements: towards a community
ecology of the genome. Trends Genet., 25:317–323.
Vigna, S. (2006). fastutil: Fast and compact type-specific
collections for Java. http://fastutil.dsi.unimi.it.
BIOINFORMATICS 2011 - International Conference on Bioinformatics Models, Methods and Algorithms