Figure 2: Oryza Tag Line published by Le Select.
global conceptual model, we would have had to transform
the schema of the sources before publication.
However, Le Select also offers a mechanism for defining
views over published sources. From the mediator's point
of view, a view is itself a wrapper that executes a query
on an already-published source. We used this feature to
create views over the tables that have to be transformed.
This establishes, in a simple manner, the correspondences
between the OTL and BRC-db database schemas.
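
The idea of a view acting as a wrapper over an already-published source can be pictured with an ordinary SQL view. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than Le Select, and the table and column names (otl_line, brc_accession, accession_code, and so on) are hypothetical, not the actual OTL or BRC-db schemas.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a source table and a mapping view."""
    conn = sqlite3.connect(":memory:")

    # Hypothetical table standing in for a source published as-is.
    conn.execute(
        "CREATE TABLE otl_line (line_id TEXT, cultivar TEXT, insert_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO otl_line VALUES (?, ?, ?)",
        [("AAA01", "Nipponbare", "T-DNA"), ("AAB02", "Nipponbare", "Tos17")],
    )

    # The view plays the role of the transformation: it is a stored query on
    # the published table that exposes it under the target schema's names
    # (all names here are assumptions made for the example).
    conn.execute(
        """
        CREATE VIEW brc_accession AS
        SELECT line_id     AS accession_code,
               cultivar    AS variety_name,
               insert_type AS mutation_kind
        FROM otl_line
        """
    )
    return conn


conn = build_demo()

# A consumer queries the view exactly as it would query a table.
rows = conn.execute(
    "SELECT accession_code, mutation_kind FROM brc_accession "
    "ORDER BY accession_code"
).fetchall()
print(rows)
```

The source table is left untouched; only the view encodes the correspondence, which is the property the text relies on.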
4 CONCLUSION
The information systems that researchers in functional
genomics must assemble to meet their requirements
need to preserve the independence of the resources
and, very often, the confidentiality of at least
part of their information.
The mediation approach is therefore relevant: it preserves
the independence of the resources while allowing them to
remain distributed, and it provides uniform access to
information. The materialized approach, although robust,
does not cope well with the changing character of genomic
sources. Unavoidable changes in either system, OTL or
BRC-db, would entail numerous changes to the schema of
the data warehouse and, subsequently, to the procedures
for loading the underlying data.
The solution implemented with Le Select takes changes
in the Oryza Tag Line schema in stride: they are
incorporated directly by the mediator. Through the
intermediary of views, establishing new correspondences
with BRC-db is also relatively easy. We therefore believe
that the approach we propose is transferable to other
functional genomics applications. In the longer term,
in addition to integrating programs, the system should
allow researchers to conduct online analyses by
authorizing procedures to be run on the data (access to
both types of resources having been made transparent).
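
The point that a schema change can be absorbed at the view level, without touching the consumers of the data, can be pictured as follows. This is a minimal sketch and not Le Select itself: it again uses Python's built-in sqlite3 module, and every table, column, and function name is hypothetical.

```python
import sqlite3

# The query issued by a consumer; it is never modified in this sketch.
CLIENT_QUERY = "SELECT accession_code, variety_name FROM brc_accession"


def publish_initial(conn):
    """Initial (hypothetical) source schema plus the view that maps it."""
    conn.execute("CREATE TABLE otl_line (line_id TEXT, cultivar TEXT)")
    conn.execute("INSERT INTO otl_line VALUES ('AAA01', 'Nipponbare')")
    conn.execute(
        "CREATE VIEW brc_accession AS "
        "SELECT line_id AS accession_code, cultivar AS variety_name "
        "FROM otl_line"
    )


def evolve_source(conn):
    """Simulate a source schema change: cultivars move to their own table."""
    conn.execute("DROP VIEW brc_accession")
    conn.execute("DROP TABLE otl_line")
    conn.execute(
        "CREATE TABLE otl_cultivar (cultivar_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute("CREATE TABLE otl_line (line_id TEXT, cultivar_id INTEGER)")
    conn.execute("INSERT INTO otl_cultivar VALUES (1, 'Nipponbare')")
    conn.execute("INSERT INTO otl_line VALUES ('AAA01', 1)")
    # Only the view definition is rewritten to follow the new schema.
    conn.execute(
        "CREATE VIEW brc_accession AS "
        "SELECT l.line_id AS accession_code, c.name AS variety_name "
        "FROM otl_line AS l "
        "JOIN otl_cultivar AS c ON c.cultivar_id = l.cultivar_id"
    )


conn = sqlite3.connect(":memory:")
publish_initial(conn)
before = conn.execute(CLIENT_QUERY).fetchall()
evolve_source(conn)
after = conn.execute(CLIENT_QUERY).fetchall()
print(before == after)
```

In a warehouse setting the same change would also require altering the warehouse schema and its loading procedures; here the adjustment is confined to one view definition.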
INTEGRATION OF DATA SOURCES FOR PLANT GENOMICS