new web applications called locus-specific
databases, which usually contain information
about sequence variations among individuals for a
particular gene. In addition, content ownership is
gaining relevance. Although regular end-users find
scientific content easier to access when it is
provided by a centralized service, researchers who
want to publish their work are almost obliged to
create their own applications if they want to keep
their authorship visible.
The described architecture and application aim
to overcome these problems with three key features
for both users and researchers. First, integration is
based on simple Internet URLs that are parsed and
processed to gather the most significant information.
This means that developers do not have to make
any changes to the application core and that we are
able to integrate any URL-accessible content.
Second, the original applications are shown
inside our application. Thus, the content owners'
work is not reduced to a link but appears as part of
a complete application. Finally, external applications
can be extended inside our system: information
exchange, text mining and other user customization
features can be developed to enhance the original
applications.
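As an illustration of the first feature, the following is a minimal sketch of how a URL could be parsed to gather its most significant parts. It is not the implementation used in the described system: the function name describe_link, the returned field names and the example address are assumptions made for this sketch. Only the Python standard library is used.

```python
from urllib.parse import urlsplit, parse_qs


def describe_link(url):
    """Split a URL into parts an integrator might index (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an HTTP(S) URL: {url!r}")
    # parse_qs maps each key to a list of values; only the first value
    # is kept here to keep the sketch simple.
    parameters = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "source": parts.hostname,  # which external application owns the content
        "path": [segment for segment in parts.path.split("/") if segment],
        "parameters": parameters,  # e.g. the gene the page refers to
    }


# Hypothetical address of a page in an external locus-specific database.
link = describe_link("http://lsdb.example.org/variants/view?gene=BRCA1&page=2")
print(link)
```

Because only the address is inspected, the external application itself does not need to change, which is the property the first feature relies on.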
ACKNOWLEDGEMENTS
The research leading to these results has received
funding from the European Community's Seventh
Framework Programme (FP7/2007-2013) under
grant agreement nº 200754 - the GEN2PHEN
project.
LINK INTEGRATOR - A Link-based Data Integration Architecture