
new web applications called locus-specific databases, 
which usually contain information about sequence 
variations among individuals for a particular gene. 
In addition, content ownership is becoming 
increasingly important. Although regular end-users 
find scientific content easier to access through a 
centralized service, researchers who want to publish 
their work are almost obliged to create their own 
applications if they want to keep their authorship 
visible. 
The described architecture and application aim to 
overcome these problems through three key features 
that benefit both users and researchers. First, 
integration is based on simple Internet URLs that are 
parsed and processed to gather the most significant 
information. Developers therefore do not have to 
change the application core, and any URL-accessible 
content can be integrated. Second, the original 
applications are shown inside our application, so 
content owners appear not as a mere link but as part 
of a complete application. Finally, external 
applications can be extended inside our system: 
information exchange, text mining and other user 
customization features can be developed to enhance 
the original applications. 
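As an illustration of the first feature, the following minimal sketch shows one way a URL could be parsed into the parts an integrator might index. It is not the implementation used in Link Integrator: the function name `describe_link`, the returned field names and the example URL are all assumptions made for this example.

```python
from urllib.parse import urlsplit, parse_qs


def describe_link(url):
    """Split a URL into parts a link-based integrator could index.

    Hypothetical helper: the field names below are illustrative only.
    """
    parts = urlsplit(url)
    return {
        # Host that owns the content, kept so that ownership stays visible.
        "source": parts.netloc,
        # Non-empty path segments, in order.
        "path": [segment for segment in parts.path.split("/") if segment],
        # First value of each query parameter.
        "params": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


# Hypothetical locus-specific database URL, used only for illustration.
info = describe_link("http://lsdb.example.org/variants.php?gene=COL3A1&view=all")
print(info["source"], info["params"]["gene"])
```

Because only the URL is inspected, the external application itself needs no modification, which is the property the first feature relies on.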
ACKNOWLEDGEMENTS 
The research leading to these results has received 
funding from the European Community's Seventh 
Framework Programme (FP7/2007-2013) under 
grant agreement nº 200754 - the GEN2PHEN 
project. 
LINK INTEGRATOR - A Link-based Data Integration Architecture