Authors:
Jaime I. Lopez-Veyna; Victor J. Sosa-Sosa and Ivan Lopez-Arevalo
Affiliation:
Center of Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV), Mexico
Keyword(s):
Keyword Search, Indexing, Databases, Top-k, Virtual Documents.
Related Ontology Subjects/Areas/Topics:
Applications; Business Analytics; Data Engineering; Information Retrieval; Ontologies and the Semantic Web; Pattern Recognition; Semi-Structured and Unstructured Data; Software Engineering
Abstract:
In recent years, the amount of information available in data sources such as those found on the Web has grown at an accelerating pace. This information can be classified by its structure into three forms: unstructured (free-text documents), semi-structured (XML documents) and structured (a relational database or XML database). A search technique that has gained wide acceptance for massive data sources such as the Web is keyword-based search, which is simple for people who are familiar with Web search engines. Keyword search has become an alternative for users without any knowledge of the formal query languages and schemas used in structured data. Traditional approaches to keyword search over relational databases include Steiner Trees, Candidate Networks and, more recently, Tuple Units. Nevertheless, these methods have some limitations. In this paper we propose a Virtual Document (VD) approach for keyword search in databases. We represent the structured information as graphs and propose the use of an index that captures the structural relationships of the information. This approach produces fast and accurate search results. We have conducted extensive experiments on large-scale real databases, and the results demonstrate that our approach achieves high search efficiency and high accuracy for keyword search in databases.