data as a graph and have proposed the use of a data index
(VDII) to capture the structural relationships for
fast and accurate responses. Finally, we have conducted
experiments to evaluate the efficiency
and effectiveness of our approach using real data sets.
These experiments show that our approach achieves
high search efficiency and quality for keyword search
and is able to scale to databases with tens of
millions of tuples.
Our future work includes further testing of our
approach with other datasets, including
semi-structured and unstructured data. We also plan
to continue working on strategies to reduce
the size of the index, for example by using Mutual
Information.
ACKNOWLEDGEMENTS
This research was partially supported by project num-
ber 187325 from Fondo Mixto Conacyt-Gobierno del
Estado de Tamaulipas.