were generated. One such query is already illustrated.
Some of the other queries were: “projects X” (retrieve
all projects that X is working on), “author publications
data mining” (publication details related to data
mining), “journal 2006 publications” (journal
publications in 2006) and “topic webservice
publications” (publications dealing with the topic
webservice).
The approach seems promising, as the answer graphs
constructed for these queries provided the closest
assembly of nodes and relationships to the keywords
submitted. On the negative side, the proposed algorithm
fails for the keyword query “AIFB journal”. In this
example, which corresponds to an extreme case, the
Organisation and Project nodes get added to the
component structure of AIFB, and the Article and
Publication nodes get added to the component structure
of journal. The pruning step does not identify any
similar nodes. The reason for the failure is that the
keywords are far apart, and hence the pruning step
fails to identify similar nodes for hooking to happen.
The pruning step will be suitably extended to handle
this class of examples.
In our illustrations we have shown one answer
graph that gets constructed out of the exploration
phase. However, a given set of keywords can have
multiple interpretations, and this leads to multiple
answer graphs. For example, given the keyword list
{X-Media, Philip, Publications}, the possible
interpretations are:
• publications by Philip in Project X-Media;
• publications by Philip with X-Media in title;
• publications by Philip with X-Media in abstract.
During the term mapping phase, X-Media will be mapped
onto the name node, the title node and the abstract
node. This leads to three answer graphs for the same
set of keywords, so there is a need to rank these
graphs. We are working on a ranking metric that will be
a function of the strength of the individual clusters
and of the strength of the hooks between clusters.
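To make the origin of the three answer graphs concrete, the following
minimal Python sketch enumerates interpretations as combinations of
candidate mappings per keyword. The dictionary `TERM_MAPPING`, the node
labels such as `Project.name`, and the function
`enumerate_interpretations` are assumptions introduced for illustration
only; they are not the data structures of the system described here.

```python
from itertools import product

# Hypothetical output of the term mapping phase: each keyword is mapped
# to the schema elements it could denote. Labels are illustrative only.
TERM_MAPPING = {
    "X-Media": ["Project.name", "Publication.title", "Publication.abstract"],
    "Philip": ["Person.name"],
    "Publications": ["Publication"],
}


def enumerate_interpretations(term_mapping):
    """Return one keyword-to-node assignment per combination of candidates."""
    keywords = list(term_mapping)
    candidate_lists = [term_mapping[k] for k in keywords]
    # The Cartesian product yields every way of picking one node per keyword.
    return [dict(zip(keywords, combo)) for combo in product(*candidate_lists)]


interpretations = enumerate_interpretations(TERM_MAPPING)
for interpretation in interpretations:
    print(interpretation)
```

Each assignment would seed one exploration and hence one candidate
answer graph, which is why a ranking over the resulting graphs is
needed.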
6 RELATED WORK
In this paper we have addressed the issue of answer
graph construction for keyword queries on graph
structured data through a concrete algorithm for graph
exploration. We have tried to improve graph exploration
by adopting pruning and hooking, as an alternative to
the approaches of (Zhou et al., 2007; Tran et al.,
2007).
Keyword search on structured data has been ex-
tensively investigated in recent years under different
contexts. Earlier approaches (Bhalotia et al., 2002;
He et al., 2007; Kacholia et al., 2005) tried to address
keyword search in the context of relational databases.
Keywords were matched exactly against the labels of
data elements. Substructures in the form of trees were
constructed, and the root element was assumed to be
the answer. (Bhalotia et al., 2002) uses a backward
search algorithm. In order to improve the search by
limiting the nodes to be visited, (Kacholia et al.,
2005) proposed a bidirectional search algorithm in
which exploration proceeds along both backward and
forward edges. The idea is to reach the root element
faster through this approach. (He et al., 2007) also
adopts distinct root semantics but improves the
efficiency of the search using partitioning, a balanced
cost strategy and indexing to support forward jumps.
These methods, however, do not exploit schema
knowledge for processing queries.
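The idea of backward search with a root element as the answer can be
illustrated with a simplified sketch. This is not the algorithm of
(Bhalotia et al., 2002) as published: it assumes unweighted edges, uses
plain breadth-first search along reversed edges, and scores a root by
its total hop count to the keyword nodes. The function names and the
toy graph are hypothetical.

```python
from collections import deque


def backward_distances(edges, start):
    """BFS from `start` along reversed edges; returns {node: hop count}."""
    reverse = {}
    for src, dst in edges:
        reverse.setdefault(dst, []).append(src)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for pred in reverse.get(node, []):
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist


def best_root(edges, keyword_nodes):
    """Return (root, total_hops) for the closest common root, or None."""
    per_keyword = [backward_distances(edges, k) for k in keyword_nodes]
    # A candidate root must reach every keyword node via forward edges,
    # i.e. it must appear in every backward BFS result.
    common = set(per_keyword[0])
    for dist in per_keyword[1:]:
        common &= set(dist)
    if not common:
        return None
    scored = {n: sum(d[n] for d in per_keyword) for n in common}
    root = min(sorted(scored), key=scored.get)
    return root, scored[root]


# Toy data graph (illustrative): an edge points from a tuple to what it
# references, as a foreign key would in a relational database.
EDGES = [
    ("writes1", "author_philip"),
    ("writes1", "paper_xmedia"),
    ("cites1", "paper_xmedia"),
    ("cites1", "paper_other"),
]
result = best_root(EDGES, ["author_philip", "paper_xmedia"])
print(result)
```

In this toy graph the only node that reaches both keyword nodes is
`writes1`, so it becomes the root of the answer tree. Note that the
sketch uses no schema knowledge at all, which is the limitation noted
above.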
(Revuri et al., 2006) presents a keyword search system
that fits the query terms to the ontology graph in an
appropriate way and derives an enhanced query. This
query is given to the basic keyword search engine, and
the results obtained are ranked. The system adopts a
template-based approach in which it fixes the structure
and then enhances the terms. Also, the keywords are
restricted to two terms. (Tran et al., 2007)
presents a generic graph-based approach to explore
the connections between terms mapped to keywords of
the query using knowledge available in ontologies. A
three-step process consisting of term mapping,
connection exploration and DL query construction is
used. The exploration is restricted to connections
where an instance is related to a concept by an is-a
relation and two instances are related by object and
data properties. The exploration builds a graph
connecting a term element with all its neighbours
within a specified range d. The process of exploration
relies mainly on assertional knowledge, resulting in a
large number of paths that need to be processed. The
graph does not model class/sub-class relationships.
(Zhou et al., 2007) also adopts a three-step
process: term mapping, query graph construction and
query ranking. For each grouping of terms, different
query sets are constructed by enumerating all possible
combinations of the different senses of the terms. From
each query set a query graph is derived. A
probabilistic ranking model is adopted for ranking the
query graphs. In this system, too, knowledge features
and pruning mechanisms are not exploited during the
exploration phase.
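The range-bounded exploration attributed above to (Tran et al., 2007)
can be pictured as a depth-limited breadth-first search around a term
element. The sketch below is only an illustration under simplifying
assumptions: the graph is treated as undirected, all edges cost one
hop, and the function `neighbourhood` and the toy graph `ADJ` are
hypothetical names, not part of that system.

```python
from collections import deque


def neighbourhood(adjacency, start, d):
    """Return {node: distance} for all nodes within d hops of `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == d:
            continue  # do not expand beyond the range limit
        for nxt in adjacency.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Toy graph mixing instances and concepts (illustrative names only).
ADJ = {
    "Philip": ["Person", "pub1"],
    "Person": ["Philip"],
    "pub1": ["Philip", "Publication", "X-Media"],
    "Publication": ["pub1"],
    "X-Media": ["pub1", "Project"],
    "Project": ["X-Media"],
}
within_1 = neighbourhood(ADJ, "Philip", 1)
within_2 = neighbourhood(ADJ, "Philip", 2)
print(sorted(within_1), sorted(within_2))
```

Even in this small graph the explored set grows quickly with d, which
hints at why exploration over assertional knowledge produces a large
number of paths to process.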
In our approach we have adopted a different strategy
for the exploration phase. Unlike the approaches above,
we do not consider all the nodes for exploration. We
create a fragment consisting of a cluster of closely
related concepts and relationships, and then prune
unwanted nodes and edges. We also adopt a guided
exploration
KDIR 2010 - International Conference on Knowledge Discovery and Information Retrieval