time, the system is trained to draw inferences from the user about the desired meaning
behind user contexts. Since this dynamic information develops on the client machine,
these contexts allow the user to conduct more meaningful searches with less effort put
into search phrase generation.
To implement this model, the Context Manager stores context categorizations,
definition keywords, and search history in a relational database. The relational
schema consists of six 3NF tables, derived with the standard top-down algorithm
for mapping an entity-relationship model to a relational schema:
CONTEXT (name, context_number, created)
DEFINITION (context_number, keyword, weight, initial_weight)
SEARCH (context_number, search_string, search_number, last_performed)
RESULT_LOOKUP (search_number, url, result_number)
RESULT_METADATA (result_number, search_engine, title, snippet)
RESULT_USERDATA (result_number, search_query, relevance, ranking, impression, last_visited)
At the highest level, context entries are unique by name. Members of the definition
table are associated with a particular context; thus definitions are distinguished
by the concatenated key “context_number, keyword.” Likewise, any historic entry in
the search table is related to a specific context and is ordered by the key
“context_number, search_string.” Under this table, the “search_number” provides the key
for details stored in the result_lookup relation, which in turn contains the key
(“result_number”) for the more specific result_metadata and result_userdata tables.
This model allows the DBMS to manage search strings, URLs, and metadata, which may
consist of many long strings. By design, the fundamental engine behind WHAT does not
rely on this specific data storage schema, which allows for scalability.
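As an illustration only, the six relations could be declared in SQLite roughly as
follows. This is a minimal sketch, not the authors' implementation: the column types,
the choice of SQLite, the helper name create_store, and the composite keys of the
three result tables are assumptions; only the keys of CONTEXT, DEFINITION, and SEARCH
are stated explicitly in the text.

```python
import sqlite3

# Column types and the keys of the three result_* tables are assumptions;
# the text only fixes the keys of context, definition, and search.
SCHEMA = """
CREATE TABLE context (
    name            TEXT PRIMARY KEY,
    context_number  INTEGER UNIQUE NOT NULL,
    created         TEXT
);
CREATE TABLE definition (
    context_number  INTEGER NOT NULL REFERENCES context(context_number),
    keyword         TEXT NOT NULL,
    weight          REAL,
    initial_weight  REAL,
    PRIMARY KEY (context_number, keyword)
);
CREATE TABLE search (
    context_number  INTEGER NOT NULL REFERENCES context(context_number),
    search_string   TEXT NOT NULL,
    search_number   INTEGER UNIQUE NOT NULL,
    last_performed  TEXT,
    PRIMARY KEY (context_number, search_string)
);
CREATE TABLE result_lookup (
    search_number   INTEGER NOT NULL REFERENCES search(search_number),
    url             TEXT NOT NULL,
    result_number   INTEGER UNIQUE NOT NULL,
    PRIMARY KEY (search_number, url)
);
CREATE TABLE result_metadata (
    result_number   INTEGER NOT NULL REFERENCES result_lookup(result_number),
    search_engine   TEXT NOT NULL,
    title           TEXT,
    snippet         TEXT,
    PRIMARY KEY (result_number, search_engine)
);
CREATE TABLE result_userdata (
    result_number   INTEGER NOT NULL REFERENCES result_lookup(result_number),
    search_query    TEXT NOT NULL,
    relevance       REAL,
    ranking         INTEGER,
    impression      TEXT,
    last_visited    TEXT,
    PRIMARY KEY (result_number, search_query)
);
"""


def create_store(path=":memory:"):
    """Open (or create) the database at `path` and create the six tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

With this layout, the long strings (search strings, URLs, titles, snippets) are each
stored once and referenced elsewhere by the compact integer keys search_number and
result_number.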
3 Query Constructor and Search Engine Requests
Once the user enters a search phrase and selects a context, queries are sent to
standard search engines (Google, Yahoo, etc.). To take advantage of the context
information, the Query Constructor generates and formats multiple search strings for
each search engine. The constructor begins by forming the set of all possible
combinations of the keywords for the selected context. The search phrase is then
appended to each generated string. However, the number of keyword combinations
grows exponentially with the number of keywords, so the constructor selects
a reasonable subset of the generated strings for submission. This subset is determined
by a user-modifiable parameter called “tolerance.” Tolerance is a percentage used
to filter out less relevant queries. Based on the associated weights of the keywords,
each string is compared against the most relevant string generated, and strings whose
relevance falls too far below it, as judged by the tolerance factor, are removed from
the submission set. Thus, the behavior of the Query Constructor depends on the
client-side personalization of the Context Manager and will improve with continued
information gathering.
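The filtering step can be sketched as follows. The text does not give the exact
relevance calculation, so this sketch makes two loudly labeled assumptions: a string's
relevance is the sum of its keyword weights, and a string is kept when its relevance
is at least (1 - tolerance) times that of the most relevant string. The function name
build_queries and its parameters are hypothetical.

```python
from itertools import combinations


def build_queries(search_phrase, keyword_weights, tolerance=0.5):
    """Return context-augmented query strings, most relevant first.

    keyword_weights maps each context keyword to its weight.
    tolerance is a fraction in [0, 1]; 0 keeps only the best string,
    1 keeps every generated string (assumed interpretation).
    """
    keywords = sorted(keyword_weights)
    candidates = []
    # Every non-empty combination of keywords: 2**n - 1 strings for n keywords.
    for size in range(1, len(keywords) + 1):
        for combo in combinations(keywords, size):
            # Assumed relevance measure: sum of the keyword weights.
            score = sum(keyword_weights[k] for k in combo)
            # The search phrase is appended to each generated string.
            candidates.append((score, " ".join(combo) + " " + search_phrase))
    if not candidates:
        return [search_phrase]
    best = max(score for score, _ in candidates)
    cutoff = best * (1.0 - tolerance)
    kept = [(s, q) for s, q in candidates if s >= cutoff]
    kept.sort(key=lambda pair: -pair[0])  # highest relevance first
    return [q for _, q in kept]


if __name__ == "__main__":
    weights = {"car": 0.9, "engine": 0.6, "cat": 0.1}
    for query in build_queries("jaguar", weights, tolerance=0.5):
        print(query)
```

Because the weights come from the Context Manager, the same tolerance setting yields
a different submission set as the stored weights change over time.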