which the query is entered (the figure is omitted for
brevity). The user can enter a new query or double-click
an old one to repeat the search. As soon as the query is
entered and the search engine(s) respond(s), the result
lines and nodes start appearing in the List View and
Graph View, respectively. An asterisk at the beginning of
a result line indicates that the full text has also been
retrieved.
Meanwhile, the clustering algorithm creates new
categories that start to appear in the Category View.
In the early phase, the number of categories and their
names vary considerably as new full texts are retrieved.
If the task is to get an overview, a threshold of about
5% is the most suitable, as in Figure 2. With such a low
threshold, fewer categories are created and it is easy to
filter out unnecessary results.
When the user double-clicks a category, the List View
shows the corresponding results. Due to the low
threshold value, not all category names are related to
the results. In addition, some categories consist of
empty pages that can be filtered out. After this, the
threshold can be raised (e.g., to 40% as in Figure 5)
in order to focus on single documents, for example to
pick up new expressions that better match the query. By
lowering and raising the threshold, the user can refine
the query and finally pick up only those documents that
best match the topic of interest.
The Graph View gives additional information
about the documents. Basically, it illustrates how the
documents are related to each other. In Figure 3, for
example, course home pages and textbooks seem to
have a strong relationship.
6 DISCUSSION
We have demonstrated a way to visualize search
results retrieved from standard Internet search engines.
The implemented system, VisElabor, dynamically clusters
search results and shows the information in several
views simultaneously. The idea is to
coordinate the information among the views based on
user actions in such a way that the refinement of the
query is easy and straightforward. The aim is to speed
up the cyclic search process by providing better ways
to cope with the large number of documents typically
found in a single query.
VisElabor runs on the client machine. It sends the
search request to a search engine and downloads the
resulting documents, which are then clustered dynamically.
The results are shown in four separate windows.
The List View is fast to compile as it contains only the
search results retrieved from the search engine.
The Category View is based on the clustering of
the search results. Our clustering algorithm is dynamic,
i.e., the user sees the evolving state of the clustering
all the time and can interact with the system at any
moment. For example, the first cluster is ready to be
examined as soon as it appears on the display. The
threshold value for new clusters lets the user work with
different views of the same data.
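
The role of the threshold can be illustrated with a minimal
sketch. The clustering algorithm itself is not specified in
this section, so the code below is an assumption: a simple
single-pass scheme in which an arriving document joins the
most similar existing cluster if the similarity reaches the
threshold and otherwise starts a new cluster. The class name,
the word-count vectors, and the cosine measure are all
hypothetical choices made for illustration only.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class IncrementalClusterer:
    """Hypothetical single-pass clusterer; not the actual VisElabor algorithm."""

    def __init__(self, threshold: float):
        self.threshold = threshold  # e.g. 0.05 for an overview, 0.40 for detail
        self.clusters = []          # each cluster: {"centroid": Counter, "docs": [ids]}

    def add(self, doc_id: str, text: str) -> int:
        """Place one document as it arrives; return the index of its cluster."""
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for i, cluster in enumerate(self.clusters):
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            # Similar enough: merge into the existing cluster.
            self.clusters[best]["centroid"].update(vec)
            self.clusters[best]["docs"].append(doc_id)
            return best
        # Otherwise the document starts a new cluster (a new category).
        self.clusters.append({"centroid": vec, "docs": [doc_id]})
        return len(self.clusters) - 1
```

Under this assumed scheme, a low threshold lets documents merge
easily and so yields few broad categories, while a high
threshold yields many narrow ones. Because each document is
placed as soon as it arrives, the clusters formed so far can be
inspected at any moment.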
The Graph View requires that the system is able
to coordinate the information in multiple views and
simultaneously retain contextual information during
real-time updates, i.e., while information constantly
flows in from the search engine. The user should be
able to follow the changes in order to maintain the
context. One way to solve this is to allow updates only
by request. The TouchGraph library we use was designed
for real-time applications. In addition, the movements
of nodes are animated, which is necessary in order to
retain the context. However, this requires quite a
powerful computer.
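
The option of allowing updates only by request can be sketched
as follows. This is a minimal illustration under our own
assumptions, not the VisElabor implementation and not the
TouchGraph API: the class BufferedGraph and its methods are
hypothetical. Incoming results are buffered, and the displayed
graph changes only when the user asks for a refresh.

```python
from typing import Optional


class BufferedGraph:
    """Hypothetical graph model that defers incoming changes until requested."""

    def __init__(self):
        self.nodes = set()    # nodes currently displayed
        self.edges = set()    # edges currently displayed, as (node, related_to) pairs
        self._pending = []    # changes received but not yet displayed

    def receive(self, node: str, related_to: Optional[str] = None) -> None:
        """Called whenever a new result arrives; the display is left untouched."""
        self._pending.append((node, related_to))

    def pending_count(self) -> int:
        """Number of buffered changes, e.g. for a 'new results' indicator."""
        return len(self._pending)

    def refresh(self) -> int:
        """Apply all buffered changes at once; return how many were applied."""
        applied = len(self._pending)
        for node, related_to in self._pending:
            self.nodes.add(node)
            if related_to is not None:
                self.nodes.add(related_to)
                self.edges.add((node, related_to))
        self._pending.clear()
        return applied
```

The trade-off in such a design is that the graph stays stable
while the user studies it, at the cost of showing slightly
outdated information until the next refresh.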
Finally, a flexible Full Text View requires a
browser. Fortunately, a simple browser window can
be implemented with standard libraries. However,
a real application would require a more sophisticated
browser.
The most salient point here is the coordination
among the multiple views. The user gets a better
overview of the search results and can still maintain
the context thanks to the automatic coordination, even
in a dynamic situation in which the search results are
constantly updated as they arrive and the user
interacts with the results (e.g., filters out results
or views a single document).
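
One common way to realize this kind of coordination is a shared
state that notifies every view after each change. The sketch
below is an assumption about how it could be done, not a
description of the VisElabor code; the classes View and
Coordinator and their methods are hypothetical.

```python
class View:
    """Hypothetical view that records what it is currently showing."""

    def __init__(self, name):
        self.name = name
        self.visible = []     # ids of the results shown in this view
        self.selected = None  # id of the highlighted result, if any

    def update(self, visible, selected):
        self.visible = list(visible)
        self.selected = selected


class Coordinator:
    """Keeps one shared state and pushes it to every registered view."""

    def __init__(self):
        self.views = []
        self.results = []      # all results received so far, in arrival order
        self.filtered = set()  # results the user has filtered out
        self.selected = None

    def register(self, view):
        self.views.append(view)
        self._notify()

    def add_result(self, result_id):
        """A new result has arrived from the search engine."""
        self.results.append(result_id)
        self._notify()

    def filter_out(self, result_id):
        """The user removes a result; a filtered selection is cleared."""
        self.filtered.add(result_id)
        if self.selected == result_id:
            self.selected = None
        self._notify()

    def select(self, result_id):
        """The user picks a result; unknown or filtered ids are ignored."""
        if result_id in self.results and result_id not in self.filtered:
            self.selected = result_id
            self._notify()

    def _notify(self):
        visible = [r for r in self.results if r not in self.filtered]
        for view in self.views:
            view.update(visible, self.selected)
```

In this sketch the selection survives both new arrivals and
filtering of other results, which is one way of keeping the
context stable across views while the data changes.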
There are several ways to apply VisElabor and to
develop it further. First, even though our examples
emphasize client-side usage, the system can easily be
transformed into a server-side version that has a light
browser front end. Second, additional visualizations
could be implemented so that more users can find a
visual interface that appeals to them. Third, VisElabor
could be enhanced with adaptive personalized information
in order to get more relevant search results. Such a
user model would consist of preferences given directly
by the user and of an adaptive part that is updated
after each search request. Maintaining the model locally
without sending personal information anywhere else is an
obvious advantage. The additional information may be
applied to enhancing the search request or to favoring
relevant categories in clustering.
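
A locally maintained user model of this kind could look roughly
like the following sketch. It is purely illustrative: the class
UserModel, the term-count representation, and the fixed bonus
for explicit preferences are our assumptions and are not part
of the implemented system.

```python
from collections import Counter


class UserModel:
    """Hypothetical locally stored user model; all names are illustrative."""

    def __init__(self, preferences=()):
        self.preferences = set(preferences)  # terms given directly by the user
        self.adaptive = Counter()            # term counts learned from past searches

    def record_search(self, query, opened_titles=()):
        """Update the adaptive part after a search request."""
        for text in (query, *opened_titles):
            self.adaptive.update(text.lower().split())

    def score(self, category_name):
        """Learned term counts plus a fixed bonus for each explicit preference."""
        score = 0.0
        for term in category_name.lower().split():
            if term in self.preferences:
                score += 2.0  # arbitrary illustrative bonus
            score += self.adaptive[term]
        return score

    def rank_categories(self, names):
        """Order categories so that those matching the model come first."""
        return sorted(names, key=self.score, reverse=True)
```

For example, a model created with the preference "java" that has
recorded the query "course home page" would rank a category
named "java tutorial" ahead of "course material", and both ahead
of an unrelated category. Nothing in the sketch leaves the
client machine.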
WEBIST 2007 - International Conference on Web Information Systems and Technologies