the details and reports that the system produces are
to be presented in the main window of the web
browser as a normal web page.
Since MS Internet Explorer is currently the most
common browser, it was selected for the user
interface layer. Integrating the system's user
interface directly into the MS Internet Explorer
window therefore required the use of COM (Component
Object Model) technology. A similar user interface
could be offered to Mozilla Firefox users as well,
but this was not covered in the project.
The remote communication, mainly between the
user component and the user profile component,
required asynchronous processing of requests and
asynchronous display of the results. For this
reason the user component was developed as a
multithreaded application. Individual requests run
as separate threads and are coordinated by one
main thread, which is the only thread that
communicates with the user interface (user
interfaces are usually not thread safe).
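The coordination described above can be sketched as
follows. This is only an illustration in Python, not
the COM-based implementation used in the project:
the names fetch_profile, worker and run_requests are
hypothetical, and the remote call is replaced by a
local stand-in. Worker threads place their results in
a thread-safe queue, and only the main thread reads
the queue and updates the (simulated) display.

```python
import queue
import threading


def fetch_profile(request_id):
    # Hypothetical stand-in for a remote call to the user profile component.
    return f"result for request {request_id}"


def worker(request_id, results):
    # Runs in a background thread and never touches the user interface;
    # it only hands its result over through the thread-safe queue.
    results.put((request_id, fetch_profile(request_id)))


def run_requests(request_ids):
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rid, results))
        for rid in request_ids
    ]
    for thread in threads:
        thread.start()

    displayed = {}
    # Only the main thread consumes results, so all "interface" updates
    # (here: writes to the displayed dict) happen on a single thread.
    for _ in request_ids:
        request_id, payload = results.get()
        displayed[request_id] = payload

    for thread in threads:
        thread.join()
    return displayed
```

Calling run_requests([1, 2, 3]) returns one entry per
request regardless of the order in which the worker
threads finish.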
2.2 Search Component
The role of the search component is to query the
search engines and obtain results. Most major
search engines provide web services or other
interfaces for using their search capabilities
programmatically. The search component waits for a
command from the user component to start the search.
The results from the search engine can then be
refined by computing the similarity and filtering out
those resources that do not correspond to the
preferences in the user profile. Before the similarity
can be computed, the resources are processed in the
collect component.
2.3 Collect Component
The collect component is responsible for processing
the resources and for storing the resource
information. Typically, the collect component obtains
a resource from the search component or directly from
the user component, in case the user gets the
resource in other ways than by searching. Resources
can also be obtained from digital libraries and, if
appropriate and if the communication interfaces are
specified, the collect component can also supply some
of the resources to the digital library.
The type of an obtained resource is checked first
so that the appropriate parsing and extraction
engine can be used. Currently, only resources in
PDF (Portable Document Format) are supported.
Extending the support to Microsoft Word documents
and to resources in XML or HTML poses only a minor
problem.
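A minimal sketch of such a type check, assuming a
simple registry of parsers keyed by resource type.
The names detect_type, parse_pdf, PARSERS and
select_parser are hypothetical, and parse_pdf is only
a placeholder, not a real PDF parser. The check relies
on the fact that PDF files begin with the bytes
"%PDF-"; support for further formats would be added by
registering additional entries.

```python
def detect_type(data: bytes) -> str:
    # PDF files start with the header "%PDF-" followed by a version number.
    if data.startswith(b"%PDF-"):
        return "pdf"
    return "unknown"


def parse_pdf(data: bytes) -> dict:
    # Placeholder for a real parsing and extraction engine (assumption):
    # it only records the type and size of the resource.
    return {"type": "pdf", "size": len(data)}


# Registry mapping a resource type to its parser; only PDF is registered,
# mirroring the single format currently supported.
PARSERS = {"pdf": parse_pdf}


def select_parser(data: bytes):
    kind = detect_type(data)
    parser = PARSERS.get(kind)
    if parser is None:
        raise ValueError(f"unsupported resource type: {kind}")
    return parser
```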
If a resource is parsed successfully, the system
attempts to extract further information from it: the
title, authors, keywords and an abstract. Optionally,
the publisher and other information used for
citations and referencing can be included in the
system. If the extraction does not succeed, the
information can be filled in manually using the user
component interface in the internet browser. The
resource is then processed against the identified
terms and the normalized term frequencies are
computed.
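The normalization scheme is not specified here. As an
illustration only, the sketch below assumes that the
frequency of each term is its number of occurrences
divided by the total number of tokens in the resource;
the function name normalized_frequencies and the
simple tokenizer are assumptions as well.

```python
import re
from collections import Counter


def normalized_frequencies(text, terms):
    # Assumed tokenizer: lowercase alphanumeric runs; terms must be lowercase.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {term: 0.0 for term in terms}
    # Assumed normalization: occurrences divided by the document length.
    return {term: counts[term] / total for term in terms}
```

For the text "web search and web ranking" and the
terms "web", "ranking" and "profile", this yields 0.4,
0.2 and 0.0 respectively.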
2.4 Recommendation Component
The recommendation component is the core of the
system. It acts as a server providing services to
the other components, namely the user component.
The main task of the recommendation component is
twofold: first, to obtain the necessary information
about a resource and the rating information of a
particular user and to store such data in the
database; second, to infer the rank of the searched
documents according to the preferences of users.
If a request for recommending possibly interesting
resources is sent, the recommendation component
computes the similarity between the user
preferences stored in the user profile and the
resources stored in the database or obtained from the
search engine. In this way the recommendation is
based on the content of the resources and the user
preferences, i.e. content filtering (Herlocker 2004).
The weight of each recommended resource is given
by the similarity function.
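The similarity function itself is not given here. A
common choice for term-weight vectors is the cosine
similarity, which is assumed in the sketch below; the
names cosine_similarity and recommend, the dictionary
representation of profiles and resources, and the
threshold parameter are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    # a and b map terms to weights, e.g. normalized term frequencies.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profile, resources, threshold=0.0):
    # Score every resource against the profile, drop those at or below
    # the threshold, and rank the rest by decreasing similarity.
    scored = [
        (name, cosine_similarity(profile, vector))
        for name, vector in resources.items()
    ]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

The returned score plays the role of the weight of a
recommended resource; resources sharing no terms with
the profile receive a similarity of zero and are
filtered out.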
The recommendation can also be based on the
similarities between the preferences of different
users: the users with similar preferences are
determined and the system then recommends the
resources that those similar users have rated as
interesting, i.e. collaborative filtering
(Herlocker 2004).
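A minimal sketch of this user-based approach, under
several assumptions that are not taken from the
system itself: ratings are stored as +1 (interesting)
or -1 (not interesting), users are compared by the
cosine similarity of their rating vectors, and a
candidate resource is scored by the summed similarity
of the users who rated it as interesting. The names
user_similarity and collaborative_recommend are
hypothetical.

```python
import math


def user_similarity(a, b):
    # a and b map resource identifiers to ratings (+1 or -1, assumed).
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[r] * b[r] for r in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def collaborative_recommend(target, ratings):
    own = ratings[target]
    scores = {}
    for user, rated in ratings.items():
        if user == target:
            continue
        similarity = user_similarity(own, rated)
        if similarity <= 0.0:
            continue  # ignore dissimilar users
        for resource, value in rated.items():
            # Recommend only resources the target has not rated yet and
            # that the similar user rated as interesting.
            if resource not in own and value > 0:
                scores[resource] = scores.get(resource, 0.0) + similarity
    return sorted(scores, key=scores.get, reverse=True)
```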
2.5 User Interface
As stated above, the system was built with usability
in mind. Figure 2 shows the MS Internet Explorer
window with the recommendation system user
interface. First, the toolbar provides the means for
rating documents. The rating can be done explicitly,
or the user can enable implicit rating based on the
time spent on a page and other patterns (Herlocker
2004). The bar on the left-hand side then serves for
recommendation or assisted search. The list of
recommended resources is displayed as hyperlinks
so that they can be used instantly.
RECOMMENDATION OF WEB RESOURCES FOR ACADEMICS - Architecture and Components