preferences and produce news from the users'
commonly requested locations. The browser was
able to work autonomously in feeding news to users
on the basis of the location and news category of the
user’s past browsing profile. Adding a location
dimension to the browsing history of each user
therefore appears likely to improve news browser
performance.
5 CONCLUSIONS
Personalisation has become an urgent need because
users need to manage the massive data explosion in
all information-based systems, particularly in web
applications. Websites have therefore started to
offer personalisation services to their users,
particularly in online news systems. To be
efficient, a personalisation system needs four
features: autonomy, adaptation to changes in users'
behaviour, acceptable performance, and a satisfactory
match to user requirements.
ANP is a prototype system designed to provide
personalised online news that meets the key features
of personalisation outlined above, without affecting
retrieval performance. The prototype provides a
systematic method for managing personalisation by
using web usage mining.
The results of implementing the prototype can be
summarised as follows:
• The system was able to connect to web log
files and transform the delimited values into a
table of columns and rows.
• The raw log data was successfully cleaned of
noise using relatively simple transformations.
Columns that were not required were not
selected, and unrelated rows such as file
headings, image requests, and records of
unsuccessful requests were filtered out using
several transformations.
• Users were identified by their IP addresses,
and browsing time was divided into sessions
using a set of transformations.
• After the data was preprocessed, it was
aggregated by user IP, news category,
location, and session.
• The Microsoft clustering algorithm was
applied successfully to the aggregated data
and produced a set of clusters. The
clustering was efficient, and the capabilities
provided by SQL Server 2005 allowed the
clustering results to be refined further.
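The preprocessing steps listed above can be illustrated with a minimal Python sketch. It is not the prototype's implementation, which was built from SQL Server 2005 transformations. The log layout (date, time, client IP, URI, status), the /news/<category>/<location>/ URL pattern, the sample records, and the 30-minute session gap are all assumptions made for illustration. The clustering step is omitted because it depends on the Microsoft clustering algorithm.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical sample in a simplified space-delimited layout:
# date time client-ip uri status. Lines starting with '#' are file headings.
SAMPLE_LOG = """\
#Fields: date time c-ip cs-uri-stem sc-status
2007-11-19 10:00:00 10.0.0.1 /news/sport/london/a1.html 200
2007-11-19 10:00:01 10.0.0.1 /img/logo.gif 200
2007-11-19 10:05:00 10.0.0.1 /news/sport/london/a2.html 200
2007-11-19 10:06:00 10.0.0.2 /news/politics/paris/b1.html 404
2007-11-19 11:30:00 10.0.0.1 /news/business/london/c1.html 200
2007-11-19 11:31:00 10.0.0.2 /news/politics/paris/b2.html 200
"""

IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")
SESSION_GAP = timedelta(minutes=30)  # assumed inactivity threshold


def parse_and_clean(text):
    """Turn delimited log lines into row dicts, dropping noise rows."""
    rows = []
    for line in text.splitlines():
        if not line or line.startswith("#"):      # file headings
            continue
        date, time, ip, uri, status = line.split()
        if uri.lower().endswith(IMAGE_SUFFIXES):  # image requests
            continue
        if status != "200":                       # unsuccessful records
            continue
        parts = uri.strip("/").split("/")
        if len(parts) < 4 or parts[0] != "news":  # not a news page
            continue
        rows.append({
            "ip": ip,
            "when": datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S"),
            "category": parts[1],   # assumed URL position of the category
            "location": parts[2],   # assumed URL position of the location
        })
    return rows


def assign_sessions(rows):
    """Number each user's sessions; a new one starts after SESSION_GAP idle."""
    last_seen, session_no = {}, {}
    for row in sorted(rows, key=lambda r: (r["ip"], r["when"])):
        ip = row["ip"]
        if ip not in last_seen or row["when"] - last_seen[ip] > SESSION_GAP:
            session_no[ip] = session_no.get(ip, 0) + 1
        last_seen[ip] = row["when"]
        row["session"] = session_no[ip]
    return rows


def aggregate(rows):
    """Count page views per (ip, session, category, location)."""
    return Counter(
        (r["ip"], r["session"], r["category"], r["location"]) for r in rows
    )


summary = aggregate(assign_sessions(parse_and_clean(SAMPLE_LOG)))
for key, views in sorted(summary.items()):
    print(key, views)
```

Each key in the resulting table pairs a user and session with a news category and location, which is the kind of aggregated record that a clustering step could take as input.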
The developed prototype performed the main system
tasks autonomously, but not all tasks, because the
system was not deployed in a live scenario and
several issues remain to be addressed before this can
be done. Furthermore, adaptation requires a large
volume of log files and other resources to be
implemented in a real context, which was outside the
scope of this project.