evaluation comprised of citations from J.UCS to
J.UCS papers. We performed this experiment in
May 2008. A total of 92 unique papers were cited
by other J.UCS papers, and these 92 papers received
151 citations. CiteSeer indexed 67 (73%) of the 92
focused papers but found only 38 (25%) of the 151
citations. This shortfall was due to citations that
were in non-compliant formats in the original
papers. Our local heuristics employed for J.UCS
gave better results.
Our technique was able to disambiguate authors
by looking for the author’s full name in the text of
the paper. This approach also avoids mistaking
names of places for authors of scientific
publications, as discussed earlier.
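The full-name check described above can be illustrated with a minimal sketch. It is not the implementation used in our experiments: the function names, the whitespace normalization, and the example author "Jane Arbor" are assumptions made for illustration only.

```python
import re

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so a line break inside a name
    (common in text extracted from PDF) does not hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()

def mentions_full_name(paper_text: str, full_name: str) -> bool:
    """Return True only if the complete name occurs as a whole phrase.

    Requiring the full name, rather than a surname alone, is what keeps
    a place name from being accepted as an author (assumed reading of
    the heuristic).
    """
    pattern = r"\b" + re.escape(normalize(full_name)) + r"\b"
    return re.search(pattern, normalize(paper_text)) is not None

# Hypothetical inputs: the second text contains only a place name.
print(mentions_full_name("Hermann\nMaurer, Graz University of Technology",
                         "Hermann Maurer"))
print(mentions_full_name("Workshop held in Ann Arbor, Michigan",
                         "Jane Arbor"))
```

A surname-only match would accept the second text because it contains "Arbor"; the full-name requirement rejects it.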
Table 2 highlights the retrieval results for Links into
the Future that were determined from the Web for the
paper “Digital Libraries as Learning and Teaching
Support” published in J.UCS vol. 1, issue 11.
When a user performs a query on a search engine,
he/she is returned millions of hits, as can be
seen in Table 1. The best formulated query for
finding PDF/PS/Doc documents was applied to
reduce the results to a few hundred documents.
Removing duplicates then reduced
the number of documents by up to 50%. Our
heuristic rules identified 12 out of 75 of
Google’s results and 19 out of 86 of Yahoo’s
results as papers. The similarity based on key
terms was then applied to select 17 out of 23.
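The last two filtering stages, duplicate removal and key term similarity, can be sketched as follows. This is only an illustration under stated assumptions: the title-based duplicate key, the Jaccard measure, the threshold of 0.2, and the sample hits are hypothetical choices and are not taken from our system.

```python
import re

def normalize_title(title: str) -> str:
    """Reduce a title to lowercase alphanumeric words for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def remove_duplicates(hits):
    """Keep the first hit for each normalized title (assumed duplicate key)."""
    seen = set()
    unique = []
    for hit in hits:
        key = normalize_title(hit["title"])
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique

def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two term sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def select_related(source_terms, candidates, threshold=0.2):
    """Keep candidates whose key terms overlap enough with the source paper.
    The threshold value is an assumption, not a tuned parameter."""
    source = {t.lower() for t in source_terms}
    return [c for c in candidates
            if jaccard(source, {t.lower() for t in c["key_terms"]}) >= threshold]

# Hypothetical search hits; the second is a duplicate of the first.
hits = [
    {"title": "Digital Libraries in Education",
     "key_terms": ["digital library", "teaching", "learning"]},
    {"title": "Digital libraries in education.",
     "key_terms": ["digital library", "teaching", "learning"]},
    {"title": "Sorting Networks", "key_terms": ["sorting", "hardware"]},
]
unique = remove_duplicates(hits)
related = select_related(
    ["digital library", "learning", "teaching", "hypermedia"], unique)
print(len(hits), len(unique), len(related))
```

Any set-based or vector-based similarity measure could take the place of the Jaccard overlap here; it was chosen for the sketch only because it needs no external library.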
A user has the option of exploring citation
indexes to search for related papers, but there are
two issues: 1) a paper may not exist in
these citation indexes; for example, the source
paper in our case study was not indexed by CiteSeer,
while Google Scholar indexes it but suggests
hundreds of related papers; 2) a deliberate effort
is thus needed to find related papers outside the
user’s local context.
6 CONCLUSIONS
In this paper we have described the extension of the
idea of Links into the Future to cover documents on
the Web. The results are promising in providing
candidates for future links. The key term similarity
detection further narrowed the candidates to the
most relevant papers. As future work, we are currently
developing a tool based on sentiment analysis of
citations to evaluate the context of citations. We are
also exploring the discovery of future related
papers from digital libraries such as DBLP and
CiteSeer.
WEBIST 2009 - 5th International Conference on Web Information Systems and Technologies