results. The next step in the research is to create a
set of Web pages that represents each genre. This
process will produce clusters (sets) of Web pages
that resemble the content, structure, and
functionality of the corresponding genres. Each set
will go through several refinements before it is
accepted as a genre representative. These
refinements aim to minimize the time required for
classification and clustering at query time while
maintaining a satisfactory level of classification
accuracy.
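One way to picture how precomputed genre representatives could keep query-time cost low is sketched below. This is only an illustration under stated assumptions, not the method the research will adopt: it uses page content alone (bag-of-words term counts), whereas the representatives described above are also meant to capture structure and functionality. The genre names, the toy page texts, and the helper names (term_vector, centroid, cosine, classify) are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words term counts for one page (content only)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def centroid(pages):
    """Average term vector of one genre's representative set of pages."""
    total = Counter()
    for page in pages:
        total.update(term_vector(page))
    return {term: count / len(pages) for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(page, centroids):
    """Assign a page to the genre whose centroid it most resembles."""
    vec = term_vector(page)
    return max(centroids, key=lambda genre: cosine(vec, centroids[genre]))


# Hypothetical toy representative sets; real sets would be refined offline.
REPRESENTATIVES = {
    "shop": ["add to cart price checkout shipping",
             "buy now price cart discount"],
    "faq": ["question answer how do i",
            "frequently asked question answer help"],
}

# Centroids are computed once, ahead of time, so that each search result
# needs only one similarity computation per genre at query time.
CENTROIDS = {genre: centroid(pages) for genre, pages in REPRESENTATIVES.items()}

if __name__ == "__main__":
    print(classify("what is the price and shipping cost, add to cart", CENTROIDS))
    print(classify("how do i find an answer to my question", CENTROIDS))
```

In this sketch, refining a representative set only changes the offline step (the contents of REPRESENTATIVES); the query-time work stays proportional to the number of genres rather than to the number of representative pages.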
After the Web genre representatives are selected, a
user study will further investigate the accuracy of
genre-based classification. The study will also
examine user engagement with genre-based
clustering and the effectiveness of this approach. It
will show the extent to which users are satisfied
with genre-based clustering compared to topical
clustering and row presentations of Web search
results. Further research may analyze Web page
genres in more depth to include other subgenres.
5 CONCLUSIONS
Given that Web genres may yield more effective
classification of Web documents (Rosso, 2005), this
research investigates the feasibility of classifying
Web search results by genre. The ultimate goal is to
provide more effective search results to the user. The
remaining stages of the research will involve
creating sets of Web pages that represent each genre
for the purpose of classification. In addition, the
clustering of Web search results by genre will be
investigated in a user study that compares
genre-based clustering to topical clustering.
REFERENCES
Alhenshiri, A., Brooks, S., Watters, C., Shepherd, M.,
2010. Augmenting the Visual Presentation of Web
Search Results. In Proceedings of the 5th International
Conference on Digital Information Management,
Thunder Bay, ON, Canada, (to appear).
Carpineto, C., Osiński, S., Romano, G., Weiss, D., 2009. A
Survey of Web Clustering Engines. ACM Computing
Surveys, vol. 41, issue 3, Article No. 17.
Levering, R., Cutler, M., Yu, L., 2008. Using Visual
Features for Fine-Grained Genre Classification of Web
Pages. In Proceedings of the 41st Annual Hawaii
International Conference on System Sciences, Hawaii,
USA, 131.
Manning, C. D., Raghavan, P., Schütze, H., 2008.
Introduction to Information Retrieval. Cambridge
University Press.
Mason, J. E., Shepherd, M., Duffy, J., 2009. An N-Gram
Based Approach to Automatically Identifying Web
Page Genre. HICSS 2009: 1-10.
Rosso, A. M., 2005. What type of page is this?: Genre as
Web Descriptor. In Proceedings of the 5th
ACM/IEEE-CS Joint Conference on Digital Libraries,
Denver, CO, USA, 398.
Stubbe, A., Ringlstetter, C., Zheng, T., Goebel, R., 2007.
Incremental Genre Classification. In Proceeding of
Colloquium held in conjunction with Corpus
Linguistics, Birmingham, UK.
Santini, M., 2006. Interpreting Genre Evolution on the
Web. In EACL 2006 Workshop: NEW TEXT - Wikis
and blogs and other dynamic text sources, Trento, 32-
40.
Santini, M., Sharoff, S., 2009. Web Genre Benchmark
Under Construction. Journal for Language
Technology and Computational Linguistics (JLCL).
Volume 25, Number 1- Special Issue: Automatic
Genre Identification: Issues, and Prospects.
Teevan, J., 2008. How People Recall, Recognize, and
Reuse Search Results. ACM Transactions on
Information Systems, vol. 26, issue 4. Article No. 19.
Turetken, O., Sharda, R., 2005. Clustering-based Visual
Interfaces for Presentation of Web Search Results: An
Empirical Investigation. Information Systems Frontiers,
7(3), 273-297.
ONLINE WEB GENRE CLASSIFICATION, IS IT DOABLE?