Author: Ari Pirkola

Affiliation: University of Tampere, Finland

ISBN: 978-989-8565-08-2

Keyword(s): Dictionaries, Focused Crawling, Query Performance Prediction, Searching, Vertical Search Engines, Web Search Engines.

Related Ontology Subjects/Areas/Topics: Searching and Browsing; Web Information Systems and Technologies; Web Interfaces and Applications

Abstract: The contributions of this paper are twofold. First, we present a new type of dictionary intended as a search aid in topic-specific Web searching. The construction method is general and can be applied to any reasonable topic; the first implementation deals with climate change. Compared to standard dictionaries and thesauri, the dictionary has the following new features: (A) It contains real-text phrases (e.g., rising sea levels) in addition to the standard dictionary forms (sea-level rise). The phrases were extracted automatically from pages dealing with climate change, and are thus known to appear in pages discussing climate change issues when used as search terms. (B) Synonyms, i.e., different spelling, syntactic, and short-form variants of a phrase, are grouped together into the same entry (synonym set) using approximate string matching. (C) Each phrase is assigned an importance score (IS), which is calculated from the frequencies of the phrase in relevant pages (i.e., pages on climate change) and non-relevant pages. Second, we investigate how effective the IS is for indicating the best phrase among synonymous phrases and for indicating effective phrases in general from the viewpoint of search results. The experimental results showed that the best phrases have higher ISs than the other phrases of a synonym set, and that the higher the IS, the better the search results. This paper also describes the crawler used to fetch the source data for the climate change dictionary and discusses the benefits of using the dictionary in Web searching.
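The abstract names two mechanisms without giving their details: grouping phrase variants by approximate string matching, and scoring phrases by their frequencies in relevant versus non-relevant pages. The following is a minimal sketch of how such steps might look, not the paper's actual method. The similarity measure (Python's difflib ratio), the 0.8 threshold, the greedy grouping strategy, and the add-one smoothed log-ratio used for the score are all assumptions made for illustration; the function names are hypothetical.

```python
import math
from difflib import SequenceMatcher


def normalize(phrase: str) -> str:
    """Lowercase, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(phrase.lower().replace("-", " ").split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two normalized phrases."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_variants(phrases, threshold=0.8):
    """Greedily group phrases by similarity to each group's first member.

    The threshold is an assumed cutoff, not a value taken from the paper.
    """
    groups = []
    for phrase in phrases:
        for group in groups:
            if similarity(phrase, group[0]) >= threshold:
                group.append(phrase)
                break
        else:
            # No existing group was close enough: start a new synonym set.
            groups.append([phrase])
    return groups


def importance_score(freq_relevant: int, freq_nonrelevant: int) -> float:
    """Assumed stand-in for the IS: add-one smoothed log frequency ratio."""
    return math.log((freq_relevant + 1) / (freq_nonrelevant + 1))


if __name__ == "__main__":
    phrases = [
        "sea level rise",
        "sea-level rise",
        "sea level rises",
        "greenhouse gases",
        "green house gases",
    ]
    for group in group_variants(phrases):
        print(group)
    # A phrase concentrated in relevant pages scores higher under this stand-in.
    print(importance_score(50, 2) > importance_score(5, 2))
```

With these inputs the spelling and hyphenation variants fall into two groups, one per concept. Word-order variants such as rising sea levels versus sea-level rise fall below the assumed cutoff under plain character matching, so handling the syntactic variants mentioned in the abstract would need something beyond this sketch.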

CC BY-NC-ND 4.0

Paper citation in several formats:
Pirkola, A. (2012). TOPIC-SPECIFIC WEB SEARCHING BASED ON A REAL-TEXT DICTIONARY. In Proceedings of the 8th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8565-08-2, pages 289-298. DOI: 10.5220/0003895602890298

@conference{webist12,
author={Ari Pirkola},
title={TOPIC-SPECIFIC WEB SEARCHING BASED ON A REAL-TEXT DICTIONARY},
booktitle={Proceedings of the 8th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2012},
pages={289-298},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003895602890298},
isbn={978-989-8565-08-2},
}

TY  - CONF
JO  - Proceedings of the 8th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI  - TOPIC-SPECIFIC WEB SEARCHING BASED ON A REAL-TEXT DICTIONARY
SN  - 978-989-8565-08-2
AU  - Pirkola, A.
PY  - 2012
SP  - 289
EP  - 298
DO  - 10.5220/0003895602890298
ER  - 
