Authors: Balázs Pintér ; Gyula Vörös ; Zoltán Szabó and András Lőrincz

Affiliation: Eötvös Loránd University, Hungary

Keyword(s): Unintelligible Words, Wikification, Link Disambiguation, Natural Language Processing, Structured Sparse Coding.

Related Ontology Subjects/Areas/Topics: Applications ; Artificial Intelligence ; Knowledge Engineering and Ontology Development ; Knowledge-Based Systems ; Natural Language Processing ; Pattern Recognition ; Sparsity ; Symbolic Systems ; Theory and Methods

Abstract: Explaining unintelligible words is a practical problem for text obtained by optical character recognition, from the Web (e.g., because of misspellings), etc. Approaches to wikification, to enriching text by linking words to Wikipedia articles, could help solve this problem. However, existing methods for wikification assume that the text is correct, so they are not capable of wikifying erroneous text. Because of errors, the problem of disambiguation (identifying the appropriate article to link to) becomes large-scale: as the word to be disambiguated is unknown, the article to link to has to be selected from among hundreds, maybe thousands of candidate articles. Existing approaches for the case where the word is known build upon the distributional hypothesis: words that occur in the same contexts tend to have similar meanings. The increased number of candidate articles makes the difficulty of spuriously similar contexts (when two contexts are similar but belong to different articles) more severe. We propose a method to overcome this difficulty by combining the distributional hypothesis with structured sparsity, a rapidly expanding area of research. Empirically, our approach based on structured sparsity compares favorably to various traditional classification methods.
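
Illustrative sketch (not the authors' implementation): the abstract does not state the exact formulation, so the code below assumes one common form of structured sparsity, the group lasso, to show how contexts grouped by candidate article could be used for disambiguation. The context of the unknown word is reconstructed from example contexts of known links, and the article whose group of contexts carries the most weight is selected. All names (`group_sparse_code`, `predict_article`, the toy matrix `D`, the penalty `lam`) are hypothetical and chosen only for this example.

```python
import numpy as np

def group_soft_threshold(a, groups, t):
    """Proximal operator of t * sum_g ||a_g||_2 (block soft-thresholding)."""
    out = np.zeros_like(a)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(a[idx])
        if norm > t:
            # Shrink the whole group; groups with small norm stay exactly zero.
            out[idx] = (1.0 - t / norm) * a[idx]
    return out

def group_sparse_code(D, x, groups, lam=0.1, n_iter=500):
    """Minimise 0.5*||x - D a||^2 + lam * sum_g ||a_g||_2 by proximal gradient."""
    # Step size 1/L, where L is the squared spectral norm of D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = group_soft_threshold(a - step * grad, groups, step * lam)
    return a

def predict_article(D, x, groups, **kwargs):
    """Return the group (candidate article) with the largest coefficient norm."""
    a = group_sparse_code(D, x, groups, **kwargs)
    labels = np.unique(groups)
    norms = [np.linalg.norm(a[groups == g]) for g in labels]
    return labels[int(np.argmax(norms))]

# Toy data (assumed): rows are 6 vocabulary words, columns are example
# contexts, two per candidate article, stored as bag-of-words vectors.
D = np.array([
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.1],
    [1.0, 0.5, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.5],
    [0.0, 0.2, 0.0, 0.0, 1.0, 1.0],
])
D = D / np.linalg.norm(D, axis=0)      # unit-length context vectors
groups = np.array([0, 0, 1, 1, 2, 2])  # candidate article of each column

# Context of the unintelligible word, dominated by the words of article 1.
x = np.array([0.0, 0.1, 0.9, 0.8, 0.0, 0.0])
x = x / np.linalg.norm(x)

predicted = predict_article(D, x, groups)
print(predicted)
```

The group penalty switches whole articles on or off rather than individual contexts, so a single spuriously similar context from an unrelated article is less likely to decide the outcome than it would be under a nearest-neighbour rule.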

CC BY-NC-ND 4.0

Paper citation in several formats:
Pintér, B., Vörös, G., Szabó, Z. and Lőrincz, A. (2013). Explaining Unintelligible Words by Means of their Context. In Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - ICPRAM; ISBN 978-989-8565-41-9; ISSN 2184-4313, SciTePress, pages 382-387. DOI: 10.5220/0004267003820387

@conference{icpram13,
author={Balázs Pintér and Gyula Vörös and Zoltán Szabó and András Lőrincz},
title={Explaining Unintelligible Words by Means of their Context},
booktitle={Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - ICPRAM},
year={2013},
pages={382-387},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004267003820387},
isbn={978-989-8565-41-9},
issn={2184-4313},
}

TY - CONF

JO - Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - ICPRAM
TI - Explaining Unintelligible Words by Means of their Context
SN - 978-989-8565-41-9
IS - 2184-4313
AU - Pintér, B.
AU - Vörös, G.
AU - Szabó, Z.
AU - Lőrincz, A.
PY - 2013
SP - 382
EP - 387
DO - 10.5220/0004267003820387
PB - SciTePress