Authors: Julián Grigera 1,2,3; Juan Cruz Gardey 2,3; Alejandra Garrido 2,3 and Gustavo Rossi 2,3

Affiliations: 1 CICPBA, Argentina; 2 CONICET, Argentina; 3 LIFIA, Facultad de Informática, Universidad Nacional de La Plata, La Plata, CP 1900, Argentina

Keyword(s): Information Extraction, Web Adaptation, Refactoring for Usability.

Abstract: Most documents in the WWW are generated from templates that represent user interface (UI) elements, and later filled with contents. In the field of information extraction, many approaches emerged to analyze the documents’ structure, obtain similar features amongst them, and generate wrappers that are used to extract the raw contents from such documents. Therefore, most techniques documented in the literature are optimized to compare full documents, but there are other fields of applicability that require analyzing structural similarity on smaller UI components, like web augmentation or transcoding. In this paper we present two flexible algorithms to measure similarity between DOM elements by using a mixed approach that considers both the elements’ location and their inner structure. The proposed algorithms were used in the context of two projects: an approach for automatic usability refactoring, and a web accessibility helper. We also present a wrapper induction technique based on such algorithms. Additionally, we present a precision and recall evaluation of our algorithms as compared with other known approaches, applied to DOM elements of different sizes, but smaller than full-scale documents. The proposed algorithms run in linear time, so they are faster than most approaches that analyze structural similarity.
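
To make the idea of a "mixed approach" concrete, the following is a minimal illustrative sketch in Python of how a location score and an inner-structure score for two DOM elements could be combined. It is not the scoring map algorithm from the paper: the `Node` class, the function names, the overlap measures and the `weight` parameter are all assumptions introduced only for this example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a DOM element: a tag name plus child elements."""
    tag: str
    children: list["Node"] = field(default_factory=list)


def location_similarity(path_a: list[str], path_b: list[str]) -> float:
    """Share of positions (counted from the root) where both tag paths agree."""
    longest = max(len(path_a), len(path_b))
    if longest == 0:
        return 1.0
    matches = sum(1 for a, b in zip(path_a, path_b) if a == b)
    return matches / longest


def structure_profile(node: Node) -> Counter:
    """Count the relative tag paths found in the subtree, using one traversal."""
    counts: Counter = Counter()
    stack = [(node, node.tag)]
    while stack:
        current, path = stack.pop()
        counts[path] += 1
        for child in current.children:
            stack.append((child, f"{path}/{child.tag}"))
    return counts


def structure_similarity(a: Node, b: Node) -> float:
    """Multiset overlap (intersection over union) of the two subtree profiles."""
    profile_a, profile_b = structure_profile(a), structure_profile(b)
    shared = sum((profile_a & profile_b).values())
    total = sum((profile_a | profile_b).values())
    return shared / total if total else 1.0


def combined_similarity(path_a: list[str], a: Node,
                        path_b: list[str], b: Node,
                        weight: float = 0.5) -> float:
    """Weighted mix of location and inner structure (weight is an assumed knob)."""
    location = location_similarity(path_a, path_b)
    structure = structure_similarity(a, b)
    return weight * location + (1 - weight) * structure


# Two list items at the same location whose second child differs.
item_a = Node("li", [Node("a"), Node("span")])
item_b = Node("li", [Node("a"), Node("img")])
path = ["html", "body", "ul", "li"]
score = combined_similarity(path, item_a, path, item_b)
```

In this sketch, two elements with identical paths and identical subtrees score 1.0, and the score drops as either the location or the inner structure diverges. How the paper actually weighs the two aspects is not stated in the abstract.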

CC BY-NC-ND 4.0

Paper citation in several formats:
Grigera, J.; Gardey, J.; Garrido, A. and Rossi, G. (2021). A Scoring Map Algorithm for Automatically Detecting Structural Similarity of DOM Elements. In Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST; ISBN 978-989-758-536-4; ISSN 2184-3252, SciTePress, pages 174-185. DOI: 10.5220/0010716300003058

@conference{webist21,
author={Julián Grigera and Juan Cruz Gardey and Alejandra Garrido and Gustavo Rossi},
title={A Scoring Map Algorithm for Automatically Detecting Structural Similarity of DOM Elements},
booktitle={Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST},
year={2021},
pages={174-185},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010716300003058},
isbn={978-989-758-536-4},
issn={2184-3252},
}

TY - CONF
JO - Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST
TI - A Scoring Map Algorithm for Automatically Detecting Structural Similarity of DOM Elements
SN - 978-989-758-536-4
IS - 2184-3252
AU - Grigera, J.
AU - Gardey, J.
AU - Garrido, A.
AU - Rossi, G.
PY - 2021
SP - 174
EP - 185
DO - 10.5220/0010716300003058
PB - SciTePress