Authors: Jun Wang 1 and Kanji Uchino 2

Affiliations: 1 Fujitsu R&D Center Co., Ltd., China ; 2 Fujitsu Laboratories, Ltd., Japan

Keyword(s): RSS, Metadata, Information Extraction, Knowledge Management

Related Ontology Subjects/Areas/Topics: Biomedical Engineering ; Data Engineering ; Enterprise Information Systems ; Health Information Systems ; Information Systems Analysis and Specification ; Internet Technology ; Knowledge Management ; Metadata and Metamodeling ; Ontologies and the Semantic Web ; Society, e-Business and e-Government ; Web Information Systems and Technologies ; Web Interfaces and Applications ; Web Personalization ; XML and Data Management

Abstract: Although RSS is a promising way to track and personalize the flow of new Web information, many current Web sites do not yet offer RSS feeds. Convenient approaches to “RSSify” suitable existing Web content have therefore become a pressing need. This paper presents EHTML2RSS, an efficient system that translates semi-structured HTML pages into structured RSS feeds, using different approaches chosen according to the features of the HTML page. For information items that carry a release time, the system provides an automatic approach based on time pattern discovery. For regular pages without a time pattern, a second automatic approach based on repeated tag pattern discovery is applied. A semi-automatic approach based on labelling is available for irregular pages, or for specific sections of a Web page selected according to the user’s requirements. Experimental results show that the system is efficient and effective in facilitating RSS feed generation.
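
To make the idea of time-pattern-based extraction concrete, the following is a minimal Python sketch. It is not the EHTML2RSS algorithm, whose details are not given in the abstract. It assumes that each news item appears as a date in plain text followed by a link, that dates are written as year, month, day with `-`, `/` or `.` separators, and that all times are UTC. The names `DATE_RE`, `TimedLinkParser` and `to_rss_items` are hypothetical.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# Assumed time pattern: year-month-day, e.g. 2005-03-14 or 2005/3/14.
DATE_RE = re.compile(r"\b(\d{4})[-/.](\d{1,2})[-/.](\d{1,2})\b")


class TimedLinkParser(HTMLParser):
    """Collect (date, title, href) for each link that follows a date in the text."""

    def __init__(self):
        super().__init__()
        self.items = []         # extracted (datetime, title, href) tuples
        self._last_date = None  # most recent date seen outside a link
        self._href = None       # href of the <a> currently open, if any
        self._text = []         # text fragments inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)
            return
        match = DATE_RE.search(data)
        if match:
            year, month, day = (int(g) for g in match.groups())
            try:
                self._last_date = datetime(year, month, day, tzinfo=timezone.utc)
            except ValueError:
                pass  # looks like a date but is not a valid one; ignore it

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if self._last_date is not None and title:
                self.items.append((self._last_date, title, self._href))
                self._last_date = None  # each date pairs with one link only
            self._href = None


def to_rss_items(html):
    """Render every dated link found in `html` as an RSS 2.0 <item> element."""
    parser = TimedLinkParser()
    parser.feed(html)
    parser.close()
    return "\n".join(
        f"<item><title>{escape(title)}</title><link>{escape(href)}</link>"
        f"<pubDate>{format_datetime(date)}</pubDate></item>"
        for date, title, href in parser.items
    )
```

Links that are not preceded by a date, such as navigation entries, are skipped. A page without any time pattern would need a different strategy, which is the case the abstract assigns to repeated tag pattern discovery.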

CC BY-NC-ND 4.0

Paper citation in several formats:
Wang, J. and Uchino, K. (2005). EFFICIENT RSS FEED GENERATION FROM HTML PAGES. In Proceedings of the First International Conference on Web Information Systems and Technologies - WEBIST; ISBN 972-8865-20-1; ISSN 2184-3252, SciTePress, pages 311-318. DOI: 10.5220/0001230103110318

@conference{webist05,
author={Jun Wang and Kanji Uchino},
title={EFFICIENT RSS FEED GENERATION FROM HTML PAGES},
booktitle={Proceedings of the First International Conference on Web Information Systems and Technologies - WEBIST},
year={2005},
pages={311-318},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001230103110318},
isbn={972-8865-20-1},
issn={2184-3252},
}

TY - CONF
JO - Proceedings of the First International Conference on Web Information Systems and Technologies - WEBIST
TI - EFFICIENT RSS FEED GENERATION FROM HTML PAGES
SN - 972-8865-20-1
IS - 2184-3252
AU - Wang, J.
AU - Uchino, K.
PY - 2005
SP - 311
EP - 318
DO - 10.5220/0001230103110318
PB - SciTePress