Figure 2: A comparison between RSS and Atom formats.
2.2.1 RSS 1.0, RSS 2.0 and ATOM 1
Despite having the same acronym, RSS 1.0 and RSS
2.0 are distinct and incompatible formats. RSS 1.0
stands for RDF Site Summary and incorporates the
Resource Description Framework (RDF,
http://www.w3.org/RDF/) and its tags and attributes
to better describe resources. In RSS 1.0 the entire
feed is wrapped in the <rdf:RDF> element. This
element contains a <channel> (the source of
information), with its definition, its attributes and
the list of its items, followed by the <item>
elements, each of which describes one item and its
attributes in detail. The flexibility of the
specification allows metadata to be attached to the
feed by integrating other standards (e.g. the Dublin
Core, http://dublincore.org/) that are useful for
semantic processing, even if they are a bit verbose.
RSS 2.0,
which follows on from the various RSS 0.9x
specifications, was developed by Netscape and later
by UserLand. It stands for Really Simple Syndication
to emphasize its ease of use. In this format, the
feed is described inside the <rss> tag, which
contains a <channel> element holding a set of
attributes (carrying more information than in the
previous format) and then the list of items and
their attributes. These include the standard link,
title and description metadata, plus further
facilities such as <enclosure>, which allows
attachments to be downloaded automatically, and
<guid>, which identifies the item uniquely. Finally
Atom, as
defined by the IETF in its latest version, 1.0, is a
standard which defines both a feed representation
format (the Atom Syndication Format, RFC 4287,
http://www.ietf.org/rfc/rfc4287.txt) and an
interaction protocol (the Atom Publishing Protocol,
an Internet draft,
http://www.ietf.org/internet-drafts/draft-ietf-atompub-protocol-17.txt)
with enhanced interoperability. In the Atom format,
the feed is specified by the <feed> element, which
first describes the channel and its attributes
(without wrapping them in a dedicated tag) and then
specifies each item inside an <entry> tag.
Most feed client applications can handle all of
these formats. A web application that produces
syntactically correct, validated feeds in each of the
different formats can nevertheless ensure the widest
possible delivery of information.
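The structural differences described above can be illustrated with a minimal sketch that recognizes the three formats from the root element and collects the item titles. It is not part of the original work: the function names `detect_format` and `item_titles`, the sample feeds and the example.org URLs are assumptions made for illustration, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RDF, RSS 1.0 and Atom 1.0.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS1_NS = "http://purl.org/rss/1.0/"
ATOM_NS = "http://www.w3.org/2005/Atom"


def detect_format(xml_text):
    """Return 'rss1', 'rss2', 'atom' or 'unknown' from the root element."""
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{RDF_NS}}}RDF":
        return "rss1"
    if root.tag == "rss":
        return "rss2"
    if root.tag == f"{{{ATOM_NS}}}feed":
        return "atom"
    return "unknown"


def item_titles(xml_text):
    """Return the titles of all items (entries) in the feed."""
    root = ET.fromstring(xml_text)
    fmt = detect_format(xml_text)
    if fmt == "rss1":
        # RSS 1.0: <item> elements are children of <rdf:RDF>, beside <channel>.
        items = root.findall(f"{{{RSS1_NS}}}item")
        return [i.findtext(f"{{{RSS1_NS}}}title") for i in items]
    if fmt == "rss2":
        # RSS 2.0: <item> elements are nested inside <channel>.
        return [i.findtext("title") for i in root.findall("channel/item")]
    if fmt == "atom":
        # Atom: <entry> elements are direct children of <feed>.
        entries = root.findall(f"{{{ATOM_NS}}}entry")
        return [e.findtext(f"{{{ATOM_NS}}}title") for e in entries]
    return []


# Hypothetical sample feeds, one per format.
RSS1_SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/feed">
    <title>Example</title>
    <link>http://example.org/</link>
    <description>Demo channel</description>
    <items><rdf:Seq>
      <rdf:li rdf:resource="http://example.org/1"/>
    </rdf:Seq></items>
  </channel>
  <item rdf:about="http://example.org/1">
    <title>First</title>
    <link>http://example.org/1</link>
  </item>
</rdf:RDF>"""

RSS2_SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <link>http://example.org/</link>
    <description>Demo channel</description>
    <item>
      <title>First</title>
      <link>http://example.org/1</link>
      <guid>http://example.org/1</guid>
    </item>
  </channel>
</rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <id>urn:example:feed</id>
  <updated>2008-01-01T00:00:00Z</updated>
  <entry>
    <title>First</title>
    <id>urn:example:1</id>
    <updated>2008-01-01T00:00:00Z</updated>
  </entry>
</feed>"""

if __name__ == "__main__":
    for sample in (RSS1_SAMPLE, RSS2_SAMPLE, ATOM_SAMPLE):
        print(detect_format(sample), item_titles(sample))
```

The sketch checks only the root element and the position of the items; it does not validate a feed against its specification.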
2.3 Feed Processing
Although the standards differ, all feed formats are
XML files and can be managed and processed by
many libraries and tools written in different
programming languages (e.g. PHP MagPie RSS,
http://magpierss.sourceforge.net/, Java ROME,
https://rome.dev.java.net/, or Python RSS.py,
http://www.mnot.net/python/RSS.py). Many are
distributed as on-line tools (for example, scraping
tools such as xpath2rss,
http://freshmeat.net/projects/xpath2rss/, are often
used as web aggregators), although they do not
provide a packaged solution that can be deployed on
each website. However,
the common underlying concept is the extraction of
information, its formatting according to XML syntax,
and the parsing and processing of the result for
visualization inside a Web page or other
application. Focusing on visualization inside a Web
page, the simplest way is to include an external feed
by pointing to an RSS parser, written in any
language, which processes the feed and then
presents the content according to a specific style
defined through CSS technologies (Schmitt, C.,
2006). The choice of the language and the platform
is subjective.
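The parse-then-present step described above might be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this work: the function name `render_rss2_as_html`, the CSS class names and the sample feed are hypothetical, and only RSS 2.0 input is handled.

```python
import html
import xml.etree.ElementTree as ET


def render_rss2_as_html(xml_text, css_class="feed"):
    """Turn the items of an RSS 2.0 feed into an HTML list.

    The class names (css_class and css_class + '-item') are hypothetical
    hooks for a CSS template; they come from no specification.
    """
    root = ET.fromstring(xml_text)
    cls = html.escape(css_class, quote=True)
    parts = [f'<ul class="{cls}">']
    for item in root.findall("channel/item"):
        # Escape every value taken from the feed before embedding it in HTML.
        title = html.escape(item.findtext("title", default=""))
        link = html.escape(item.findtext("link", default=""), quote=True)
        desc = html.escape(item.findtext("description", default=""))
        parts.append(
            f'  <li class="{cls}-item">'
            f'<a href="{link}">{title}</a><p>{desc}</p></li>'
        )
    parts.append("</ul>")
    return "\n".join(parts)


# Hypothetical sample feed used only to exercise the function.
SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <link>http://example.org/</link>
    <description>Demo channel</description>
    <item>
      <title>A &amp; B</title>
      <link>http://example.org/a</link>
      <description>Short text</description>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(render_rss2_as_html(SAMPLE))
```

A stylesheet can then target the generated class names, so the appearance of the feed can change without touching the parsing code.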
3 THE DEVELOPED SOLUTION
After the analysis of constraints and issues, the
developed solution follows the rule of easy
implementation. It requires the design of the
content database, the development of a Web
application for producing the feeds, and the
establishment of a simple visualization procedure by
means of a set of scripts and CSS templates. The
LAMP platform has been chosen for the first two
phases, while other more interactive technologies,
collectively known as the Ajax paradigm (Gross, C.,
2006), have been adopted for the visualization
phase. The MySQL schema design requires more
work, since it must include all the attributes
related to each feed specification that are needed
for subsequent validation. The
WEBIST 2008 - International Conference on Web Information Systems and Technologies