2014). Datasets are described by general metadata,
e.g. publication date, keywords, language (Maali
et al., 2014), and other information that makes it easier to discover and use a dataset.
A single dataset may be available in different Dis-
tributions to meet interoperability and usage require-
ments of different users. These distributions might
represent different formats of the dataset or differ-
ent endpoints (Maali et al., 2014). Examples of dis-
tributions include a downloadable CSV file, an API
or an RSS feed. Distributions are also described
by metadata. The format property (dct:format), for example, specifies the data format used in the distribution. To ensure interoperability, the format should be a standard MIME type, such as text/csv or application/json. The accessURL property (dcat:accessURL) identifies a resource that gives access to a distribution of the dataset, such as a landing page or a SPARQL endpoint. The downloadURL property (dcat:downloadURL) points to a file that contains the distribution of the dataset in a given format. Other
relevant metadata about a distribution include the size
of a distribution in bytes (dcat:byteSize) and the li-
cense (dct:license) under which the distribution is
made available.
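To make the DCAT terms above concrete, the following sketch (Python with the rdflib library) builds metadata for a hypothetical dataset with a single CSV distribution; all URIs and values are invented for illustration.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

# Namespaces used by the DCAT metadata.
DCAT = Namespace("http://www.w3.org/ns/dcat#")
DCT = Namespace("http://purl.org/dc/terms/")

g = Graph()
g.bind("dcat", DCAT)
g.bind("dct", DCT)

# Hypothetical identifiers, used only for this example.
dataset = URIRef("http://example.org/dataset/traffic")
dist = URIRef("http://example.org/dataset/traffic/csv")

# General dataset metadata: title, keyword and publication date.
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCT.title, Literal("City traffic measurements")))
g.add((dataset, DCAT.keyword, Literal("traffic")))
g.add((dataset, DCT.issued, Literal("2016-01-15", datatype=XSD.date)))

# A downloadable CSV distribution of the dataset.
g.add((dataset, DCAT.distribution, dist))
g.add((dist, RDF.type, DCAT.Distribution))
g.add((dist, DCT.format, Literal("text/csv")))
g.add((dist, DCAT.accessURL, URIRef("http://example.org/dataset/traffic/page")))
g.add((dist, DCAT.downloadURL, URIRef("http://example.org/files/traffic.csv")))
g.add((dist, DCAT.byteSize, Literal(1048576, datatype=XSD.decimal)))
g.add((dist, DCT.license, URIRef("http://creativecommons.org/licenses/by/4.0/")))

print(g.serialize(format="turtle"))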
Finally, a Catalog is defined as a curated collec-
tion of metadata about datasets. Catalogs are also de-
scribed by several properties, including: date of pub-
lication, date of modification, language used in the textual metadata describing datasets, and the entity responsible for the online publication of the catalog (Maali
et al., 2014). A record in a data catalog, describing
a single dataset, is represented as an instance of the
CatalogRecord class.
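The catalog level can be modeled in the same way. The sketch below (same assumptions and invented URIs as before) records a catalog, its publication and modification dates, and a dcat:CatalogRecord that describes the dataset above through foaf:primaryTopic.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, FOAF, XSD

DCAT = Namespace("http://www.w3.org/ns/dcat#")
DCT = Namespace("http://purl.org/dc/terms/")

g = Graph()
g.bind("dcat", DCAT)
g.bind("dct", DCT)

catalog = URIRef("http://example.org/catalog")
record = URIRef("http://example.org/catalog/record/traffic")
dataset = URIRef("http://example.org/dataset/traffic")

# Catalog-level metadata: publisher, language and dates.
g.add((catalog, RDF.type, DCAT.Catalog))
g.add((catalog, DCT.publisher, URIRef("http://example.org/agency")))
g.add((catalog, DCT.language, Literal("en")))
g.add((catalog, DCT.issued, Literal("2016-01-01", datatype=XSD.date)))
g.add((catalog, DCT.modified, Literal("2016-03-01", datatype=XSD.date)))

# One catalog record describing a single dataset.
g.add((catalog, DCAT.record, record))
g.add((record, RDF.type, DCAT.CatalogRecord))
g.add((record, FOAF.primaryTopic, dataset))
g.add((catalog, DCAT.dataset, dataset))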
In our work, besides concepts offered by DCAT,
we also use other vocabulary proposals to model feed-
back collected from data consumers about specific
datasets. In general, data cataloguing solutions do not offer means to collect and share feedback about datasets. However, gathering such information benefits both producers and consumers of data, helping to improve the quality of the published data and encouraging the publication of new data.
The feedback data is modeled using two vocabularies developed by the W3C: the Dataset Usage Vocabulary (http://www.w3.org/TR/vocab-duv/) and the Web Annotation Data Model (http://www.w3.org/TR/annotation-model/). The former offers concepts to describe how datasets published on the Web have been used, as well as concepts to capture data consumers' feedback and dataset citations. The latter describes a structured model and format that enables annotations to be shared and reused across different platforms.
Figure 2 presents the main classes used to model
feedback. duv:UsageFeedback allows capturing consumers' feedback in the form of annotations. As described in the Web Annotation Data Model (Sanderson et al., 2015), an Annotation is a rooted, directed graph that represents a relationship between resources, whose primary types are Bodies and Targets. An Annotation (oa:Annotation) has a single Body, which is a comment or other descriptive resource, and a single Target, which refers to the resource being annotated. In our context, the target resource is a dataset or a distribution. The body may be a general comment about the dataset, a suggestion to correct or update the dataset, a dataset quality evaluation or a dataset rating. The property oa:motivation may be used to explicitly capture the motivation behind a given piece of feedback. For example, when the consumer suggests a dataset update, the value of oa:motivation will be “reviewing”.
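As a rough sketch of how such an annotation could be represented, the code below (again Python/rdflib, with invented URIs) types a piece of feedback as both duv:UsageFeedback and oa:Annotation, attaches a textual body and a dataset target, and records a motivation. The property names oa:hasBody, oa:hasTarget and oa:motivatedBy follow the Web Annotation vocabulary; the class name duv:UsageFeedback follows the text, and the motivation value is only one plausible choice.

from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

# Vocabulary namespaces for the Web Annotation model and the Dataset Usage Vocabulary.
OA = Namespace("http://www.w3.org/ns/oa#")
DUV = Namespace("http://www.w3.org/ns/duv#")

g = Graph()
g.bind("oa", OA)
g.bind("duv", DUV)

# Hypothetical URIs for the feedback annotation, its body and the annotated dataset.
feedback = URIRef("http://example.org/feedback/42")
body = URIRef("http://example.org/feedback/42/body")
dataset = URIRef("http://example.org/dataset/traffic")

# The feedback is both a usage feedback (DUV) and an annotation (Web Annotation model).
g.add((feedback, RDF.type, DUV.UsageFeedback))
g.add((feedback, RDF.type, OA.Annotation))

# Body: the consumer's comment; Target: the dataset being annotated.
g.add((feedback, OA.hasBody, body))
g.add((body, RDF.type, OA.TextualBody))
g.add((body, RDF.value, Literal("Please update this dataset; the latest release is outdated.")))
g.add((feedback, OA.hasTarget, dataset))

# Motivation for the feedback; oa:editing marks a suggested change (the text's example uses "reviewing").
g.add((feedback, OA.motivatedBy, OA.editing))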
duv:RatingFeedback and dqv:UserQualityFeedback are subclasses of duv:UsageFeedback. The former allows consumers to evaluate a dataset based on a grade or a star scheme, for example, while the latter captures feedback about quality aspects of the dataset, such as availability, consistency and freshness, following the Data Quality Vocabulary (http://www.w3.org/TR/vocab-dqv/).
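A star rating might be captured along the same lines. In the sketch below the rating value is simply carried in the annotation body as an integer literal; this is an illustrative assumption, since the DUV draft may provide more specific rating properties.

from rdflib import BNode, Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

OA = Namespace("http://www.w3.org/ns/oa#")
DUV = Namespace("http://www.w3.org/ns/duv#")

g = Graph()
g.bind("oa", OA)
g.bind("duv", DUV)

rating = URIRef("http://example.org/feedback/43")   # hypothetical feedback URI
body = BNode()                                       # anonymous body node
dataset = URIRef("http://example.org/dataset/traffic")

# A rating feedback: 4 stars on a 1-5 scale, targeting the dataset.
g.add((rating, RDF.type, DUV.RatingFeedback))
g.add((rating, RDF.type, OA.Annotation))
g.add((rating, OA.hasTarget, dataset))
g.add((rating, OA.hasBody, body))
g.add((body, RDF.value, Literal(4, datatype=XSD.integer)))  # illustrative rating value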
Feedback could be captured from data consumers
by rating questionnaires, where users are asked to pro-
vide a value point on a fixed scale (Amatriain et al.,
2009). This interaction model can be viewed as a form of Web collaboration, in the sense that dataset evaluation is accomplished by engaging data consumers as “processors” who annotate datasets according to taxonomy/folksonomy subjects, for example. Moreover, a data consumer can also provide feedback annotations to fill in missing metadata about datasets. Finally, a user may submit a report based on problems they have encountered when consuming the data. In general, the annotation model provides the user with added value for little effort, because it facilitates finding a desired dataset and makes it easier to find similar new resources that could be of interest (Hotho et al., 2006).
Figure 2: Feedback model.