Data publication principles, OntoWiki makes all resources
accessible by their URIs. Furthermore, for each resource used
in OntoWiki, additional triples can be fetched if the resource
is dereferenceable. Pingback is an established notification
system that gained wide popularity in the blogosphere. OntoWiki
adapts the pingback idea to Linked Data, providing a notification
mechanism for resource usage (Tramp et al., 2010a). If a Semantic
Pingback-enabled resource is mentioned (i.e. linked to) by another
party, its pingback server is notified of the usage.
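As an illustration of the notification idea, the following minimal sketch serializes a conventional Pingback call, which the Pingback 1.0 specification defines as the XML-RPC method `pingback.ping(sourceURI, targetURI)`. It only builds the request body; how a Semantic Pingback server is discovered and contacted in OntoWiki is not shown here, and the URIs and the helper name `build_pingback_request` are assumptions made for the example.

```python
import xmlrpc.client


def build_pingback_request(source_uri: str, target_uri: str) -> str:
    """Serialize a pingback.ping call as an XML-RPC request body.

    source_uri: the resource that contains the link (the "mentioning" party).
    target_uri: the resource that is linked to and whose server is notified.
    """
    return xmlrpc.client.dumps(
        (source_uri, target_uri), methodname="pingback.ping"
    )


# Hypothetical URIs, for illustration only.
SOURCE = "http://example.org/people/alice"
TARGET = "http://example.com/resource/bob"

payload = build_pingback_request(SOURCE, TARGET)

# Round-trip the payload to show what a receiving server would decode.
params, method = xmlrpc.client.loads(payload)
```

A real client would POST `payload` to the pingback endpoint advertised by the target resource; that step is omitted because it depends on the server setup.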
For exploring semantic content, OntoWiki provides several
exploration interfaces. The challenge of providing a generic
user interface that is nevertheless as intuitive as possible is
tackled by regarding knowledge bases as information maps. Each
node in the information map, that is, each RDF resource, is
represented as a Web-accessible page and interlinked with related
digital resources. The full-text search makes use of special
indexes if the underlying knowledge store provides this feature.
The resulting SPARQL query is stored as an object which can later
be modified (e.g. have its filter clauses refined). For
domain-specific use cases, OntoWiki provides an easy-to-use
extension interface. By providing a custom view through this
interface, it is possible to hide the fact that an RDF knowledge
base is being worked on. This permits OntoWiki to be used as a
data-entry frontend for users with a less profound knowledge of
semantic technologies. Via its facet-based browsing, OntoWiki
allows the construction of complex concept definitions by means
of property value restrictions, with a pre-defined class as a
starting point.
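The idea of a query that is kept as a modifiable object and refined by property value restrictions can be sketched as follows. This is not OntoWiki's actual API; the class `ResourceQuery`, its methods, and the example URIs are hypothetical, and restriction values are assumed to be passed as ready-made SPARQL terms (an IRI in angle brackets or a quoted literal).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ResourceQuery:
    """Hypothetical query object: a starting class plus property value restrictions."""

    rdf_class: str
    restrictions: list[tuple[str, str]] = field(default_factory=list)

    def restrict(self, prop: str, value_term: str) -> "ResourceQuery":
        """Add one restriction; value_term must already be a valid SPARQL term."""
        self.restrictions.append((prop, value_term))
        return self

    def to_sparql(self) -> str:
        """Render the current state of the object as a SPARQL SELECT query."""
        patterns = [f"?resource a <{self.rdf_class}> ."]
        patterns += [
            f"?resource <{prop}> {value} ." for prop, value in self.restrictions
        ]
        body = "\n  ".join(patterns)
        return f"SELECT DISTINCT ?resource WHERE {{\n  {body}\n}}"


# Start from a pre-defined class, then narrow the result set step by step.
query = ResourceQuery("http://xmlns.com/foaf/0.1/Person")
query.restrict(
    "http://xmlns.com/foaf/0.1/based_near",
    "<http://dbpedia.org/resource/Leipzig>",
)
sparql_text = query.to_sparql()
```

Because the restrictions live in the object rather than in a string, each facet selection only appends or removes an entry, and the SPARQL text is regenerated on demand.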
3 MULTIMODAL SEMANTIC CONTENT
For handling large amounts of multimedia data, automatic
processes for managing this kind of content have been developed
and integrated into OntoWiki. They allow arbitrary multimedia
documents (13 different file types are currently supported) or
even complete directory structures to be imported into a
knowledge base and subsequently managed with OntoWiki, using the
techniques presented in Section 2. The workflow for importing
multimedia documents is presented in Figure 2 and described
below.
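To make the import of a complete directory structure concrete, the sketch below walks a directory tree and groups the files it finds by MIME type. It is a simplification under stated assumptions: the MIME type is guessed from the file extension with Python's standard `mimetypes` module, whereas the detection used by the actual framework is not described here, and the function name `collect_files_by_mime` is hypothetical.

```python
import mimetypes
import os
from collections import defaultdict
from typing import Dict, List


def collect_files_by_mime(root: str) -> Dict[str, List[str]]:
    """Walk `root` recursively and group file paths by guessed MIME type."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # guess_type looks only at the file name, not at the file content.
            mime, _encoding = mimetypes.guess_type(path)
            grouped[mime or "application/octet-stream"].append(path)
    return dict(grouped)
```

Calling `collect_files_by_mime` on a media directory yields one list of paths per detected type, which could then be handed to the matching metadata extractor.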
Extracting Multimedia Metadata. We developed a framework
that detects certain formats (out of the more than 1000
different registered MIME types).
[Figure 2: Multimedia metadata extraction, representation and interlinking process. Stages shown: Input, Extraction, Representation, Linking; external sources shown: Discogs, Flickr, MusicBrainz...]
The framework is highly configurable and easily extensible,
which makes it straightforward to integrate support for new
multimedia types and to configure the properties and classes
used to create the semantic metadata. The extraction of
multimedia metadata is realized as follows:
1. Extraction of Metadata Attributes. Information such as
the file name, size or date of creation is extracted. In
addition to this information, many multimedia formats already
contain metadata specific to their field of use. Such
information is usually arranged in key-value pairs in the
file's header. For instance, music files usually contain ID3
tags, and images taken by digital cameras include an EXIF
header. The MIME type of the file is determined and
subsequently a specialized metadata extractor is initialized.
The framework is designed so that every metadata extractor
manages a set of extensions, each of which is responsible for
extracting a single type of metadata. These extensions are
executed consecutively, which makes it possible to re-use
already extracted metadata and accelerate the extraction
process.
For example, one extension creates previews of PDF or video
documents. Other examples of metadata extraction extensions
determine the number of pages of a PDF document or the
geo-coordinates of an image.
2. Integration of Additional Information. The previously
extracted metadata is now used to obtain and integrate
additional information that is not explicitly contained in the
processed files. For example, an artist's name extracted from
a music file's ID3 information may be used to look up a URI
for this artist on the Data Web. Likewise, traditional non-RDF
web services may be used to gather additional information
(e.g. the album cover for a song).
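The two steps above can be sketched as a chain of extensions that share one metadata dictionary, followed by an enrichment step that consults an external source. Everything named here is hypothetical: the extension functions, the `enrich` helper, the stand-in for an ID3 reader, and the lookup table that replaces a real Data Web query. The sketch only illustrates how a later extension can re-use what an earlier one has already extracted.

```python
import os
from typing import Callable, Dict, List, Optional

Metadata = Dict[str, object]
# An extension receives the file path and the metadata gathered so far.
Extension = Callable[[str, Metadata], Metadata]


def file_name_extension(path: str, metadata: Metadata) -> Metadata:
    """Extract the plain file name from the path."""
    return {"file_name": os.path.basename(path)}


def title_guess_extension(path: str, metadata: Metadata) -> Metadata:
    """Re-use the file name found by an earlier extension to guess a title."""
    stem, _ext = os.path.splitext(str(metadata["file_name"]))
    return {"title_guess": stem.replace("_", " ")}


def run_extensions(path: str, extensions: List[Extension]) -> Metadata:
    """Step 1: run the extensions consecutively on one shared dictionary."""
    metadata: Metadata = {}
    for extension in extensions:
        metadata.update(extension(path, metadata))
    return metadata


def enrich(
    metadata: Metadata,
    lookup_artist_uri: Callable[[str], Optional[str]],
) -> Metadata:
    """Step 2: add information that is not contained in the file itself."""
    enriched = dict(metadata)
    artist = metadata.get("artist")
    if artist is not None:
        uri = lookup_artist_uri(str(artist))
        if uri is not None:
            enriched["artist_uri"] = uri
    return enriched


# Stand-in for an ID3 reader; a real extension would parse the file header.
def fake_id3_extension(path: str, metadata: Metadata) -> Metadata:
    return {"artist": "Example Artist"}


# Stand-in for a Data Web lookup; the URI is made up for illustration.
KNOWN_ARTISTS = {"Example Artist": "http://example.org/artist/example-artist"}

extracted = run_extensions(
    "/music/some_song.mp3",
    [file_name_extension, title_guess_extension, fake_id3_extension],
)
result = enrich(extracted, KNOWN_ARTISTS.get)
```

Passing the lookup as a function keeps the enrichment step independent of any particular service, so an RDF source or a conventional web service could be plugged in the same way.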
Representing Multimedia Metadata. To represent the extracted
metadata in RDF, we reused well-established vocabularies
(cf. Figure 3). The rationale
WEBIST 2011 - 7th International Conference on Web Information Systems and Technologies