Long Term Preservation / Access to the
repository’s content: A significant step towards
preserving digital assets and ensuring their
accessibility in the long term is to develop a digital
repository conformant to the OAIS specification.
Content should also be permanently accessible
(e.g. through persistent URIs). At the same time the
repository should maintain multiple versions of the
content, because digital objects may be modified or
updated from time to time (persistency).
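The combination of persistent identifiers and version retention can be illustrated with a minimal sketch. Everything named here (the classes, the `?version=` query convention, the example.org base URI) is an assumption made for illustration and is not prescribed by OAIS or by the repository described in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedObject:
    """A digital object that keeps every stored version of its content."""
    persistent_id: str                      # hypothetical identifier scheme
    versions: list = field(default_factory=list)

    def update(self, content: bytes) -> int:
        """Append a new version and return its 1-based version number."""
        self.versions.append(content)
        return len(self.versions)

    def get(self, version=None) -> bytes:
        """Return the requested version, or the latest one if none is given."""
        if not self.versions:
            raise LookupError("object has no stored versions")
        if version is None:
            return self.versions[-1]
        if not 1 <= version <= len(self.versions):
            raise LookupError(f"no version {version}")
        return self.versions[version - 1]


class Repository:
    """In-memory stand-in for a repository; a real one would persist data."""

    def __init__(self, base_uri: str):
        self.base_uri = base_uri.rstrip("/")
        self.objects = {}

    def store(self, persistent_id: str, content: bytes) -> str:
        """Store content as a new version and return a URI for that version."""
        obj = self.objects.setdefault(persistent_id, VersionedObject(persistent_id))
        number = obj.update(content)
        # The URI layout below is an assumed convention, not a standard.
        return f"{self.base_uri}/{persistent_id}?version={number}"

    def resolve(self, persistent_id: str, version=None) -> bytes:
        """Look up content by persistent identifier and optional version."""
        return self.objects[persistent_id].get(version)
```

In this sketch the identifier never changes when the content is updated; only the version number grows, so earlier URIs keep resolving to the content they originally named.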
Metadata: Metadata should be used during the
whole life cycle of the digital content. The main
objectives are description of digital content, support
for its management and facilitation of access to it,
even in the long term (descriptive, administrative,
preservation metadata). However, it is important for
the metadata to follow a widely adopted standard
(Dublin Core (DCMI 2003) can be used in general,
MPEG-7 (MPEG 2003) for multimedia content,
DIG-35 (DIG 2000) for digital images and METS
(DLF 2003) for wrapping and encoding all the
above). Furthermore, there are cases where metadata
should not only be used for describing individual
repository objects but should also support a higher
level of abstraction, i.e. the collection level (Dublin
Core-Collection Level Description).
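As a rough illustration of descriptive metadata that follows a widely adopted standard, the sketch below serializes a few Dublin Core elements to XML using Python's standard library. The wrapper element `record` and the sample field values are assumptions; only the Dublin Core element namespace is taken from the standard itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a mapping of Dublin Core element names to an XML string."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample values are invented for the example.
    print(dublin_core_record({
        "title": "Sample digitized photograph",
        "creator": "Unknown",
        "format": "image/tiff",
    }))
```

A record produced this way could in turn be wrapped in a container format such as METS, which is outside the scope of this sketch.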
Interoperability / Import-Export Capability:
Interoperability can be achieved by adopting well-
known standards during the repository’s
development. One of them is the platform-
independent language XML (and XML Schema).
Implementation of the OAI-PMH protocol (Lagoze
and Van de Sompel 2001), (Van de Sompel and
Lagoze 2002) is highly recommended in order to
accommodate mass metadata import/export to and
from the repository. Support for the Z39.50
(ANSI/NISO 1995) protocol is also of crucial
importance, especially for transparent remote
search across large numbers of documents.
Interoperability and accessibility of the digital
repository are enhanced by exposing its services as
Web Services. Practically, this means that the
services will be described using the WSDL language
(W3C 2003) and registered with some UDDI
registry (OASIS 2002). The major benefit of UDDI
is that it enables the automated discovery (and
possibly use) of a Web Service by a machine,
similar to the way that human users use search
engines. Recently, the ZING Initiative (“Z39.50
International: Next Generation”) (Z39.50 IMA 2003)
has been attracting attention, especially its
SRW (“Search/Retrieve for the Web”) part. SRW is
a web-service-based protocol which aims to
integrate access across networked resources, and to
promote interoperability between distributed
databases by providing a common platform. It
features XML and SOAP and thus it is able to
integrate more tightly with XML-based
infrastructures.
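To make the mass metadata export mentioned above more concrete, the following sketch builds an OAI-PMH ListRecords request URL. The function name and the example base URL are hypothetical; the argument names (verb, metadataPrefix, set, from, until, resumptionToken) and the oai_dc prefix come from the OAI-PMH specification, in which resumptionToken is an exclusive argument.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     from_date=None, until_date=None, resumption_token=None):
    """Build the URL of an OAI-PMH ListRecords request (hypothetical helper)."""
    if resumption_token is not None:
        # When continuing a partial list, the token is sent without
        # any of the other selective arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        optional = {"set": set_spec, "from": from_date, "until": until_date}
        params.update({k: v for k, v in optional.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # The base URL is an assumption made for the example.
    print(list_records_url("http://repository.example.org/oai",
                           from_date="2003-01-01"))
```

A harvester would issue the first request with a metadata prefix and then repeat the call with each resumption token returned by the repository until no token remains.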
Security / User Certification: Only the Designated
Community should be allowed to access the
repository’s content. A practical way to achieve this
is to establish a set of access policies for each
Consumer or Consumers' Community, to
authenticate them using login/password pairs
and/or digital certificates, and to encrypt access
to the repository’s services (e.g. with SSL).
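A minimal sketch of per-community access policies combined with login/password authentication is given below. The policy table, community names and service names are invented for illustration, and the use of salted PBKDF2 hashes for stored passwords is an assumption of this sketch rather than something stated in the requirements above. Transport encryption (e.g. SSL) is not shown.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for the example

# Hypothetical policies: the services each Consumers' Community may use.
POLICIES = {
    "researchers": {"search", "retrieve"},
    "guests": {"search"},
}


def hash_password(password, salt=None):
    """Derive a salted hash so that plain-text passwords are never stored."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the hash and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


def is_allowed(community, service):
    """Check a requested service against the community's access policy."""
    return service in POLICIES.get(community, set())
```

Unknown communities receive an empty policy, so access is denied by default unless a policy explicitly grants it.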
Intellectual Property Rights Management: The
need for copyrighting original content and for
economic exploitation of the repository necessitates
the management and encoding of IPR information
into the content. Watermarking is widely used, not
only for digital images but for any type of
multimedia content. At the metadata level, examples
include the XML-based MPEG-21, Part 5: Rights
Expression Language (MPEG 2002) and the W3C’s
XML security suite (XML Encryption, XML Key
Management and XML Signature).
Knowledge Representation / Management:
The repository's content will not necessarily be
restricted to a single thematic domain; it may span
several domains or their combinations.
Consequently, it is convenient to describe the
content in a semantically hierarchical and structured
way. In other words, the establishment of ontologies
for each content domain is proposed. For example,
the CIDOC CRM (Conceptual Reference Model)
ontology (Crofts et al. 2001) can be used for the
cultural heritage domain. An ontology-enabled
system can assist users in their search by supporting
automated reasoning, even if the information being
sought is not explicitly stated in the metadata.
Ontologies can also be used for the management of a
digital repository; e.g. ABC (Lagoze and Hunter
2001) is capable of organizing events that occurred
in the repository at any moment. Traditionally, RDF
is used for the development of ontologies; however
the DAML+OIL (McGuinness et al. 2002) and the
more recent OWL language (W3C 2003) are
recommended, as they are specifically designed for
ontologies.
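The kind of reasoning an ontology-enabled search can perform may be sketched with a toy class hierarchy: a query for a general class also returns objects whose metadata names only a more specific class. The hierarchy and object identifiers below are invented for illustration and are not taken from CIDOC CRM; a real system would express the ontology in DAML+OIL or OWL and use a dedicated reasoner.

```python
# Hypothetical toy hierarchy: each class maps to its direct superclass.
SUBCLASS_OF = {
    "Painting": "Artwork",
    "Sculpture": "Artwork",
    "Fresco": "Painting",
}


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or is a transitive subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)  # None once the top of the chain is reached
    return False


def search(objects, wanted_class):
    """Return identifiers of objects whose class falls under wanted_class."""
    return [name for name, cls in objects.items() if is_a(cls, wanted_class)]


if __name__ == "__main__":
    # The metadata records only the most specific class of each object.
    objects = {"obj-1": "Fresco", "obj-2": "Sculpture", "obj-3": "Manuscript"}
    print(search(objects, "Artwork"))
```

Here the object described only as a fresco is found by a search for artworks, although "Artwork" never appears in its metadata.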
4 DIGITAL REPOSITORY
FUNCTIONAL MODEL
Implementing the above requirements results in the
following Functional Model of the digital repository