3.2 Choice and Customization of the KMS
DSpace, an open source software package developed
in 2000 in a joint project between the
Massachusetts Institute of Technology and
Hewlett-Packard, provides all the tools necessary to
create and manage an IR based on the
Open Access model. Such an IR can collect, store,
index, preserve and make accessible, in digital
format, the information output of universities and
research institutes.
DSpace is designed as a central storage facility
that collects all kinds of content from the
institution's community through a user interface
that is as simple and intuitive as possible. It can
collect various types of digital resources, including
text, images, video, audio, articles and preprints,
technical reports, working papers, datasets, and
learning objects, directly from their creators.
DSpace was chosen to build the Analytic
Sound Archive of Sardinia because it fulfills all the
requirements set by linguists and musicologists. It
is fully customizable, natively supports the
Qualified DC metadata schema and is OAI-compatible
through its support for OAI-PMH. The
proposed approach allows the corpus and the
associated knowledge to be inserted into DSpace
while preserving the structure of the corpus and
making it easy to query and to update by adding or
modifying its contents. Each text of the corpus is
inserted into a DSpace item so that it can be
uniquely associated with all of the metadata needed
for the linguistic analysis. The audio file containing
the recording and the original annotation files are
loaded into the item as bitstreams, while the
metadata are stored in the system database.
The first step consisted of adding new qualifiers
for the Dublin Core descriptive metadata and a new
schema called "asas" to represent the
annotations. When inserting the corpus into DSpace,
it was decided to create a separate item for each
audio clip. It was therefore necessary to configure
the submission wizard offered by DSpace by changing
the XML file responsible for the entry forms
(input-forms.xml). The descriptive metadata
identified by the researchers, such as title, author,
type of song and instrument, and all metadata
corresponding to linguistic annotations (phono,
morpheme, word, etc.) were associated with each
item, together with the original file containing the
audio recording and the original annotation file.
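As an illustration of the kind of entry that input-forms.xml holds, the
following minimal sketch builds one form field with Python's standard
library. The element names follow the input-forms.xml layout used by
DSpace releases of that period; the field itself
(dc.description.instrument), its label and its hint are assumptions made
for the example and are not taken from the project's actual
configuration.

```python
import xml.etree.ElementTree as ET


def make_field(schema, element, qualifier, label, input_type="onebox",
               repeatable=False, hint="", required=""):
    """Build one <field> entry in the style of DSpace's input-forms.xml."""
    field = ET.Element("field")
    ET.SubElement(field, "dc-schema").text = schema
    ET.SubElement(field, "dc-element").text = element
    ET.SubElement(field, "dc-qualifier").text = qualifier
    ET.SubElement(field, "repeatable").text = "true" if repeatable else "false"
    ET.SubElement(field, "label").text = label
    ET.SubElement(field, "input-type").text = input_type
    ET.SubElement(field, "hint").text = hint
    # An empty <required> element marks the field as optional; a non-empty
    # one carries the message shown when the field is left blank.
    ET.SubElement(field, "required").text = required
    return field


# Hypothetical field: the qualifier "instrument" is an assumed name.
instrument_field = make_field(
    "dc", "description", "instrument",
    label="Instrument",
    repeatable=True,
    hint="Enter each instrument heard in the recording.",
)
field_xml = ET.tostring(instrument_field, encoding="unicode")
print(field_xml)
```

A field built this way would be placed inside a page of the relevant
form definition; the qualifier it names must already exist in the
metadata registry.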
Figure 1: Customization of the DSpace Metadata Registry.
After the metadata had been added, the interface was
customized by replacing the standard forms
provided by DSpace with modules specifically
designed to allow the creation of items and the
entry of DC metadata according to the specific
needs of the project. The metadata on the
annotations, by contrast, were inserted by direct
import, because the high number of occurrences for
each item made manual entry impractical, as
shown by Hillman and Westbrooks (2004).
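The text does not say which import mechanism was used. One plausible
route is DSpace's Simple Archive Format, in which each item is a
directory holding a dublin_core.xml file, one metadata_<schema>.xml file
per additional schema, and a contents file listing the bitstreams. The
sketch below writes such a directory under that assumption; the element
names under "asas" (word, morpheme) and all file names are hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def metadata_xml(schema, values):
    """Serialize (element, qualifier, text) tuples for one schema."""
    root = ET.Element("dublin_core", {"schema": schema})
    for element, qualifier, text in values:
        dcvalue = ET.SubElement(
            root, "dcvalue",
            {"element": element, "qualifier": qualifier or "none"},
        )
        dcvalue.text = text
    return ET.tostring(root, encoding="unicode")


def write_item(item_dir, dc_values, asas_values, bitstreams):
    """Write the metadata and contents files for one archive item.

    The bitstream files themselves must also be copied into item_dir;
    that step is omitted here.
    """
    item_dir = Path(item_dir)
    item_dir.mkdir(parents=True, exist_ok=True)
    (item_dir / "dublin_core.xml").write_text(
        metadata_xml("dc", dc_values), encoding="utf-8")
    (item_dir / "metadata_asas.xml").write_text(
        metadata_xml("asas", asas_values), encoding="utf-8")
    # One bitstream file name per line.
    (item_dir / "contents").write_text(
        "\n".join(bitstreams) + "\n", encoding="utf-8")
    return item_dir


# Hypothetical clip with two annotation values.
with tempfile.TemporaryDirectory() as tmp:
    item = write_item(
        Path(tmp) / "item_000",
        dc_values=[("title", None, "Clip 001")],
        asas_values=[("word", None, "domo"), ("morpheme", None, "dom-")],
        bitstreams=["clip_001.wav", "clip_001_annotations.txt"],
    )
    print(sorted(p.name for p in item.iterdir()))
```

Generating these files from the annotation tool's output avoids typing
hundreds of values per item by hand, which is the difficulty the
paragraph above describes.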
Finally, we customized the DSpace search
interface to adapt it to the new
metadata and to the particular needs of the Analytic
Sound Archive of Sardinia. In essence, all metadata
corresponding to linguistic annotations needed to be
indexed in DSpace’s search engine so that a given
audio clip could be found by searching for any
record associated with it. Furthermore, some
descriptive metadata, such as location, type of
performer and contribution, were indexed to allow
effective searching that exploits the granularity of
the metadata.
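The idea of reaching an audio clip through any of its indexed metadata
values can be sketched with a small inverted index. This is a conceptual
illustration only and not DSpace's search implementation; the item
identifiers, field names and values are invented for the example.

```python
from collections import defaultdict


def build_index(items, indexed_fields):
    """Map (field, lowercased value) pairs to the set of item ids."""
    index = defaultdict(set)
    for item_id, metadata in items.items():
        for field_name, values in metadata.items():
            if field_name not in indexed_fields:
                continue  # only configured fields are searchable
            for value in values:
                index[(field_name, value.lower())].add(item_id)
    return index


def search(index, field_name, value):
    """Return the sorted ids of items carrying the value in the field."""
    return sorted(index.get((field_name, value.lower()), set()))


# Hypothetical items: one per audio clip, as in the archive.
items = {
    "clip-001": {"asas.word": ["domo", "mama"],
                 "dc.coverage.spatial": ["Nuoro"]},
    "clip-002": {"asas.word": ["domo"],
                 "dc.coverage.spatial": ["Cagliari"]},
}
index = build_index(items, {"asas.word", "dc.coverage.spatial"})

print(search(index, "asas.word", "Domo"))            # ['clip-001', 'clip-002']
print(search(index, "dc.coverage.spatial", "Nuoro"))  # ['clip-001']
```

Indexing each annotation value separately is what lets a query on a
single word or morpheme return the clips that contain it.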
3.2.1 Metadata Schemas
The metadata are stored and managed by DSpace
through a dedicated tool, the Metadata Registry, in
which the Qualified Dublin Core schema is configured
by default. This schema can nevertheless be changed,
and new customized schemas can be added. The system
offers two ways to configure the registry: one is the
graphical interface named Manakin, and the other is
used from the terminal. Each has a specific
purpose. The first method allows an authorized user
to act on the schemas through an easy and intuitive
web interface.
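The registry described above can be pictured as a set of schemas, each
with a namespace, plus a list of fields named schema.element.qualifier.
The sketch below models that structure in plain Python; it is not
DSpace code, and the "asas" namespace URI, the field names and the scope
note are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataField:
    schema: str
    element: str
    qualifier: str = ""
    scope_note: str = ""

    @property
    def name(self):
        """Full field name, e.g. 'dc.description.instrument'."""
        parts = [self.schema, self.element]
        if self.qualifier:
            parts.append(self.qualifier)
        return ".".join(parts)


@dataclass
class MetadataRegistry:
    schemas: dict = field(default_factory=dict)  # short name -> namespace
    fields: dict = field(default_factory=dict)   # full name -> MetadataField

    def add_schema(self, name, namespace):
        if name in self.schemas:
            raise ValueError(f"schema already registered: {name}")
        self.schemas[name] = namespace

    def add_field(self, schema, element, qualifier="", scope_note=""):
        if schema not in self.schemas:
            raise KeyError(f"unknown schema: {schema}")
        new_field = MetadataField(schema, element, qualifier, scope_note)
        if new_field.name in self.fields:
            raise ValueError(f"field already registered: {new_field.name}")
        self.fields[new_field.name] = new_field
        return new_field


registry = MetadataRegistry()
registry.add_schema("dc", "http://dublincore.org/documents/dcmi-terms/")
registry.add_schema("asas", "http://example.org/asas")  # hypothetical URI
registry.add_field("asas", "word", scope_note="Word-level annotation")
registry.add_field("dc", "description", "instrument")   # hypothetical

print(sorted(registry.fields))  # ['asas.word', 'dc.description.instrument']
```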
Once a schema has been created, metadata fields can
be added one at a time, each with its qualifiers and
related notes. This feature is crucial for updating
and maintaining the system, since adjustments can be
made quickly and easily without the
intervention of a computer expert. Likewise, you can
KEOD 2012 - International Conference on Knowledge Engineering and Ontology Development