2 KNOWLEDGE MANAGEMENT
The organization and availability of content in KMSs
depend on two factors: whether the KMS has effective
tools for information indexing and retrieval, and how
well users actually understand and use those tools.
A solution to this issue comes from the experience
of libraries and archives, which have been dealing
with the organization and collection of information
since long before the digital revolution. This
experience suggests using metainformation, i.e. data
used to describe and classify information. The tools
used to enter and manage content on the Internet must
therefore allow organized and relevant
metainformation to be entered and retrieved as
metadata.
2.1 Metadata
Metadata thus have a fundamental role in organizing
and managing digital resources, especially when a
great quantity of available information must be
indexed and catalogued to facilitate search and
retrieval, as shown by Hillman and Westbrooks
(2004), Strintzis, Bloehdorn, Handschuh et al.
(2004), Chopey (2005), Dunsire (2008), and
Solodovnik (2011).
The choice of metadata for describing a resource
depends on a thorough observation of the
characteristics, properties, common features, and
differences of the informational environment the
resource belongs to.
A metadata schema is a structured set of metadata,
developed for a specific purpose in order to
standardize metadata structure and terminology and
to relate different types of metadata. Every metadata
schema includes a definite number of elements, called
metadata elements, each with its own meaning and
purpose in describing the information resource, as
shown by Heery and Patel (2000), and by Lagoze and
Van de Sompel (2003).
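The notion of a schema as a definite set of elements, each with its own meaning, can be illustrated with a minimal Python sketch. The class names, the `repeatable` flag and the toy schema below are assumptions made for illustration; they are not taken from any of the cited works.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class MetadataElement:
    """One element of a schema: a name plus its intended meaning."""
    name: str
    definition: str
    repeatable: bool = True  # hypothetical constraint, for illustration only


@dataclass
class MetadataSchema:
    """A named, definite set of metadata elements."""
    name: str
    elements: Dict[str, MetadataElement] = field(default_factory=dict)

    def add(self, element: MetadataElement) -> None:
        self.elements[element.name] = element

    def validate(self, record: Dict[str, List[str]]) -> List[str]:
        """Return a list of problems; an empty list means the record conforms."""
        problems = []
        for key, values in record.items():
            element = self.elements.get(key)
            if element is None:
                problems.append(f"unknown element: {key}")
            elif not element.repeatable and len(values) > 1:
                problems.append(f"element not repeatable: {key}")
        return problems


# Hypothetical toy schema with two elements.
schema = MetadataSchema("toy")
schema.add(MetadataElement("title", "Name given to the resource", repeatable=False))
schema.add(MetadataElement("subject", "Topic of the resource"))

record = {"title": ["Annual report"], "subject": ["finance", "2010"], "colour": ["blue"]}
problems = schema.validate(record)
```

In this sketch a record is accepted only if every key it uses is defined by the schema, which is what makes the terminology of the schema enforceable.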
Since standardization is the purpose, it is
advisable to adopt widely used metadata schemas
rather than to create new ones. Application profiles
are sets of metadata elements drawn from different
schemas; they aim to create tools for particular
applications while keeping interoperability with the
original base schemas. This procedure, together with
the application of common rules, can make different
systems interoperable, such as those of libraries,
museums and archives, enabling them to share a
common part of their metadata.
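As a rough illustration of how an application profile draws elements from several base schemas and still leaves a shared core, consider the following sketch. The schema prefixes `dc` and `local`, the element names and both helper functions are hypothetical.

```python
# Hypothetical base schemas, each reduced to a set of element names.
BASE_SCHEMAS = {
    "dc": {"title", "creator", "subject", "date"},
    "local": {"shelfmark", "department"},
}


def build_profile(selection):
    """Build a profile from (schema, element) pairs, rejecting undefined ones."""
    profile = []
    for schema, element in selection:
        if element not in BASE_SCHEMAS.get(schema, set()):
            raise ValueError(f"{schema}:{element} is not defined in its base schema")
        profile.append(f"{schema}:{element}")
    return profile


def shared_elements(profile_a, profile_b):
    """Qualified elements that two profiles have in common and can exchange."""
    return sorted(set(profile_a) & set(profile_b))


library = build_profile([("dc", "title"), ("dc", "creator"), ("local", "shelfmark")])
archive = build_profile([("dc", "title"), ("dc", "date"), ("local", "department")])
common = shared_elements(library, archive)
```

Because every element keeps the prefix of the schema it came from, two systems with different profiles can still identify the part of their metadata that means the same thing in both.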
2.2 The Dublin Core Standard
Content management is supported by the Dublin Core
(DC) metadata schema, which pairs easily with other
metadata schemas in the OAI architecture, improving
the granularity and refinement of their structures
(Hutt and Riley, 2005).
The rapid spread of DC as a metadata schema was
doubtlessly favoured by its remarkable simplicity,
which lets it adapt to many kinds of resources and
usage environments. It is important for a semantic
model used in resource discovery not to depend on
the format of the resource it describes.
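A short sketch can make this format independence concrete: the same fifteen elements of the Dublin Core Metadata Element Set describe a resource whatever its format, and the record can be serialized in the `oai_dc` container used for exchange in OAI. The function name `to_oai_dc` and the sample record are assumptions; the element names and namespace URIs are those of the published standards.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_oai_dc(record):
    """Serialize a {element: [values]} mapping as an oai_dc XML string."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in record.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        for value in values:  # every DC element is optional and repeatable
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; only the "format" value depends on the resource type.
xml_text = to_oai_dc({
    "title": ["Quarterly report"],
    "creator": ["A. Author", "B. Author"],
    "format": ["application/pdf"],
})
```

Describing an image instead of a PDF would change only the value of `format`; the structure of the record stays the same.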
In recent years, DC has been increasingly used in
many fields to describe, organize and manage the
resources held by institutions and international
organizations, and also to support and provide
value-added services, by assuring a base format for
the aggregation and exchange of metadata collections,
as in the Open Archives Initiative, or by serving as
an indispensable search tool in portals (Hillman,
2005; Jackson, Han, Groetsch and Mustafoff, 2008).
The use of a standardized general classification
system allows the metadata in such collections to be
combined and the knowledge inside each collection to
be shared, as shown by Lunesu, Pani and Concas
(2011).
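How a shared classification lets separate collections be combined can be sketched as follows. The collection names, identifiers and the function `merge_by_subject` are hypothetical, and the normalization step (trimming and lower-casing) is a simplifying assumption rather than a method described in the cited works.

```python
from collections import defaultdict


def merge_by_subject(collections):
    """Index records from several collections under their shared subject terms."""
    index = defaultdict(list)
    for collection_name, records in collections.items():
        for record in records:
            for subject in record.get("subject", []):
                key = subject.strip().lower()  # simplistic normalization
                index[key].append((collection_name, record["identifier"]))
    return dict(index)


# Hypothetical collections that both describe their records with "subject".
collections = {
    "library": [{"identifier": "lib-1", "subject": ["Metadata", "Archives"]}],
    "museum": [{"identifier": "mus-7", "subject": ["metadata "]}],
}
index = merge_by_subject(collections)
```

A search on a single subject term then returns records from every collection that uses the same classification.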
2.3 Linguistic Annotations and Corpus
Corpus linguistics studies large quantities of
linguistic production, either spoken or written, by
observing their characteristics: lexicon, syntax,
collocations, phonic chain, morphological structures,
etc. To aid this study, computational linguistics
developed the first automated or semi-automated text
analysis tools, which avoid manual analysis and data
searching.
A corpus is any complete and orderly collection of
written texts, by one or more authors, on a certain
topic or, linguistically speaking, a sample of a
language examined in order to describe that language.
In order to exploit the wealth of information
stored in a corpus as linguistic data, the corpus must
be enriched with additional information: linguistic
annotations, i.e. linguistic or metalinguistic
information added to different portions of a text, as
shown by Llisterri (1996) and in the EAGLES Project.
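One common way to attach such information to portions of a text is to store each annotation separately, with the character offsets of the span it refers to. The sketch below assumes this stand-off representation; the layer names and the tags `NOUN`, `VERB` and `S` are illustrative and are not taken from the EAGLES recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A label attached to the text span [start, end) on a given layer."""
    start: int
    end: int
    layer: str
    value: str


text = "Metadata describe resources"

# Hypothetical annotations on two layers: part of speech and syntax.
annotations = [
    Annotation(0, 8, "pos", "NOUN"),
    Annotation(9, 17, "pos", "VERB"),
    Annotation(18, 27, "pos", "NOUN"),
    Annotation(0, 27, "syntax", "S"),
]


def spans(text, annotations, layer):
    """Return (covered text, label) pairs for one annotation layer."""
    return [(text[a.start:a.end], a.value) for a in annotations if a.layer == layer]


pos_tags = spans(text, annotations, "pos")
```

Keeping annotations apart from the text leaves the original corpus untouched and lets several layers of annotation overlap on the same span.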