merged into a single code-base. The most common
application for version control is the management of
changes in program source code, and here the most
widely used system is CVS (Cederqvist, 2003),
although more recently competitors such as Subversion
(Collins-Sussman et al., 2004) have been developed.
The majority of version control systems record
different versions of documents in their store,
optimising the store to make common operations
inexpensive. For example, CVS stores the most recent
version of each file and a set of patch files that encode
the changes needed to revert to the previous version.
Stepping back to the previous version is fast, but
stepping back a long way requires the sequential
application of patch files.
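The storage scheme described above can be illustrated with a minimal sketch. It is not the CVS file format: the class `ReverseDeltaStore` and the helpers `make_patch` and `apply_patch` are hypothetical names introduced here, and the patches are simple line-range edits computed with Python's standard `difflib` module. The sketch keeps only the newest text in full and one reverse patch per older version, so the cost of a checkout grows with the number of steps taken backwards.

```python
import difflib


def make_patch(src, dst):
    """Return the line-range edits that turn the line list src into dst."""
    matcher = difflib.SequenceMatcher(a=src, b=dst, autojunk=False)
    return [
        (i1, i2, dst[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(lines, patch):
    """Apply edits from the end of the file backwards so indices stay valid."""
    result = list(lines)
    for i1, i2, replacement in reversed(patch):
        result[i1:i2] = replacement
    return result


class ReverseDeltaStore:
    """Hypothetical store: full newest version plus reverse patches."""

    def __init__(self, initial_lines):
        self.head = list(initial_lines)
        self.reverse_patches = []  # oldest first, newest last

    def commit(self, new_lines):
        new_lines = list(new_lines)
        # Record how to get from the new version back to the current head.
        self.reverse_patches.append(make_patch(new_lines, self.head))
        self.head = new_lines

    def checkout(self, steps_back=0):
        if not 0 <= steps_back <= len(self.reverse_patches):
            raise ValueError("no such version")
        lines = list(self.head)
        newest_first = list(reversed(self.reverse_patches))
        # One patch is applied per step, so old versions cost more to reach.
        for patch in newest_first[:steps_back]:
            lines = apply_patch(lines, patch)
        return lines


store = ReverseDeltaStore(["a", "b"])
store.commit(["a", "b", "c"])
store.commit(["a", "x", "c"])
latest = store.checkout()
previous = store.checkout(1)
oldest = store.checkout(2)
```

Retrieving `latest` applies no patches, `previous` applies one, and `oldest` applies two in sequence.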
A recent development in version control has been
the use of a distributed development model where
multiple copies of the version control repository are
maintained. In the CVS model, there is one central
canonical copy of the version history that all
developers use to send and receive updates. In the
distributed model supported by Darcs, Arch (Lord, 2006)
and Bitkeeper¹, among others, each developer has a copy
of the version history and changes can be registered
locally and passed between developers without the need
for a central server.
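As a rough illustration of the distributed model, the sketch below gives each developer a private copy of the history and lets changes move directly between copies. The `Repository` class and its `record` and `pull` methods are hypothetical and do not correspond to the interface of Darcs, Arch or Bitkeeper. The sketch also ignores conflicts and dependencies between changes, which real systems must handle.

```python
class Repository:
    """Hypothetical model of one developer's copy of the version history."""

    def __init__(self, owner):
        self.owner = owner
        self.history = []    # (change_id, description) in local order
        self._known = set()  # change ids already present in this copy

    def record(self, change_id, description):
        """Register a change locally; no server is contacted."""
        if change_id not in self._known:
            self._known.add(change_id)
            self.history.append((change_id, description))

    def pull(self, other):
        """Copy across every change the other repository has that we lack."""
        received = []
        for change_id, description in other.history:
            if change_id not in self._known:
                self.record(change_id, description)
                received.append(change_id)
        return received


alice = Repository("alice")
bob = Repository("bob")
alice.record("a1", "add parser")
bob.record("b1", "fix typo")

bob_received = bob.pull(alice)    # bob now holds b1 and a1
alice_received = alice.pull(bob)  # alice now holds a1 and b1
```

After the two pulls both copies contain the same set of changes, although each holds them in its own local order.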
2.1 Version Control and RDF
The On-to-Knowledge project has published a number
of papers on managing changes in RDF ontologies
(Klein et al., 2002; Kiryakov and Ognyanov, 2002).
The focus of much of their work is on the description
of different versions of ontologies, although Kiryakov
and Ognyanov (2002) note that much of their
methodology applies as well to plain RDF descriptions
as it does to RDF(S) schemas expressed in RDF.
However, their main goal is to track changes, not to
manipulate them, and their proposal does not claim to
offer a full version control mechanism. Meta-data is
associated with specific versions of an ontology,
allowing users to identify appropriate versions and reason
about the compatibility of newer and older versions.
Their papers contain some relevant work on
automatically comparing ontologies to find differences.
¹ http://www.bitmover.com/
When semantic web data is viewed as a knowledge
store from which new knowledge will be deduced,
a major concern is maintaining the consistency
of the knowledge store. This problem has a lot
in common with the version control problem in that
consistency must be maintained in the face of
addition and deletion of assertions. Deletion especially
presents the problem of finding what deduced
knowledge should be removed to keep the knowledge store
consistent. Broekstra and Kampman’s work on truth
maintenance (Broekstra and Kampman, 2003) is
concerned with tracking the deductive dependencies
between statements in the RDF graph to avoid having to
re-run deductive processes when statements are added
to or deleted from the graph.
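The idea of tracking deductive dependencies can be sketched as follows. This is an illustrative simplification and not the algorithm of Broekstra and Kampman (2003): the `DependencyStore` class is hypothetical, statements are plain tuples, and circular justifications are not handled. Each deduced statement records the sets of premises that justify it, so that deleting an assertion removes only the deductions that have lost all support.

```python
class DependencyStore:
    """Hypothetical store recording why each deduced statement holds."""

    def __init__(self):
        self.asserted = set()
        # deduced statement -> list of premise sets, each one sufficient
        self.justifications = {}

    def assert_statement(self, stmt):
        self.asserted.add(stmt)

    def add_derivation(self, conclusion, premises):
        self.justifications.setdefault(conclusion, []).append(frozenset(premises))

    def holds(self, stmt):
        return stmt in self.asserted or stmt in self.justifications

    def retract(self, stmt):
        """Delete an assertion and return the deductions that fall with it."""
        self.asserted.discard(stmt)
        removed = []
        pending = [stmt]
        while pending:
            gone = pending.pop()
            if self.holds(gone):
                continue  # still supported by another justification
            for conclusion in list(self.justifications):
                kept = [j for j in self.justifications[conclusion] if gone not in j]
                if kept:
                    self.justifications[conclusion] = kept
                else:
                    del self.justifications[conclusion]
                    if conclusion not in self.asserted:
                        removed.append(conclusion)
                        pending.append(conclusion)
        return removed


rex_is_dog = ("Rex", "rdf:type", "Dog")
dog_is_animal = ("Dog", "rdfs:subClassOf", "Animal")
rex_is_animal = ("Rex", "rdf:type", "Animal")

store = DependencyStore()
store.assert_statement(rex_is_dog)
store.assert_statement(dog_is_animal)
store.add_derivation(rex_is_animal, [rex_is_dog, dog_is_animal])

fallen = store.retract(rex_is_dog)
```

Here `fallen` contains only the deduction that Rex is an animal; no inference step is re-run to discover this.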
The source or provenance of data is also a major
concern when reasoning about knowledge drawn
from diverse locations around the Semantic Web. A
number of projects are investigating the representation
of the provenance of statements within the RDF
model, enabling reasoning about both a fragment of
data and its source. While the RDF reification
mechanism provides one way of making statements about
statements, as pointed out by Watkins and Nicole
(2006), reified statements cannot
be used in semantic inferences. They propose the use
of the Named Graph mechanism as implemented in
Jena (Bizer et al., 2005) to record provenance
information about statements in the graph. They have used
this mechanism to implement a software version
control system (Watkins and Nicole, 2005) which uses
RDF to describe changes to text-based source code.
An interesting result of this work is the ability to
reason about the version meta-data using RDF tools to,
for example, find instances where a developer reverts
a change made by another developer. The use of RDF
in this role adds a useful additional capability to the
normal version control model.
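The following sketch illustrates how named graphs allow provenance to be attached to groups of statements and then queried. It uses plain Python data structures rather than the Jena API, and the class `NamedGraphStore`, the properties `ex:author` and `ex:reverts`, and the change names are all hypothetical; they are not the vocabulary used by Watkins and Nicole.

```python
from collections import defaultdict


class NamedGraphStore:
    """Hypothetical quad store: triples grouped into named graphs."""

    def __init__(self):
        self.graphs = defaultdict(set)  # graph name -> set of (s, p, o)
        self.metadata = set()           # (graph name, property, value)

    def add(self, graph, s, p, o):
        self.graphs[graph].add((s, p, o))

    def describe(self, graph, prop, value):
        """Make a statement about a whole graph, e.g. who authored it."""
        self.metadata.add((graph, prop, value))

    def sources_of(self, triple, prop):
        """Values of prop for every graph that contains the triple."""
        return {
            value
            for graph, p, value in self.metadata
            if p == prop and triple in self.graphs.get(graph, ())
        }

    def cross_author_reverts(self):
        """Pairs (reverting, reverted) whose authors differ."""
        author = {g: v for g, p, v in self.metadata if p == "ex:author"}
        return sorted(
            (g, v)
            for g, p, v in self.metadata
            if p == "ex:reverts" and g in author and v in author
            and author[g] != author[v]
        )


store = NamedGraphStore()
store.add("change1", "main.c", "ex:addsLine", "42")
store.describe("change1", "ex:author", "alice")
store.add("change2", "main.c", "ex:removesLine", "42")
store.describe("change2", "ex:author", "bob")
store.describe("change2", "ex:reverts", "change1")
store.add("change3", "main.c", "ex:removesLine", "42")
store.describe("change3", "ex:author", "alice")
store.describe("change3", "ex:reverts", "change1")

who_removed = store.sources_of(("main.c", "ex:removesLine", "42"), "ex:author")
reverts = store.cross_author_reverts()
```

Because the provenance is attached to the graph name rather than to reified statements, the triples inside each graph remain ordinary statements that can take part in further queries.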
Some recent work has directly addressed
the problem of version management for RDF
knowledge bases. The recently released IBM BOCA
System² is an RDF repository that supports rollback of
transactions in the RDF store. Very little
information is available at this time about how this is
implemented or the capabilities of the model used. Auer
and Herre (2006) present a model
based on atomic changes to RDF triple stores which
supports a kind of transaction-based version
management in which changes can be rolled back if
necessary. The main focus of the paper is on the
application of change management to the evolution of
ontologies, and the authors discuss some ideas for
evolution patterns which enable them to characterise sets of
changes to an ontology as, for example, adding a new
class or changing the cardinality of a property. The
intention of these patterns is to provide a more useful
change history to a human author than if the raw RDF
changes were shown. The goals of this work are very
close to those of our own project and there are many
² http://ibm-slrp.sourceforge.net/v1/wiki/index.php/BocaUsersGuide
ICSOFT 2007 - International Conference on Software and Data Technologies