Position Paper: Ontology in the Rail Domain
The Railway Core Ontologies
Christopher Morris, John Easton and Clive Roberts
Birmingham Centre for Railway Research and Engineering, University of Birmingham, Birmingham, U.K.
Keywords:
Railway Domain Ontology, Data Integration, Railway Core Ontologies.
Abstract:
This paper presents the railway core ontologies, a group of related ontologies designed to model the rail
domain in detail. The purpose of these ontologies is to enable improved data integration in the rail domain,
which will deliver business benefits in the form of improved customer perceptions and more efficient use of the
rail network. The modularity of the ontologies allows for both detailed modelling of the domain at a high level
and the storing of instance data at lower levels. It concludes that the benefits of improved rail data integration
are best realised through the use of the railway core ontologies.
1 INTRODUCTION
The railway core ontologies are a set of related mod-
els that capture the rail domain in depth, using a
modular structure to ensure that the ontology is as
lightweight as possible for any given application.
In the UK the technical strategy leadership group
reported that “Excluding Network Rail’s own infor-
mation systems, research discovered over 130 infor-
mation systems maintained by approximately 20 sup-
pliers” (Technical Strategy Leadership Group, 2012).
Previous projects also note the fragmented and pro-
prietary nature of the existing information systems. It
has been found that “Where electronic data exchange
standards for rail do exist, many are proprietary bi-
nary formats used to provide point-to-point interfaces
between specific systems and not intended for use in a
generalised context.” (Easton et al., 2013a). This sit-
uation caused the regulatory body responsible for the
oversight of the UK rail domain, the Office of Rail
Regulation, to express concerns as to the current state
of asset condition monitoring (Office of Rail Regula-
tion, 2012).
One area which will give rise to increased volumes
of data is that of remote condition monitoring as de-
scribed in (Garcia Márquez et al., 2003). Typical ap-
plications are remote sensors to report on the impend-
ing failure of railway systems such as points, in the
case of infrastructure, or bearings in the case of vehi-
cles. Similarly capable of generating large volumes of
data are inspection vehicles, which regularly run over
the rail network, to inspect the fixed infrastructure.
These often record video, and many sensor readings
per second along with the time and location of the
reading. Data volumes are discussed in the context of
the US rail network by Allan Zarembski (Zarembski,
2014) and in the context of European rail networks
by Núñez et al (Núñez et al., 2014). Numerous stud-
ies, including (García Márquez et al., 2008), suggest
benefits in the form of a reduction in unplanned, reac-
tive, maintenance and a reduction in manual inspec-
tion, may be derived from asset condition monitoring.
Another area experiencing growth is passenger in-
formation. Studies, such as those reported by Dziekan
and Kottenhoff in (Dziekan and Kottenhoff, 2007)
show that customer perceptions and ridership can be
improved by improving customer information.
There already exist a number of ontologies, such
as the REWERSE project, reported in (Lorenz, 2005),
that cover the broader transport domain, though these
do not aim to create a domain ontology for the rail
industry. This does include the locations of inter-
changes between modes (including rail) and the path
(both geographic and network) taken by rail lines,
along with journey planning information. The Ark-
TRANS project serves the multi-modal transport do-
main and has been employed by the Norwegian State
Railway company (NSB) to produce a functioning
journey planner along with being the basis for a num-
ber of other projects. ArkTRANS is introduced in
(Natvig and Westerheim, 2003). Further work has
been done implementing the concepts set out in the
original project, including a domain ontology for
freight transport, set out in (Gönczy et al., 2012) with
Morris, C., Easton, J. and Roberts, C..
Position Paper: Ontology in the Rail Domain - The Railway Core Ontologies.
In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015) - Volume 2: KEOD, pages 285-290
ISBN: 978-989-758-158-8
Copyright
c
2015 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
285
some cross over to the rail network and a project re-
lating to the maritime domain (Ørnulf, 2011).
Technology has matured since the use of ontol-
ogy for the rail domain was first proposed by the In-
teGRail project, the final report for which was com-
pleted in 2010 (Köpf, 2010). Much of the work done
for that project however pre-dated the availability of a
number of the tools employed by RaCoOn.
The railway core ontologies aim to allow the rail
domain to share information more readily, for exam-
ple, when a new type of asset is added, its status
can be inferred without further software development.
TThis is discussed by Tutcher et al (Tutcher et al.,
2013), where it is used by both industry and academia.
2 BENEFITS OF USING THE
RAIL CORE ONTOLOGY
The benefits of using the rail core ontology stem from
the ability to draw together disparate data sources.
The following section outlines usage scenarios for the
railway core ontologies.
As an example consider component life-cycle in-
formation stored in a maintenance database. There is
little value in knowing how many times a given com-
ponent has functioned without knowing how many
times it can be expected to function. Taking this a
step further, knowing how a given component per-
forms when it is in good working order and how it
degrades allows for better informed maintenance - it
is possible to replace it before it fails. Another exam-
ple in passenger information systems is knowing that
a train passed a certain signal and was travelling at a
certain speed is useful for basic platform boards, how-
ever, knowing that there are speed restrictions further
along the line allows for more accurate information.
The railway core ontologies serve this purpose
well; not only do they serve as a standard for data in-
terchange, as do XML standards used within the rail
industry (railML), but they allow for any stakeholder
to add their own custom extensions without a need to
consult standards committees and thus avoid the de-
lay that imposes. If, for example, a company is work-
ing on a new product then they can simply create any
extensions to the ontology it may require. These ex-
tensions could then be released at the same time as
the product. Furthermore as the ontologies naturally
grow less functionality will need modelling - it will
be possible to make available product documentation
using existing ontological constructs. For this reason
nothing is limited to pre-defined lists.
The use of ontological reasoning has benefits
above and beyond those from the data integration.
Reasoning can for example be employed to infer the
condition of assets, the location of trains or availabil-
ity of routes, from incomplete data.
There are other benefits of moving the logic out of
the ‘front end’ and into the ontologies. When the front
end is updated to reflect a changing IT environment
the ontologies need not be altered, conversely edit-
ing the ontology need not imply development work.
Since changes to the ontology need not be a task for
a developer then as changes are made to the rail net-
work any engineer can update the relevant ontology
to reflect them. Providing tools to make this possible
are part of the scope of this project.
2.1 Railway Core Ontology Usage
Scenarios
Discussion with stakeholders, such as the UK’s in-
frastructure manager, Network Rail, and a large en-
gineering firm along with work stemming from the
Factor-20 project, (Roberts et al., 2011), has identi-
fied the following scenarios where the railway core
ontologies would add value.
2.1.1 Scenario 1 – Forward Planning
A scenario suggested by the UK infrastructure man-
ager was the ability to answer forward looking “what
if” questions relating to the costs and benefits of as-
set maintenance. The example given was of a bridge
which is found to be in a condition such that it will
soon need heavy maintenance. In this example the
available options were to repair or replace it, with a
further choice as to the loading gauge of the replace-
ment. Whilst exact costs will vary on a case by case
basis it should be possible to estimate costs from pre-
vious work. Questions such as how much revenue
would be lost were the bridge to be replaced more
cheaply with one not gauged for freight as well as the
relative cost of repair versus replacement needed ex-
pert assessment, which was done on the basis of less
than ideal data. Data currently resides in a number
of silos: the finance system, the technical drawing
document management system and train movement
databases being three of the most prominent.
2.1.2 Scenario 2 – Maintenance Timing
A similar forward looking question answering sce-
nario, put forward by the UK infrastructure manager,
that brings together cost centre and operational infor-
mation is determining the best time to carry out main-
tenance. When maintenance is best timed, from a rev-
enue loss perspective is a non trivial question. To an-
swer it information as to the number and ticket type of
KEOD 2015 - 7th International Conference on Knowledge Engineering and Ontology Development
286
passengers must be brought together with information
regarding outstanding maintenance items, whilst also
considering the needs of freight operating companies.
The ontologies could also be of use more directly
in the domain of asset maintenance. This would take
available information from the physical network and
make it available at both an operational level for day
to day work and a corporate level for long term plan-
ning.
2.1.3 Scenario 3 – Train Identification
The linking of trains with alerts from track-side mon-
itoring equipment could be more efficient using the
ontologies. Wheel impact load detectors, a device for
detecting trains with misshapen wheels, are a good
example of this. These track mounted sensors mea-
sure how much force a wheel strikes the track with
and, if the force is too great, the wheel is known to
be defective and liable to damage the track. The only
data available from the detector is the time and force
of the impact. Where data from these sensors is found
to be exceptional it is currently a manual process to
identify the offending wheel and repair it. Data from
one system is printed, assessed, phone calls made,
the problematic train identified and the maintenance
department of the train operating company (who are
distinct from the infrastructure owners in most Euro-
pean countries) informed. Train movement data can
tell you in what order trains passed down a given
line. Making that information available, along with
the time of impact and axle count from the wheel im-
pact load detector would allow a system using ontolo-
gies to infer which wheel of a particular train had trig-
gered the detector. This concept was also studied as
part of the FuTRO project (Tutcher et al., 2013).
2.1.4 Scenario 4 – Predictive Maintenance
Whilst planned maintenance has a lower total cost
than corrective or condition-based maintenance, pre-
dictive maintenance is more cost effective still, as
stated by Umiliacchi et al. (Umiliacchi et al., 2011).
In the case of planned maintenance, assets are re-
paired on a fixed schedule, regardless of condition;
in the case of predictive maintenance, work is done
when a computer model, informed by sensor data,
predicts it will be necessary. The intervening step of
condition-based maintenance has been taken up for
certain asset types, but often only with the use of
crude thresholds and manual inspection. Generally,
measurement equipment, be it integrated in a points
motor or attached to a measurement vehicle, sends
back continuous values for sensor readings, but these
are only taken into account should they be outside of
the specified envelope. When the reading exceeds the
specified maximum or minimum, the asset is main-
tained, thus remaining in a safe condition.
Predictive maintenance goes a step beyond this
using the trend exposed by the rate and direction of
change of the underlying data to suggest when main-
tenance would best be scheduled. Whilst this is possi-
ble on an asset by asset, system by system basis, with-
out the use of the railway core ontologies this process
would be very slow and costly, to such an extent that
it is unlikely to be undertaken on a large scale. The
railway core ontologies would allow the implementa-
tion of a network wide system with only asset specific
data being added, rather than extensive customisation.
2.1.5 Scenario 5 – Customer Information
Train location information is currently disseminated
via a system known as DARWIN, for the purposes of
passenger information. This takes the available train
position data from track-side infrastructure, the accu-
racy of which varies depending on source, and com-
bines it with timetable data. This is then displayed on
the passenger information boards in stations as well
as on a range of websites and mobile applications.
Efforts, described by Easton et al. (Easton et al.,
2013b), are underway to integrate data from GPS
units on trains with that from traditional track circuits
(regularly over a km in length) to improve accuracy.
Track circuits are electric circuits used to ascertain
which section of track a train is on. This is an area in
which the railway core ontologies can help; however
there are far greater benefits to be derived when inte-
gration with other transport modes is considered. In
order as to plan a multimodal journey it is necessary
to have information from all modes available and inte-
grated in one place. Here the rail core ontology is not
the only one that needs consideration; work based on
the Arktrans model (Natvig and Westerheim, 2003)
also needs consideration, as does the General Tran-
sit Feed Specification, itself available as linked data –
discussed in (Santos and Moreira, 2014).
3 IMPLEMENTATION
The strength of the railway core ontologies comes
from their modularity. The rail domain is very large
and a model which describes all of it in detail would,
with the currently available technology, be unsuitable
for high volume information retrieval tasks.
Several of the implementation decisions taken
build upon the work of the first project focused on
a domain ontology for the rail domain, the Inte-
Position Paper: Ontology in the Rail Domain - The Railway Core Ontologies
287
GRail project. This, as summarised in (Köpf, 2010),
aimed to “allow IMs[Infrastructure Managers] and
RUs[Railway Undertakings] across Europe to act as
a single system.” Amongst other deliverables a “Net-
work Statement Checker” was produced as a demon-
strator. This is a tool to ascertain whether a given
train configuration can run over a (often transnational)
route. This will be based on the availability of auto-
mated signalling systems, gauge, electrification and
other factors which vary from area to area. This work
was limited primarily by the available technology
whilst the report was published in 2010 it was written
in 2009 and not all of the tools employed in the im-
plementation of the railway core ontologies existed at
that point.
The architecture of the broader system of which
the ontologies will form a part also requires consider-
ation. As originally suggested by InteGRail the sys-
tem will use a Service Orientated Architecture (SOA).
The system will by its nature be distributed across
many stakeholders. Whilst questions of ‘ownership’
of information in this context are complex, the solu-
tion will involve physically disparate systems owned
not merely by differing stakeholders, but at times,
commercial competitors.
3.1 Provenance
The number of stakeholders gives rise naturally to is-
sues of trust and provenance. These are being tackled,
in part, by the use of the prov ontology
1
. The issue of
access to information is further addressed by the use
of triple stores that allow selective visibility - for ex-
ample allowing users access to selected graphs.
3.2 Internal Organisation
The modular design of the ontologies is key to
their practicality. The ontologies themselves are com-
posed of a number of different namespaces only a
subset of which need to be imported for any given
task. The ontologies break into three broad cate-
gories:
1. An upper ontology, which may easily be mapped
onto other upper level ontologies as needed. The
upper level deals with concepts such as obser-
vation and place. Figure 1 shows a selection of
classes from this layer.
2. A core ontology, which can be either three or four
dimensional, depending on which file is imported.
When working with large volumes of data it is en-
visaged that the 3D version will be used, whereas
1
http://www.w3.org/TR/prov-o/
Figure 1: The upper ontology - http://purl.org/ub/upper.
the 4D version is better suited to model exchange.
Reification as a means of representing changes
over time was dismissed owing to the prolifera-
tion of triples it engenders and the reduction in
OWL reasoning capability it causes.
3. Task ontologies: Task specific ontologies that
represent a deep study into a particular area.
Timetabling, Rolling Stock and Infrastructure
have been implemented, with more expected to
follow. Figure 2 shows a selection of classes from
the Rolling stock module. Note that this includes
such things as locomotive models.
Figure 2: The Rolling Stock sub-ontology.
Note that whilst all lower levels depend on the levels
above, but none of the higher levels depend on any
level below.
The way the namespaces and thus modules of the
ontology inter-relate may be observed from Figure 3,
which shows the namespaces used by the rolling stock
module. The namespace entitled “vocab” forms a part
of the core ontology, “u” is the upper level. Note that,
where possible, external namespaces are used.
Figure 3: Rail Core Ontology Namespaces.
Another implementation choice was that of com-
plexity. Here there is a clear trade off: expressivity vs
computational complexity. This trade-off has been the
subject of much work, both at this centre and within
KEOD 2015 - 7th International Conference on Knowledge Engineering and Ontology Development
288
the wider community. The result was that each mod-
ule is split into two parts: one containing the T-Box,
which can itself be large, and another containing con-
straints which can be far more expressive. The con-
straints ontology must then be kept as small as possi-
ble. In this case the T-box is where possible OWL-RL
compliant. A-Box raw data uses RDF schema reason-
ing whereas the constraints are as expressive as nec-
essary. The triple store with which this was developed
- stardog - allows for constraints to be reasoned over
separately and at a different complexity to other parts
of the ontology.
3.3 The FuTRO project
The FuTRO project “aimed to show how shared, open
access ontologies and linked data could help the UK
rail industry” (Tutcher et al., 2013). The Railway
Core Ontologies provided the network model used by
this project, in particular information as to which as-
set was monitored by which sensor and how these as-
sets interacted was stored. Listing 1 is an exert of the
pattern used to this end.
Listing 1: Observation Pattern. Note that comments and
labels have been omitted to improve readability.
:associatedObservation
a owl:ObjectProperty ;
rdfs:domain u:IndependentThing ;
rdfs:range u:Observation ;
owl:propertyChainAxiom ( :monitoredBy :
observedEvent ) ;
owl:propertyChainAxiom ( :dependsOn :
monitoredBy :observedEvent ) .
A demonstrator was developed, which used in-
ference to show which lines were unavailable in the
event of mechanical failure of railway switches. Sim-
ilarly the ontology has provision for storing the phys-
ical location of the asset such that maintenance teams
can get get to it.
The project has suggested that very high volume
raw data be stored separately to the ontology and only
a key to it be retained. In the case of the FuTRO
project this was accomplished using the REDIS
2
key
value store where appropriate. Further extensions are
envisaged to allow the ontology to advertise the web
services that could be called to retrieve that data.
Another factor aiding implementation at this time,
which was not available at the time of the InteGRail
project, is the existence of libraries for working with
linked data. The FuTRO project made used of the dot-
NetRDF
3
to allow software written in the “.Net” fam-
2
http://redis.io/
3
http://dotnetrdf.org/ a comprehensive and well docu-
mented open source library for interfacing with triple stores
ily of languages to interface with triple stores, which
removed one of the barriers to integration with the
project industrial partners.
As is stated in the final report (Tutcher et al.,
2013) two demonstrators were produced, along with
that report. The train location demonstrator is pub-
licly available and accessible at: http://purl.org/rail/
trainlocator
It demonstrates the use not only of linked data but
also ontological reasoning, to infer a train’s location
even when it is no longer available in the original for-
mat.
4 FUTURE WORK
Although the core ontologies already capture enough
of the rail domain to demonstrate value many areas
require further modelling effort. Remedying this will
require input from domain experts and as such tools
need to be provided to allow railway engineers to edit
the ontology. In order to align with existing indus-
try best-practice for the procurement of new ICT sys-
tems, these tools should, where possible, be commer-
cial off the shelf software (COTS).
There is also a need to enrich samples of existing
data, documenting the process and making available
sample code for use by stakeholders in the rail in-
dustry. Tools enabling conversion from the popular
“railML” XML format would enable a quicker transi-
tion.
Tools to query the ontology and present the results
to non-technical users are also required. Network Rail
has expressed an interest in work around presenting
Key Performance Indicators to senior managers for
example, beyond the work outlined in Section 2.1.
The prioritisation and selection of data sources
for conversion also requires consideration. The pos-
sibility of using a combination of contractual com-
pulsion and reciprocal data exchange, to encourage
those stakeholders that supply assets (railway infras-
tructure, vehicles etc.) to supply them along with data
sheets in a format compatible with the railway core
ontologies needs examination. Other existing data
will need to be prioritised according to its value to
stakeholders.
The industry also requires design patterns for the
larger system and guidance for using these patterns.
Many means exist for webservices to advertise there
availability and this needs to be standardised such
that all stakeholders can access information. Distri-
bution and ownership of information issues also need
addressing - the current solution, discussed in Sec-
tion 3.1 is only partial.
Position Paper: Ontology in the Rail Domain - The Railway Core Ontologies
289
5 CONCLUSIONS
It is clear that much can be achieved by bringing to-
gether data from different sources; data currently re-
sides in isolated silos and value for all stakeholders
exists where it is brought together. This value exists
as money saved by more efficient maintenance, lower
costs incurred due to equipment failure, less staff time
wasted manually compiling reports and less long term
IT expenditure. Beyond the savings there is also value
from improved passenger perceptions: improved cus-
tomer information has been shown to improve cus-
tomer perceptions of the service. Twinned with im-
proved reliability this has potential to increase rider-
ship and freight traffic.
The use of an ontology as opposed to lighter
weight, data only, standards has multiple advantages:
It is easier and quicker to modify, without affecting
front-end applications. The ontologies can be mod-
ified more quickly than could a traditional standard.
Information from one sub-domain can be reused in
another, and can be combined to easily obtain further
information. The ontology can be self documenting.
The disadvantages are skills, which are less common
than traditional IT skills and computational complex-
ity. These can both be overcome; the lack of IT skills
by provision of well designed software and complex-
ity by carefully addressing the trade off between ex-
pressivity and complexity discussed in Section 3.
All these advantages suggest the Rail Domain On-
tologies are the best way to bring together data in the
UK Rail Domain.
REFERENCES
Dziekan, K. and Kottenhoff, K. (2007). Dynamic at-stop
real-time information displays for public transport: ef-
fects on customers. Transportation Research Part A:
Policy and Practice, 41(6):489–501.
Easton, J., Chen, L., Davies, R., Tutcher, J., and Roberts, C.
(2013a). Ontime - Part D7.1. Technical report.
Easton, J., Chen, L., Davies, R., Tutcher, J., and Roberts, C.
(2013b). RSSB Cross industry Railway Information
Systems Workshop 7 th December 2012. Technical
Report January.
García Márquez, F. P., Lewis, R. W., Tobias, A. M., and
Roberts, C. (2008). Life cycle costs for railway condi-
tion monitoring. Transportation Research Part E: Lo-
gistics and Transportation Review, 44(6):1175–1187.
Garcia Márquez, F. P., Schmid, F., and Collado, J. C.
(2003). Wear assessment employing remote condition
monitoring: A case study. Wear, 255(7-12):1209–
1220.
Gönczy, L., Kövi, A., Andreas, P., Thalhammer, A., and
Stavrakantonakis, I. (2012). e-Freight ontology. Tech-
nical Report 233758.
Köpf, H. (2010). InteGrail – Publishable Final Activity Re-
port. Technical report.
Lorenz, B. (2005). A1-d4 ontology of transportation net-
works.
Natvig, M. K. and Westerheim, H. (2003). Joint effort on es-
tablishment of ARKTRANS–a system framework ar-
chitecture for multimodal transport. Technical report.
Núñez, A., Hendriks, J., Li, Z., Schutter, B. D., and
Dollevoet, R. (2014). Facilitating Maintenance De-
cisions on the Dutch Railways Using Big Data: The
ABA Case Study. In Big Data (Big Data), 2014 IEEE
International Conference on, pages 48–53, Washing-
ton, DC.
Office of Rail Regulation (2012). Network Rail Monitor -
Quarter 1 of Year 4 of CP4 |1 April 2012 21 July
2012. Technical Report July.
Roberts, C., Easton, J., Sharples, S., Golightly, D., and
Davies, R. (2011). Rail Research UK Feasibility Ac-
count ( FA ): Factor 20 reducing CO 2 emissions
from inland transport by a major modal shift to rail (
EP / H024743 / 1 ). Technical report, Birmingham.
Santos, M. Y. and Moreira, A. (2014). Integrating Public
Transportation Data: Creation and Editing of GTFS
Data. 2:53–62.
Technical Strategy Leadership Group (2012). The Indus-
try’s Rail Technical Strategy 2012. The Rail Technical
Strategy 2012. Technical report.
Tutcher, J., Easton, J., Roberts, C., Myall, R., Hargreaves,
M., and Tiller, C. (2013). Ontology-based data man-
agement for the GB rail industry Feasibility study.
Technical report, RSSB.
Umiliacchi, P., Lane, D., Romano, F., and SpA, A. (2011).
Predictive maintenance of railway subsystems using
an Ontology based modelling approach. pages 22–26.
Zarembski, A. M. (2014). Some Examples of Big Data in
Railroad Engineering. In IEEE BigData 2014, Wash-
ington, DC.
Ørnulf, R. J. (2011). A Maritime ITS Architecture for e-
Navigation and e-Maritime: Supporting Environment
Friendly Ship Transport. pages 1156–1161.
KEOD 2015 - 7th International Conference on Knowledge Engineering and Ontology Development
290