The Mid Level Data Collection Ontology (DCO)

Generic Data Collection using a Mid Level Ontology

Joel Cummings and Deborah Stacey

School of Computer Science, University of Guelph, Guelph, Ontario, Canada

Keywords:

Mid Level Ontology, Data Collection Ontology (DCO), BFO, Data Collection, OBO Foundry, Foundational

Ontology, Upper Level Ontology, Domain Ontology.

Abstract:

Capturing data through an ontology is a common goal where instances exist as datums mapping to universal

terms deﬁned in an ontology. Currently these ontologies lack a shared conceptualization for data collection

terms. We propose a mid level Data Collection Ontology (DCO) that deﬁnes data collection terms in a domain

agnostic way enabling extension for domain ontologies to build off of. Such an ontology should provide

reasoning support and enable automated error detection required by all data collection ontologies. By using the

Basic Formal Ontology (BFO) as its base it enables existing OBO foundry ontologies to extend the proposed

ontology in their design allowing existing domain level ontologies an entry point.

1 INTRODUCTION

Collecting data is a common purpose for an ontology

whose terms, descriptions, and relationships describe

universal categories that collected data are arranged

under as instances. Due to the requirement of domain

terms these ontologies are created at the domain level

meaning they only seek to deﬁne terms that reﬂect

the particular domain they operate in, ignoring hierar-

chies and more general terms that may apply to other

areas. The result is an ontology that deﬁnes data col-

lection with a domain speciﬁc view of the world.

We deﬁne ontology as a shared conceptualization

that should seek to deﬁne terms in their most for-

mal regard and should not use terms that are speciﬁc

to particular areas wherever possible (Gruber, 1995).

The data collection components of domain speciﬁc

ontologies provide little in the way of reuse poten-

tial and violate the idea of shared conceptualization

in deﬁning data collection terms. Ontology develop-

ers should therefore strive to produce solutions that

enable reuse among other ontologies instead of re-

deﬁning terms and patterns that can be deﬁned once

and shared (Gruber, 1995). It is for this reason upper

level ontologies exist that seek to deﬁne terms that are

necessary for any ontology. An example is the Basic
Formal Ontology (Bas, 2017), which defines domain
neutral terms that can be used by all ontologies due
to their high level of formalism and domain agnosticism.
BFO starts by organizing terms by where they
sit in the world, based on whether they exist in time and space.

Therefore, upper level ontologies seek to serve all

ontologies regardless of domain or purpose. Mid level

ontologies are a level deeper and build off of upper

level ontologies by providing terms that apply to on-

tologies of a large domain or similar purpose and put

these terms in the appropriate hierarchical space de-

ﬁned by the upper level ontology they extend. This

provides a stepping stone for domain level ontologies

through allowing them to share more speciﬁcally re-

lated terms with other domain level ontologies and

increasing the potential for reuse that upper level on-

tologies offer. In the context of our problem we might

say that our domain or purpose is data collection and

we seek to deﬁne terms to allow data collection re-

gardless of domain. In this case the deﬁnition of mid

level ontologies is what we are interested in. Thus we

ascertain that our problem centers around the creation

of a mid level ontology that serves to deﬁne the data

collection domain. We argue that the creation of a mid

level ontology for the purpose of data collection will

help foster reuse and enable faster creation of domain

level ontologies that collect data through instances.

In this paper we present our idea for what a mid

level Data Collection Ontology (DCO) will look like,

where it ﬁts in the ontology hierarchy and how it will

remain domain independent. Speciﬁcally, we will

look at particular deﬁnitions and where the DCO ﬁts

into the existing ontology framework.

Cummings J. and Stacey D.

The Mid Level Data Collection Ontology (DCO) - Generic Data Collection using a Mid Level Ontology.

DOI: 10.5220/0006497501750182

In Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (KEOD 2017), pages 175-182

ISBN: 978-989-758-272-1

2 BACKGROUND

Ontologies that are designed for reuse in a more gen-

eral form will help to frame our problem and provide

terms as a starting place for our design. In this section

we discuss different types of ontologies and the level

of concern they have for reuse to determine if and

where existing designs or ontologies can be reused by

a generic data collection ontology. We then focus on

ontology categorization in the context of where our

problem is best tackled keeping in mind our desired

high level view of data collection. Finally, we summa-

rize by drawing conclusions on existing designs and

what needs to be done in terms of deﬁning generic

and reusable ontologies.

Ontological research has become widespread in

the design of information systems. Recently the de-

sire for ontologies to span and integrate different

views of a domain and even across domains has come

to fruition (Mascardi et al., 2010). The development

of these ontologies provides the opportunity for sys-

tems to integrate and become interoperable allow-

ing for information sharing (Herre, 2010) (Mascardi

et al., 2010). In this case an ontology acts as a

bridge between systems unifying information (Herre,

2010) and allowing systems to communicate through

the ontology using their standard language and mes-

sage passing techniques. Unifying data allows the

key components of one or more domains to be cap-

tured and shared among ontologies that further deﬁne

a particular domain. The ability for an ontology to

capture a particular domain is related to its viewpoint

of the world; each ontology imposes a particular view

which deﬁnes its ability to share information. This

viewpoint is therefore what we are concerned with.

2.1 Classifying Ontologies

Domain level ontologies are ontologies that seek to

capture a shared conceptualization of a particular do-

main. These ontologies contain domain speciﬁc terms

and may only be linked to a speciﬁc application

(Roussey et al., 2011). Domain ontologies are im-

portant in that they describe the type of data we seek

to capture but for our problem we may not assume

any particular domain to capture data from. A do-

main level ontology, however, could represent the end

product for a system using our ontology.

A core ontology is linked to a particular domain

but has the advantage of providing several view-

points relating to different user groups (Roussey et al.,

2011). Core ontologies are often the result of several

domain level ontologies mapped together (Roussey

et al., 2011). Core level ontologies represent a higher

level of term generality as they seek to span and

provide deﬁnitions for a wider domain or domains.

From our perspective core level ontologies are cer-

tainly closer but still maintain the requirement of do-

main speciﬁc content within them and cannot generi-

cally be applied to any domain.

Foundational or upper level ontologies can be

summed up with the following deﬁnition: a founda-

tional ontology seeks to provide deﬁnitions and terms

that are general to all domains. (Mascardi et al.,

2007). They serve as a building block for future on-

tologies by enabling reuse since they deﬁne common

terms that will be contained by domain level ontolo-

gies. The goal of an upper level ontology is to avoid

the redeﬁnition of common terms to allow for easier

and consistent reuse of deﬁned terms. In other words

they provide a single agreed upon deﬁnition of terms

(Mascardi et al., 2010) (Roussey et al., 2011). More

importantly however is the fact that they are designed

to support all domains which differs from core or do-

main ontologies that only deﬁne terms for their par-

ticular domain, likely choosing speciﬁc (overloaded)

deﬁnitions over general deﬁnitions (Roussey et al.,

2011).

A mid level ontology seeks to provide a bridge

between an upper ontology and a domain level on-

tology by providing terms that will be common to

several domain level ontologies or areas of domain

level ontology (Ceusters and Smith, 2015). Therefore
mid level ontologies serve a similar purpose to
the upper level ontology by preventing term redefinition
and providing consistent relationships, but at a
more specific level. This has several advantages in
addition to avoiding redefinition. Firstly, it provides
a common understanding between derived ontologies
through similar terms, structures and relations. Secondly,
it provides a more streamlined starting place
for those new to the construction of ontologies by providing
terms more closely related to their domain than
those of upper level ontologies. In terms of an ontology
category hierarchy the mid level ontology falls in the
middle, with domain level ontologies extending mid
level ontologies and mid level ontologies extending

upper level ontologies. The full hierarchy can be seen
in fig. 1.

Upper Level Ontologies: General terms for all ontologies to be based off. Enforces high level structure.
Mid Level Ontologies: Bridge the gap between generic upper level terms to domain level terms.
Core Ontologies: Core ontologies define multiple view points or multiple domains.
Domain Level Ontologies: Ontologies that define the view a particular domain has of the world.
Application Ontologies: Local or Application ontologies define a view specific to a particular application.

Figure 1: Ontology Classification Hierarchy.

One might then wonder why the work put into
the development of upper level ontologies has not resulted
in a common ontology that is shared among
all domain level ontologies. One particular reason for
this is down to implementation, where languages implemented
by computer scientists are based on set theory
that captures abstract content well but does not
capture the concrete objects and their relationships
well enough to be completely generic (Degen et al.,
2001). More recently the author of the General Formal
Ontology (GFO) stated that we may not be able

to meet such a lofty goal at all (Herre, 2010). How-

ever, for our purposes this is still acceptable since we

must work with what is available. With that in mind

we will focus on available upper level ontologies to

seek a design that best meets our needs. All of our

ontologies are sourced from Mascardi et al (Mascardi

et al., 2007) due to their capturing of relatively recent

and active implementations. For evaluation we will
define criteria that will help us to draw conclusions
based on upper level ontology design, purpose, and
applications.

The first criterion we define is based on the number
of terms and relations in the ontology, where we
prefer to have fewer of each for two main reasons.
Firstly, upper level ontologies are meant to be derived
into domain level ontologies and thus will have more
terms and relations added over time, and large ontologies
introduce performance penalties, potentially resulting
in an ontology that is intractable for a reasoner
(Horrocks, 2005). Secondly, in terms of understandability,
the fewer terms a person must know to use an
ontology the easier it is to get started. Furthermore,
large ontologies may deter usage of the ontology altogether.

The second criterion we care about is usage and

popularity. Popularity of an upper level ontology is

important when considering its purpose of unifying

ontologies. Also, an upper level ontology must be free

from any domain speciﬁc terms or relations. Finally,

we are not interested in ontologies that take the role of

deﬁning thousands of terms to satisfy a large number

of domains since it is unlikely such an ontology could

satisfy each domain realistically.

Ontologies considered include: the Basic For-

mal Ontology (BFO), the General Formal Ontology

(GFO), a Descriptive Ontology for Linguistic and

Cognitive Engineering (DOLCE), and the Suggested

Merged Upper Ontology (SUMO).

For the sake of brevity we will focus on our choice

based on the criteria defined above, although all versions
were evaluated and it is likely a case could be
made for any of the above upper level ontologies. Our
choice was the Basic Formal Ontology (BFO), which
is an upper level ontology with development starting
in 1998 (Mascardi et al., 2007). It currently consists
of 35 classes in version 2, making it relatively

small (Bas, 2016). BFO is commonly applied in the

biology domain but does exist in a number of other

domains and is used in over 150 ontologies as of

this writing (Bas, 2016). BFO itself contains no do-

main speciﬁc content, and focuses on describing ob-

jects through time and space which is common to all

physical objects. It considers both abstract and con-

crete terms and seeks to deﬁne terms based on their

lifespan as either occurrent or continuant, with occur-

rent deﬁning objects that exist during a period of time

while continuant objects exist throughout time (Bas,

2016).

In terms of our criteria, BFO does well through its

deﬁnition of only high level concepts involving time

and space, maintaining a small size making the ontol-

ogy suitable for additional development and for rea-

soning. In terms of popularity and usage we examined

resources on the web to see how many ontologies cite

themselves as using each considered upper level on-

tology. On the BFO web page they cite well over 150

ontologies or projects using BFO (Bas, 2016). The

important point here is that the projects stem from

more than just the biological ﬁeld which was not the

case for other entries.

Based on our defined criteria, BFO fares best,
which is why we will focus on discussing BFO and
why it best fits our needs. The Basic Formal Ontology
is in its second major version; therefore we will
focus on that version in discussion of terms and structure,
although the first version is quite similar (Bas,
2016).

2.1.1 Mid Level Ontologies

BFO provides a starting point for the creation of an

ontology but does not give direction about where to

stop development which can go to various levels (see

section 2.1). We propose a mid level ontology as the

design target for the problem and in this section dis-

cuss why that choice best reﬂects the problem, our

deﬁnition of ontology, and works with the chosen up-

per level ontology. We will start at examining how

the problem ﬁts with this design as well as discussing

downsides to the design.

Mid level ontologies seek to deﬁne a domain that

is at a very high level and span multiple ontologies.

They therefore are generally independent and deﬁne

terms at a high level to avoid conﬂict with ontologies

that will extend them. This ﬁts well with our prob-

lem since it is expected that our solution will form

the basis of a domain ontology but not be exhaustive

in term definition. Second, in terms of our definition,

they seek to deﬁne terms as generically as possible

but also while avoiding redesigning existing ontolo-

gies and redeﬁning terms.

Another important part of mid level ontology

compatibility is the source ontology. The OBO

foundry provides the framework and existing ontolo-

gies that are developed using BFO and demonstrates

existing mid level ontologies that are active. This

demonstrates merit to the proposed pattern as it pro-

vides concrete examples functioning with the Basic

Formal Ontology (WG, 2017). Furthermore, the Ba-

sic Formal Ontology does not have a derived ontol-

ogy that exists for this particular problem demonstrat-

ing a gap in existing mid level ontologies into which

our solution could ﬁt. The design of BFO has taken

into consideration mid level ontologies with working

examples of mid level ontologies and domain level

ontologies utilizing those mid level ontologies (WG,

2017).

3 ONTOLOGY DESIGN

This section is dedicated to an overview of the Data

Collection Ontology (DCO), its components, rela-

tions, and design choices that make it suitable for data

collection. The DCO is designed as a mid level ontol-

ogy that extends the Basic Formal Ontology (BFO)

to organize and provide placement for data collection

terms. The DCO seeks to provide domain indepen-

dent deﬁnitions as a starting place for domain data

collection developers. Due to the fact that the DCO

is a mid level ontology and is in its early design it is

by no means ﬁnished and is expected to change over

time. Like an upper level ontology it may be found to

be incorrect or lacking and will need to be updated.

With its purpose in mind we start by noting the

design intentions, in other words, how it is expected

to be used by domain ontology developers. We then

move on to discussing the components and relations

of the ontology to understand why particular compo-

nents exist and how they contribute to the intended

use of the ontology.

3.1 Design Intentions

The design intentions are an important place to start

since they set the basis for how one is expected to

use the DCO. The design has a philosophy about how

data collection should be performed and does so at a

high level allowing for more speciﬁc work ﬂows to

be integrated. This view is based on ﬁrst describing

what you are collecting; these are subjects which rep-

resent a timeless view of your object. The DCO uses

BFO's independent continuants to define subjects that

describe objects as they are in concept but not as an

instance that exists in time and space. Instead, your

captured data are represented as instances and have a

type of the subject. DCO also includes processes to

capture how data is collected and what stages it goes

through. This is common in data collection activities

such as surveys or cyclic forms of collection. Addi-

tionally DCO places stress on types and units through

the deﬁnitions of datums that capture both measures

and units of measure ensuring all values are labelled

appropriately. The ﬁnal portion of the ontology con-

sists of classiﬁers that are entities in BFO since they

are time and space irrelevant and may be used to clas-

sify any type. In this case it was felt that classiﬁers

should not be restricted to time or space due to their

function of classifying any type.

Classiﬁers are hierarchies of terms over which

one deﬁnes equivalence relations to deﬁne what con-

stitutes this particular category. Classiﬁers are de-

signed around the suspicions or anecdotal estimates

of what range one expects data to fall into. Clas-

siﬁers are designed to be populated with instances,

which exist as individuals of any type in the ontol-

ogy. These instances are then grouped based on the

reasoner and can be queried to determine if they are

of the expected type when entered in the ontology. In

other words, it allows validation of the estimates or

anecdotal data one has. Classiﬁers provide additional

advantages concerning data validity in that they are

non-destructive whereas traditional approaches may

place strict boundaries on collected data, removing

instances that do not ﬁt. Classiﬁers allow invalid or

inconsistent data to be ﬁltered but not permanently re-

moved if an inconsistency in classiﬁcation is detected.

This supposes there is a dynamic aspect of the ontol-

ogy that over time will be shaped by the instances that

it collects and that deﬁnitions will be challenged.
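As a rough illustration of the non-destructive validation described above, the following sketch uses plain Python rather than OWL. The names `Classifier`, `Instance`, `classify`, and `find_discrepancies` are hypothetical and not part of the DCO, and the simple range test is only an assumed stand-in for the equivalence relations a reasoner would evaluate.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Classifier:
    """Hypothetical category defined by the range a value is expected to fall into."""
    name: str
    minimum: float
    maximum: float

    def accepts(self, value: float) -> bool:
        return self.minimum <= value <= self.maximum


@dataclass
class Instance:
    """A collected data point plus the classifier it is expected to fall under."""
    name: str
    value: float
    has_expected_type: str


def classify(instance: Instance, classifiers: List[Classifier]) -> Optional[str]:
    """Return the name of the first classifier whose range contains the value."""
    for classifier in classifiers:
        if classifier.accepts(instance.value):
            return classifier.name
    return None


def find_discrepancies(
    instances: List[Instance], classifiers: List[Classifier]
) -> List[Tuple[str, str, Optional[str]]]:
    """List (instance, expected type, inferred type) where the two types differ.

    Nothing is deleted: the caller decides whether the datum or the
    classifier definition is at fault.
    """
    found = []
    for inst in instances:
        inferred = classify(inst, classifiers)
        if inferred != inst.has_expected_type:
            found.append((inst.name, inst.has_expected_type, inferred))
    return found


classifiers = [Classifier("Low", 0.0, 49.9), Classifier("High", 50.0, 100.0)]
instances = [
    Instance("reading1", 12.0, "Low"),
    Instance("reading2", 75.0, "Low"),    # expected Low, falls in High
    Instance("reading3", 140.0, "High"),  # outside every range
]
discrepancies = find_discrepancies(instances, classifiers)
```

Here all three instances remain in `instances`; the two mismatches are only reported, which mirrors the idea that inconsistent data is filtered rather than removed.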

3.2 Ontology Components

With the high level view of DCO established we can

further break down its main components: Subjects,

Processes, Data Qualities, Classiﬁers, and Meta Data.

In this section each of these terms are deﬁned. We

then move on to using the components with the de-

ﬁned object and data properties to form a working ex-

ample of the DCO. Due to the main premise of the

DCO, or any mid level ontology, the DCO only de-

ﬁnes terms at a relatively high level.

Table 1: Object Relations.

Relation | Description
has part | Allows individuals to be composed of other instances. This is important where data is captured on different parts of a larger item or data is aggregated into a larger sum. Composition should not be thought of only in terms of physical objects having parts.
has measure | Measurements are considered any numerical value one captures and links to an individual. Note that this is an object property so it forces one to link to some descriptor for the value.
Subclass of has measure:
has measurement datum | This will be one of the most common properties as it links measurement datums to individuals so data is annotated with units.
has measurement unit | This provides a link for unit definitions to measurement datums.
has time stamp | Links a time value to some measurement datum that contains some time unit allowing a universal way to save time in an ontology.

3.2.1 Classes

Subjects represent what data is being collected from

or about. Subjects can be either physical objects

or concepts, meaning types can be either material

or immaterial. Subjects are designated as indepen-

dent continuants, meaning they should only repre-

sent high level subjects. For example, if we are sur-

veying people then the subject may be a person and

we would deﬁne person at a universal level while in-

stances may have a relationship with the person sub-

ject, i.e. part to Person but are themselves occur-

rent and do exist in space time. Additionally, through

BFO's Role class, one can assign the roles that Subjects
may operate in.

Processes fall under the BFO definition with extensions
provided by DCO for convenience and allow
one to support both state driven and independent
processes. State driven processes require one process
block to finish before another can start while independent
processes can have any number of process blocks
running concurrently.

Table 2: Object Relations Continued.

Relation | Description
contains process | Links a process to an object. For example, some subject may go through some process that data is captured on. The data collection may itself be a process and have a relation to another process.
has quality | Allows instances to possess particular qualities or require particular qualities on data being classified.
has object control | Used for objects that act as a control. For example, in a process something may be a terminator.
branches to | Supposes that an instance in a process will branch to another instance when it has completed. Allows for order to be captured.
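The state driven case can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `ProcessBlock` and `state_driven_order` are hypothetical names, and the fields `branches_to` and `can_repeat` are modelled loosely on the relations of the same names in the DCO tables rather than taken from the ontology file itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ProcessBlock:
    """Hypothetical stand-in for a process block individual."""
    name: str
    branches_to: Optional[str] = None  # block that starts once this one finishes
    can_repeat: bool = False           # whether the block may run again in a cycle


def state_driven_order(blocks: Dict[str, ProcessBlock], start: str) -> List[str]:
    """Follow branches_to links from start and return one pass through the blocks."""
    order: List[str] = []
    current: Optional[str] = start
    while current is not None:
        if current in order:
            if not blocks[current].can_repeat:
                raise ValueError(f"{current} is revisited but cannot repeat")
            break  # cyclic process: stop after recording a single pass
        order.append(current)
        current = blocks[current].branches_to
    return order


# A linear survey process: each block must finish before the next starts.
survey = {
    "Recruit": ProcessBlock("Recruit", branches_to="Interview"),
    "Interview": ProcessBlock("Interview", branches_to="FollowUp"),
    "FollowUp": ProcessBlock("FollowUp"),
}
survey_order = state_driven_order(survey, "Recruit")

# A cyclic form of collection: Measure is allowed to repeat.
cycle = {
    "Measure": ProcessBlock("Measure", branches_to="Record", can_repeat=True),
    "Record": ProcessBlock("Record", branches_to="Measure"),
}
cycle_order = state_driven_order(cycle, "Measure")
```

Independent processes would, under the same assumption, simply carry no `branches_to` links, so no ordering is imposed on their blocks.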

Classiﬁers are where equivalence relations are de-

ﬁned to classify instances in your ontology. Classi-

ﬁers are where one would normally deﬁne the range

that data is expected to fall into so as to form a par-

ticular category or type. One may think of a classiﬁer

as having the ontology assign a type to an individual

based on its understanding. Classiﬁers can be thought

of as the dynamic component of the ontology. They

are designed to change as data may prove them to be

invalid or individuals may change if they are proven

invalid based on the ontology’s view of the world. For
an example of how classifiers look see fig. 2.

Meta Data are descriptors that exist to deﬁne a

data point or complex structure that one expects an

individual to contain. Meta data describes the types

and units that data will exhibit allowing one to cap-

ture data in multiple formats and multiple units but

have it link to the same individual type without caus-

ing confusion later. An example case of this would

be if a study were conducted across North America

where in Canada the metric system dominates while

Americans use the imperial system. Data could be

captured for the same study using different meta data

classes to describe the units.
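The cross-border study mentioned above can be illustrated with a small sketch in which every value carries its unit. The classes `MeasurementUnit` and `MeasurementDatum` and the field `to_centimetres` are hypothetical names chosen for this example; only the idea of pairing a value with a unit description comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasurementUnit:
    """Hypothetical unit description; the factor converts one unit to centimetres."""
    name: str
    to_centimetres: float


@dataclass(frozen=True)
class MeasurementDatum:
    """A numerical value that is never stored without its unit."""
    value: float
    unit: MeasurementUnit

    def in_centimetres(self) -> float:
        return self.value * self.unit.to_centimetres


CENTIMETRE = MeasurementUnit("centimetre", 1.0)
INCH = MeasurementUnit("inch", 2.54)  # one inch is defined as exactly 2.54 cm

# The same kind of measurement captured in two unit systems.
canadian_height = MeasurementDatum(180.0, CENTIMETRE)
american_height = MeasurementDatum(70.0, INCH)

# Both datums can be compared once expressed in a common unit.
taller_is_canadian = canadian_height.in_centimetres() > american_height.in_centimetres()
```

Because the unit travels with the value, the two records can describe the same individual type without the later confusion the text warns about.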

Data Qualities exist to deﬁne restrictions and set

theory properties on instances that can be used as a

part of the classiﬁers to group instances or as a part

of a larger system. Examples of data qualities include

boundedness, cardinality, and equality.

[Figure 2 diagram. Labels: Classifier; ClassifierEX1; ClassifierEX2; Equivalency Relations; InstanceEX1; InstanceEX2; Has_expected_type "ClassifierEX1".
Annotation: Classifiers group instances based on equivalency relations using a reasoner where the has expected type serves to store the classifier you expect an instance to be grouped under.
Annotation: Based on the has_expected_type one can see where there is a discrepancy between the reasoned type or the ontology view of the world and the expected view highlighting an error on either side.]

Figure 2: Classifiers.

3.2.2 Relations

In addition to the classes deﬁned, DCO deﬁnes sev-

eral high level relations that are designed to be sub-

classed and added to. Object relations are summed up

in Tables 1 and 2 with data relations being summed

up in Table 3.
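What it means for relations to be subclassed can be sketched as a parent lookup, again in plain Python and only as an illustration. The `PARENT` mapping reflects one reading of the "Subclass of" rows in the tables, `has top speed datum` is a hypothetical domain-level addition, and `is_a` and `query` are hypothetical helper names.

```python
from typing import Dict, List, Optional, Tuple

# Each relation maps to its parent relation, or None for a top-level relation.
PARENT: Dict[str, Optional[str]] = {
    "has measure": None,
    "has measurement datum": "has measure",
    "has measurement unit": "has measure",
    "has value": None,
    "has maximum": "has value",
    "has minimum": "has value",
    "has top speed datum": "has measurement datum",  # hypothetical domain extension
}

Triple = Tuple[str, str, str]  # (subject, relation, object)


def is_a(relation: str, ancestor: str) -> bool:
    """True if relation equals ancestor or is a direct or indirect subclass of it."""
    current: Optional[str] = relation
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False


def query(triples: List[Triple], relation: str) -> List[Triple]:
    """Return triples whose relation is the given one or any subclass of it."""
    return [t for t in triples if is_a(t[1], relation)]


triples = [
    ("engine1", "has measurement datum", "power1"),
    ("car1", "has top speed datum", "speed1"),
    ("power1", "has maximum", "400"),
]
measures = query(triples, "has measure")
```

A query against the general relation also returns links stated with the domain-specific one, which is the reuse benefit of extending the high level relations instead of defining unrelated new ones.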

4 WORKING EXAMPLE

As an example of how the design works we will con-

struct a very basic ontology around collecting vehi-

cle performance data with a goal of comparing the

consistency of output ﬁgures against other instances

(vehicles) of the same type. The ﬁrst subject of our

collection will be Vehicle which is the most generic

object. The Vehicle subject will describe what a vehi-

cle is composed of from the performance perspective

as this is the view our ontology has of the world. For

example, every vehicle has an Engine and a Transmis-

sion so we will deﬁne those as other subjects since we

are interested in these components as they alter a vehicle's
performance substantially. Our last subjects will

be the Make and Model since we need to compare

like vehicles and therefore need to know who manu-

factured them.

Now we deﬁne the relations between our subjects.

Vehicles are made up of an Engine and a Transmission

so we can use composition to deﬁne a Vehicle having

those parts. DCO deﬁnes the part of relation which

allows us to produce a composite relationship. Additionally,
models are produced under some Make and
we can consider them part of what a company produces.
For our example, we can use the has part for
Vehicle to the Engine and Transmission and we can
subclass part to to include example of for Vehicle to
Model while adding produces to has part to state that
a manufacturer produces Vehicles and Models.

Table 3: Data Properties.

Relation | Description
has expected property | Denotes what property this value is intended to represent. This is designed primarily for external use where a value may link with a variable.
has expected type | Denotes what type we expect an instance to be. This is intended to be used in conjunction with classifiers allowing ontology verification based on expected types. It is additionally intended to be used to link to external systems where we want an instance to link to a particular type.
has control | Represents data values that act as controls such as booleans that alter the flow of a process.
Subclass of has control:
can repeat | Denotes whether a particular entity can repeat such as a process block. Some processes may be cyclical.
has sequence | Denotes a sequence value that may be used to order process blocks or other entities.
has value | The base compositor for values allows an instance to be composed of particular values.
Subclass of has value:
has coordinate value | Used for denoting the location of instances.
has maximum | Represents a maximum expected value for an instance to have; good for creating ranges.
has minimum | Represents a minimum expected value for an instance to have; good for creating ranges.
has time value | Links time values to instances. Note that format is independent and can be any type based on the ontology design.
has percentage | Values can be captured as percentages.
has measurement value | Used to link measurement values to measurement datums.

Moving on to the data we would like to capture,

we will deﬁne measurement datums that will capture

key performance points for a vehicle. In this sim-

ple example we would like to capture the power and

torque the vehicle produces so we will deﬁne some

common units. Power is measured commonly us-

ing horsepower and kilowatts while torque is mea-

sured commonly using foot pounds and newton me-

ters. These are deﬁned as instances under Power Unit

and Torque Unit measurement units respectively. Fi-

nally we create datums for power that requires a nu-

merical value and some power unit as well as torque

that requires a numerical value and some torque unit.

With these measures deﬁned we will say that a sub-

class of an Engine requires at least one of each mea-

sure using has measurement datum relation. Datums

are also deﬁned for fuel economy in a similar way

with fuel economy units and a Vehicle having mea-

sures for city, highway, and combined fuel economy.
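The requirement that an engine carries at least one power and one torque datum, each with a suitable unit, can be sketched as a simple check. This is plain Python standing in for what the ontology's restrictions would express; `Datum`, `Engine`, `UNITS`, and `missing_measures` are hypothetical names, and the numeric values are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Datum:
    """Hypothetical measurement datum: a kind of measure, a value, and a unit."""
    kind: str   # "power" or "torque"
    value: float
    unit: str


# Unit instances named in the text, grouped by the measure they belong to.
UNITS = {
    "power": {"horsepower", "kilowatt"},
    "torque": {"foot pound", "newton meter"},
}


@dataclass
class Engine:
    name: str
    has_measurement_datum: List[Datum] = field(default_factory=list)


def missing_measures(engine: Engine) -> List[str]:
    """Report required measures that are absent or carry a unit of the wrong kind."""
    problems: List[str] = []
    for kind in ("power", "torque"):
        datums = [d for d in engine.has_measurement_datum if d.kind == kind]
        if not datums:
            problems.append(f"no {kind} datum")
        for d in datums:
            if d.unit not in UNITS[kind]:
                problems.append(f"{kind} datum uses unknown unit {d.unit!r}")
    return problems


complete = Engine("engineA", [
    Datum("power", 300.0, "horsepower"),
    Datum("torque", 400.0, "newton meter"),
])
incomplete = Engine("engineB", [Datum("power", 220.0, "newton meter")])

complete_problems = missing_measures(complete)
incomplete_problems = missing_measures(incomplete)
```

The second engine is reported twice: once for a power value labelled with a torque unit and once for the missing torque datum.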

The design can be illustrated as seen in ﬁg. 3

where datums and units are deﬁned as well as subjects

linking to their respective datums. This is the general

structure expected for data that is to be collected on

subjects.

Now since we are capturing data on vehicle per-

formance we may deﬁne classiﬁers that are based

on estimations of what we expect. For this example

let us say we are verifying vehicles are within their

rated power measurements so we will deﬁne classi-

ﬁers based around manufacturer provided power and

torque ratings for a particular vehicle and apply some

expected variance to create boundaries. These clas-

siﬁers will use range values around the power and

torque measurement datums we just deﬁned. This al-

lows us to create classiﬁers around a particular model

using values for the Make and Model as well as ranges

for power and torque for a particular engine to group

our vehicles. The greatest importance here is when

populating the ontology we must use the has expected

type relation and link to the corresponding make and

model classiﬁer to allow the data and the ontology to

be validated. This is done by using the reasoner to

add the type of our added instances and then querying

the reasoner for the intersection of instances that are

not the same type as the has expected type URI value

stated. In other words, this indicates that there is an error

with the vehicle that data was captured on or that the

ontology has an inaccurate view of what values are

valid for that vehicle. In creating classiﬁers it is useful

to assign a name that relates to what you are capturing

so has expected type values can be application or hu-

man generated very easily. For example, in our case

we might name a classiﬁer DodgeRam57Auto to de-

note that we expect this classiﬁer to group all Dodge

Ram pickups with the 5.7 litre engine and automatic

transmission, making it easy to generate the URI

with the data we use to populate an instance.
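A short sketch of how such a name might be generated from the data used to populate an instance, and how a rating with an expected variance becomes a range. The record fields, the functions `expected_type_name` and `power_bounds`, and all numbers are assumptions made for illustration; they are not manufacturer figures or part of the DCO.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VehicleRecord:
    """Hypothetical record holding the fields used to populate an instance."""
    make: str
    model: str
    engine_litres: float
    transmission: str        # e.g. "Auto" or "Manual"
    measured_power_hp: float


def expected_type_name(record: VehicleRecord) -> str:
    """Build a classifier name such as DodgeRam57Auto from the populated fields."""
    engine = f"{record.engine_litres:.1f}".replace(".", "")
    return f"{record.make}{record.model}{engine}{record.transmission}"


def power_bounds(rated_hp: float, variance: float) -> Tuple[float, float]:
    """Range allowed around a rating, e.g. variance=0.10 for plus or minus 10%."""
    return rated_hp * (1 - variance), rated_hp * (1 + variance)


record = VehicleRecord("Dodge", "Ram", 5.7, "Auto", measured_power_hp=352.0)
name = expected_type_name(record)

# Illustrative rating and variance only.
low, high = power_bounds(rated_hp=400.0, variance=0.10)
within_rating = low <= record.measured_power_hp <= high
```

With these illustrative numbers the measured value falls below the lower bound, so a reasoner-style check would not place the instance under the classifier named by its has expected type value, flagging either the vehicle or the assumed range for review.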

5 CONCLUSIONS

We have presented our Data Collection mid level on-

tology (DCO) which is a mid level ontology provid-

ing domain agnostic data collection terms and provid-

ing reasoning capability for derived ontologies. Based

upon BFO, this ontology provides an entry point for

OBO foundry ontologies that already make use of

BFO as their source. Additionally through the use of

the BFO and deﬁning high level terms the DCO will

achieve its goal of domain agnosticism.

The use of a mid level ontology for data collection

provides several key advantages over domain level

designs starting with deﬁning common terms that all

data collection ontologies will require in some form.

Secondly, it sets up ontologies for reasoning and uses

the reasoner for much of the typing and labelling of

data to allow for automated error checking and error

prevention.

In a deeper context the DCO provides greater benefit
to systems at large through its classifiers, which
provide two key benefits. Firstly, they allow data inconsistencies
to be caught by comparing the assigned
type to the has expected type where the classifier
is known to be good. Secondly, they allow estimated
ranges in the ontology's world view to be
validated: a range may be proven invalid when the
assigned type and has expected type do not match but
the datum is known to be good. This allows the ontology
to help perform “error detection” in an external
system through the use of classifiers. It can also
allow external systems to resolve inconsistencies in the
ontology itself, creating a dynamic ontology design.

Finally, the existence of a mid-level data collec-

tion ontology based on BFO also serves the existing

ontological community using the BFO ontology who

have a domain ontology and are interested in adding

data collection to their domain ontology or are look-

ing to classify existing ontology instances.

For the full Data Collection Ontology, please

email either of the authors who will provide it to you

in various formats.

[Figure 3 diagram. Labels: Subject; Vehicle; Engine; Measurement Datum; Power Datum; Fuel Economy Datum; Measurement Unit; Power Unit; Consumption Unit; Horsepower; LB/FT; l/100km.
Restrictions shown: Has_part some Engine; Has Unit exactly 1 Power Unit; Has unit some Consumption Unit; Has Measurement Datum exactly 2 power datums; Has Measurement Datum exactly 3 Fuel Economy Datum.
Annotation (Subjects): Two Subjects are defined for fuel economy in this case, the engine and a vehicle that data will be captured from.
Annotation: Measurement datums are defined for the measures that are expected to be captured from subjects. They serve to link values with some unit type.
Annotation (Measurement Units Defined as Instances): Measurement types are divided into broad categories: power units for engines and consumption for vehicle fuel economy.]

Figure 3: The Vehicle Ontology.

REFERENCES

(2016). Basic formal ontology (BFO) — users.

http://ifomis.uni-saarland.de/bfo/users. (Accessed on

02/12/2017).

(2017). Basic formal ontology. http://www.obofoundry.org/ontology/bfo.html. (Accessed on 03/26/2017).

Ceusters, W. and Smith, B. (2015). Aboutness: Towards foundations for the information artifact ontology. pages 1–5.

Degen, W., Heller, B., Herre, H., and Smith, B. (2001). GOL: A general ontological language. In Formal Ontology and Information Systems. Citeseer.

Gruber, T. R. (1995). Toward principles for the design of ontologies used for knowledge sharing. Int. J. Hum.-Comput. Stud., 43(5-6):907–928.

Herre, H. (2010). General Formal Ontology (GFO): A Foundational Ontology for Conceptual Modelling, pages 297–345. Springer Netherlands, Dordrecht.

Horrocks, I. (2005). Description logics in ontology applications. In International Conference on Automated Reasoning with Analytic Tableaux and Related Methods, pages 2–13. Springer.

Mascardi, V., Cordì, V., and Rosso, P. (2007). A comparison of upper ontologies. In WOA, volume 2007, pages 55–64.

Mascardi, V., Locoro, A., and Rosso, P. (2010). Automatic ontology matching via upper ontologies: A systematic evaluation. IEEE Transactions on Knowledge and Data Engineering, 22(5):609.

Roussey, C., Pinet, F., Kang, M. A., and Corcho, O. (2011). An introduction to ontologies and ontology engineering. In Ontologies in Urban Development Projects, pages 9–38. Springer.

WG, O. T. (2017). The OBO foundry. http://www.obofoundry.org/. (Accessed on 03/26/2017).