DATA INFRASTRUCTURES IN AGRICULTURE

Attempts at Interoperability

Daniel Martini and Mario Schmitz

Association for Technology and Structures in Agriculture (KTBL), Bartningstraße 49, Darmstadt, Germany

Keywords:

Agriculture, Interoperability, Service infrastructure, Web oriented architecture.

Abstract:

Agriculture presents itself as an interesting conglomerate of different domains. It is an intersection of a bunch

of natural sciences like biology, chemistry, geography with business, legal and political issues. As diverse

as the disciplines involved in agriculture are the demands on data management and exchange. This poses a

special challenge on interoperability of data formats and services. Additional requirements arise from the size

and structure of enterprises involved in farming and in provision of farm management information systems.

From our work in agricultural data exchange, we present experiences and try to derive how data standards

touching this domain should develop to allow for future interoperability.

1 INTRODUCTION

Whereas other industries can produce under mostly

controlled conditions in closed systems like factories,

agriculture is embedded into a natural, economic and

political environment with diverse inﬂuencing factors,

some of them only partly well understood. Farm man-

agement today is the art of correct interpretation of a

variety of information from different sources describ-

ing this environment. Electronic data acquisition and

analysis play an increasing role for the farmer. How-

ever, data has to be retrieved from a number of places.

It has to be handed on to parties imposing differ-

ent requirements on structure or format than the ones

present during original production of the data. Differ-

ent analysis methods demand for different points of

view and aggregation of data. All this leads to ma-

jor challenges in interoperability and on how to build

service infrastructures in agriculture. Current state-

of-the-art is far from facilitating data exchange for the

farmer. There are diverse standards playing a role in

agriculture. A lot of the data items present are not

easily mapped from one to another either due to ﬁne

semantic differences or inconcise conceptualizations.

As the farm management information systems devel-

opers cannot judge on these issues themselves either,

most decisions are left to the user leading to a lot of

human interaction in advance of a data exchange.

In this paper, we present the types of data, stan-

dards, requirements on service infrastructures and po-

tential uses of semantic technologies and ontologies

in agriculture. It is by no means an exhaustive sci-

entiﬁc analysis of the problem. Rather, the paper

shows experiences gained during standardization of

agroXML, an XML based standard in agriculture in

Germany. We will derive the need for action and pos-

sible further activities in the area of knowledge acqui-

sition and management in the agricultural domain.

2 TYPES OF DATA AND THEIR

SOURCES

2.1 Geographic Data

The ﬁeld is the central production unit in arable farm-

ing. Even the basic geometry of the ﬁeld has to be

considered to achieve optimum performance during

production. The decision into which direction to till

has a major inﬂuence on the economical outcome of

work procedures. New methods in farming like e. g.

precision farming continue to deliver an ever increas-

ing amount of spatially referenced data. Harvested

amounts per small scale area units are determined au-

tomatically by sensors in combine harvesters and later

on produced into maps. Future measures in the ﬁeld

can be ﬁne-tuned according to results gained from

analysing these maps. In some cases, an overlay with

further geographic data like e. g. remote sensing data

287

Martini D. and Schmitz M. (2009).

DATA INFRASTRUCTURES IN AGRICULTURE - Attempts at Interoperability.

In Proceedings of the International Conference on Knowledge Engineering and Ontology Development, pages 287-292

DOI: 10.5220/0002308202870292

 SciTePress

or aerial imagery is necessary.

A lot of representational aspects of spatial data

are standardized by the Open Geospatial Consortium

(OGC). The Geography Markup Language (GML,

(Cox et al., 2004)) provides basic data types for spa-

tial objects like polygons, points and linestrings. The

standard is well designed allowing reuse of parts in

other XML based vocabularies using either proﬁles

or imports of necessary schema modules. However,

as such, it is a meta-standard. It does not say any-

thing about how to represent certain real world ob-

jects. If GML is to be used in a service, develop-

ers usually provide their own schema following the

feature-property model mandated by the speciﬁca-

tions. While this ensures uniﬁed modelling proce-

dures, it does not provide for any kind of (automated)

interoperability between different OGC conformant

web services. Semantic interpretation and agreement

on the content delivered is up to the developers of the

systems which are to be connected.

2.2 Data on Operating Supplies

Operating supplies used in agriculture have a variety

of properties of interest to the farmer. In some of the

cases, this information is provided as a printed ac-

companying sheet. Seeds for example have to have

variety information and other quality parameters like

germinating capacity determined and printed onto la-

bels accompanying product batches. However, there

are also cases, where additional information might be

retrieved in digital form. Plant variety ofﬁces often

have further data on average quality parameters de-

rived from ﬁeld trials. In Germany for example, this

information is available in downloadable comma sep-

arated value ﬁles.

In pesticide usage, the farmer has to follow appli-

cation rules. In most cases, properties of pesticides

necessary to correctly implement these rules like e. g.

the waiting period between application and harvest,

have to be published by agencies after tests and an

approval procedure. While this information can often

also be retrieved via the internet, it is in most cases

embedded in a poorly structured form into web pages

not ready for automated extraction and reuse in agri-

cultural software systems.

Roughly the same goes for veterinary drugs. Ap-

proval and application of pharmaceutics are strongly

regulated in Europe. A lot of data is available through

European Medicines Agency (EMEA), however doc-

uments are published in a variety of more or less

structured and standardized formats (Microsoft Word

or PDF ﬁles amongst them).

All in all, data as such is readily available. Lack-

ing are however machine-readable forms allowing for

reuse in decision support systems.

2.3 Laboratory and Animal Data

Results from laboratory analysis present a major part

of data in plant production (e. g. soil analysis) and

livestock farming. Milk recording and analysis is an

area, where mass data acquisition is quite common.

Using these data, the breeding value of single animals

can be determined by comparing to a large number

of individuals in the respective population. The algo-

rithms are complex and calculations are mostly done

as a service to farmers by computing centres run by

breeding associations. Data exchange for these ser-

vices is supported by a set of standards: ISO-17532

(ISOagriNet, (Internationals Standards Organisation,

2007)) provides the underlying protocol, whereas the

Agricultural Data Element Dictionary (ADED) pro-

vides the content. The data dictionary is separated

into a part providing data entities for exchange on the

national level and into another part providing inter-

nationally harmonized entities. There are two differ-

ent serializations for the format: ADIS/ADED and

XML/ADED. The protocol is record oriented and

tuned for bulk transfer of large amounts of data. As

such, it is not yet ready for easy integration into more

web-oriented data networks using standard internet

technologies. On-going work however tries to achieve

a more friendly serialization in the face of current de-

velopments in data exchange.

The problems faced by developers and users of

these standards currently mostly lie in the semantics.

The data dictionaries consist of a lot of entities and

items, and it is often difﬁcult to judge, if the respec-

tive use case has already been worked on by another

group and if there are already items available, which

could be used. Also occuring is the case, that the same

item is used with different meanings in different con-

texts. It is currently not quite clear, how to solve these

issues.

2.4 Business Data

Apart from the work in the ﬁeld or in the stable, the

farmer wants to generate an income for himself, his

workers and his family. In other words: he is involved

into economic activity and conducting business. Stan-

dards available in the ﬁnancial sector are numerous.

However, in a lot of cases, they have a very clear and

limited scope. The Home Banking Computer Inter-

face (HBCI), the newer FinTS, Interactive Financial

Exchange (IFX) or Open Financial Exchange (OFX)

for example allow for conducting ﬁnancial transac-

KEOD 2009 - International Conference on Knowledge Engineering and Ontology Development

288

tions using a Personal Computer and a suited appli-

cation program. The interface standards are probably

of no larger relevance to the knowledge management

on a farm, as messages are short lived and are mainly

used to change or request state of bank accounts and

depots.

Completely different in this regard is the eXtensi-

ble Business Reporting Language (XBRL). It is a typ-

ical XML application in the sense that it is document

oriented and allows linkage between different sec-

tions of documents using standard technologies like

XLink. The scope is limited to business reports and

its use cases are well deﬁned.

On the other hand, there are standards in the busi-

ness sector, which try to standardize on a syntax for

each and every use case one can think of. Within

thescope of UN/CEFACT standardization a number

of working groups are modelling use cases in all kinds

of domains, also in agriculture. The UN/CEFACT

XML Naming and Design Rules (Heilig et al., 2006)

derive in many ways from best practices recom-

mended by the W3C. It is very difﬁcult to reuse

UN/CEFACT schemas together with other XML vo-

cabularies or the other way round. This is mostly due

to the fact, that most of the commonly used extension

mechanisms (like e. g. inheritance by extension, any-

Types etc.) are forbidden in the speciﬁcation.

2.5 Supply Chain Data

As soon as primary production on the farm is ﬁn-

ished, the goods enter the food chain. There again,

another set of standards becomes important, the ones

to identify and describe products in a supply chain.

The EPC bar codes are probably known to a broader

public from the local discounters cashier scanners.

However, the organization behind them, GS-1 also

produces standards for information systems and ser-

vices revolving around the product identiﬁcation like

e. g. Electronic Product Code Information Services

(EPCIS, (EPCglobal Inc., 2007)). These standards

are also limited in scope, allowing for representation

of events and basic object data in the supply chain.

They are well suited to use cases in traceability of

agricultural goods, however difﬁculties to represent

events like distribution of larger amounts into smaller

batches (e. g. with bulk materials like cereals or with

carving up of animals after slaughter) or putting to-

gether ingredients to form another product prevent us-

age in some areas.

3 PROPOSED SERVICE

INFRASTRUCTURE

3.1 Requirements

In our experience, the most important requirement for

an information exchange infrastructure for agriculture

is simplicity. Most of the companies involved in pro-

duction of agricultural software are small and medium

sized enterprises. They can not afford to spend a lot of

resources on implementation of overly-complex pro-

tocols and standards. It is a major challenge to work

out the essential parts in standards and limit the num-

ber of degrees of freedom by providing a very generic

and simple set of speciﬁcations.

Another aspect e. g. in food traceability is (close

to) unlimited scalability. A lot of partners are involved

in the food chain from farm to fork. It has to be easy

to add further systems and data models to an infras-

tructure.

Data security is also a major demand from the

agricultural community, at least in Europe. Data on

agricultural processes are treated as business secrets.

This is understandable from the viewpoint that the

buyer of agricultural products might use this informa-

tion to control prices to be paid to the farmer.

3.2 Possible Architecture

During a research project we were able to develop

a prototype fulﬁlling a large part of the above men-

tioned requirements.

The design of the system is based on a web-

oriented architecture (Jacobs and Walsh, 2004). It re-

lies on the usage of technology components standard-

ized by the W3C. Key concepts are globally unique

identiﬁcation by URIs, web linking methods, XML

as format for the content and a simple protocol, re-

stricted to a few method invocations - currently the

Hypertext Transfer Protocol.

The content and document types delivered are

provided by agroXML. It offers the necessary ele-

ments and datatypes to be able to represent agricul-

tural issues in XML documents. agroXML is deﬁned

by a set of XML Schemas and content lists. They

are available at http://www.agroxml.de/schema/ and

http://www.agroxml.de/content/ respectively.

Unique identiﬁcation of resources is provided by

URIs. We have used the subset commonly known as

URLs, as the mechanism to dereference them is sim-

ple and standardized. To link documents in a service,

XLink (DeRose et al., 2001) is used.

Putting everything together, a ReSTful web ser-

vice is built. The term ReST is an acronym for Repre-

DATA INFRASTRUCTURES IN AGRICULTURE - Attempts at Interoperability

289

sentational State Transfer and has been introduced by

Roy Thomas Fielding (Fielding, 2000). It is based on

the assumption that with a few simple operations to

read and write data and a system changing its state

depending on the operations issued, any use cases

in communication can be represented. Variations of

this concept are a basic thread present in information

technology history up to now. The Turing machine

(Turing, 1936) already relied on this simple principle.

The SQL language common in database systems with

its INSERT, SELECT, UPDATE and DELETE oper-

ations is built on this pattern. One of the principles in

early UNIX system development was ”everything is

a ﬁle”, thus allowing for manipulation of devices and

compute resources using simple ﬁle operations like

open, close, create, read and write. Later on, Kilov

coined the term Create-Read-Update-Delete-pattern

(CRUD, (Kilov, 1990)). The currently widespread

Hypertext Transfer Protocol (HTTP, (Fielding et al.,

1999)) is built around this assumption as well.

Developing a ReSTful web service involves dis-

tributing state and functionalities of a service across

a set of network objects. Objects are manipulated us-

ing only the small set of basic operations provided by

HTTP. This is in contrast to services using an RPC-

paradigm, where a single network object offers a large

number of method invocations.

For demonstration purposes, a prototype around

the following use case has been developed:

• For a group of animals the fattening process is

over.

• The farmer puts together a batch for transport to

the slaughterhouse.

• Data about the animals is readily available.

• Data about the location, where the batch is built,

is readily available.

• The task is to summarize the information about

this batch and present it in a machine-readable

form on the network.

The modules available in agroXML allowed for

building small, self-contained documents able to rep-

resent single objects involved as resources on the web.

For pigs, data concerning sex, eartag and events

related to the animal like weighing or feeding were

laid out in XML instances.

For the prototype, we could rely on unique iden-

tiﬁcation of single animals. Nevertheless, it is also

possible to address groups of animals, if there are no

unique identiﬁers available. As a drawback, in the

latter case no data concerning a single animal can be

given.

It is also possible to represent basic information

about the farm in agroXML instances. The objects

Farm and Pig become resources on the web by as-

signing URLs to them. To be able to build a batch,

the objects of which it is comprised must be linked in.

This is achieved by allowing XLinks from the batch

XML instance to point to the single animals contained

within and to the farm. The modelling is generic, so

that other objects might be batched as well by just

linking to them.

In total, the following URL-structure is used for

the service:

farm data: http://example.com/farms/*

animal data: http://example.com/animals/*

batch data: http://example.com/charges/*

While in the example, all network objects in the

service are available on a single domain, the Xlinks

can in principle point anywhere. It is possible to in-

tegrate data offered on other servers of external infor-

mation providers into local applications.

The application shown could also have been built

using message-oriented remote procedure calls using

e. g. SOAP. For this to work, methods for adding pigs

to a batch and for retrieving batch data would have

to have been deﬁned. However, as SOAP messages

are short-lived and as there is no standardized way to

reference objects it is very difﬁcult to add further lay-

ers like e. g. a set of RDF (Ressource Description

Framework, (Klyne and Carroll, 2004)) statements to

relate resources to each other. In contrast, resources

in a ReSTful webservice are persistent from a clients

point of view. This can be the basis to build e. g.

a RDF triple store annotating resources with further

metadata or describing relationships between objects.

As URLs are the principle of identifying objects in

RDF as well as in ReSTful web services, the tech-

nologies play together quite well.

A further disadvantage of the message-oriented

approach using technologies like SOAP is, that the

method calls offered by the server and their parame-

ters have to be known to the client in advance. While

this is no problem as long as there is only a lim-

ited, strictly controlled set of services available (like

e. g. internal to an organisation), it leads to severe

difﬁculties if an unlimited set of applications is to be

added. While technologies like UDDI theoretically

could provide the necessary object and method publi-

cation functionality, in networks with a large diversity

of system environments dynamic binding of clients

during runtime is currently practically infeasible. In

contrast, in a ReSTful service network, a client can

navigate through an inﬁnitely large set of resources

without requiring detailed knowledge of the complete

set.

Tools to implement services like the one shown

KEOD 2009 - International Conference on Knowledge Engineering and Ontology Development

290

are available for a variety of hardware platforms from

energy efﬁcient mobile devices up to powerful server

machines and for almost every programming environ-

ment. Basically, only an implementation of HTTP

and an XML Parser is required. The implementation

is even possible with reasonable effort in lower level

programming languages e. g. as a server module to

the Apache webserver in C. Thus the architecture of-

fers the necessary simplicity required to work in en-

vironments like the farm, where the IT infrastructure

is not as sophisticated as in large enterprises. An ad-

ditional bonus comes from the fact, that ReSTful ser-

vices are - in contrast to services using RPC coded

in SOAP - cacheable, which allows them to be used

across high latency links with proxies and caches in

between client and server as often seen in rural areas

or developing countries.

4 USAGE OF ONTOLOGIES

4.1 Rule based Systems to Express

Legal Obligations

A rule based systems is used as a way to store knowl-

edge and to interpret information. Rules deﬁne IF-

THEN-terms and form a rule base. An inference en-

gine draws conclusions based on the rules together

with other information – the facts. That means, if a

fact is given, we can deduce one or various other facts.

New facts could be interpreted by the rules in a second

cycle. The system has to know when all stop condi-

tions are complied and the result is calculated. Facts

are information needed by the system to draw conclu-

sions. These could be data collected from databases,

user-ﬁlled forms or external resources.

An example of such an engine currently in the

works looks upon the use case of a farmer wanting

to ﬁll out a form to claim for crop subsidies. A rule

system, prepared for this procedure, could analyse all

information and help him to complete the form cor-

rectly. If he grows a crop that is supported by a spe-

cial programm, the system is able to interpret his in-

formation and point out the possibility to request this

support too.

The challenge with regard to creating these rule

sets currently are difﬁculties in transferring informa-

tion available on the farm – mostly derived as docu-

mentation of practical work processes – into a rep-

resentation according to the conceptual framework

given in the form. While there are for example code-

sets for crops available to facilitate handling by infor-

mation technology in the administration, they achieve

the opposite for the farm management information

system, in that they mangle and merge existing, sepa-

rate concepts like species, usage and receiver of a crop

or their superclasses into a single coded item. That

makes content mapping a very difﬁcult and cumber-

some process often requiring human interaction.

In the future, a sophistcated rule system might go

a step further too. It can help to take decisions e. g.

what kind of fertilizer a farmer might need for a cer-

tain variety under certain weather conditions. Based

on the kowledge of an expert, the system knows the

best answer. The application chooses all conditions

– IF-part of a rule – concerning his case und makes

the right conclusion provided all important facts are

known and the rule base is large enough.

The rules should – if posssible – be expressed in

an almost natural language. The idea behind this is

that a domain expert should be able to create rules

in a language which a computer understands with-

out knowing programming languages. This makes the

system not only user friendly, but also ﬂexible.

In such a system, the algorithms don’t depend on

the implementation or programmed code. With a rule

managing system, a domain expert can easily delete

or change old rules and add new rules to the applica-

tion.

Such a rule based system is comprised of four

main components:

1. a inference engine that interpretes a broad set of

records.

2. a sufﬁciently large knowledge base to achieve sat-

isfying results.

3. a system that aggregates information and trans-

lates them in a form usable in the inference engine

4. a user interface for creating requests and adminis-

tering the rules

Such a system can only be established based on

networking and an exchange of standarized informa-

tion, such that different islands of technologies an

controlled vocabularies merge together.

5 CONCLUSIONS

All in all, agriculture provides an interesting play-

ground for knowledge engineering methods and tech-

nologies. As agriculture has to retrieve information

from a variety of sources in different domains, inter-

operability of standards is a major issue. Currently

missing is an approach at extensibility and the doc-

umentation and implementation of best practices for

and by standards developers satisfying the demands

DATA INFRASTRUCTURES IN AGRICULTURE - Attempts at Interoperability

291

of data exchange according to the world wide web’s

paradigms. Clariﬁcations of semantics and methods

to handle small differences in semantics on transfor-

mation from one standard to another correctly are also

required in the future.

ReSTful web services provide the simplicity and

scalability required in future exchange of agricultural

data. They can be seen as a distributed dataset. A

system drawing logical conclusions from this dataset

comparable to datamining applications on relational

database systems could be built. Using technologies

like the web ontology language (OWL, (McGuinness

and van Harmelen, 2004)), rulesets could be loaded

on-demand by client applications for different pur-

poses. Expert systems providing agricultural exten-

sion services might proﬁt from such an approach.

REFERENCES

Cox, S., Daisey, P., Lake, R., Portele, C., and Whiteside,

A. (2004). OpenGIS Geography Markup Language

(GML) Implementation Speciﬁcation. Open GIS Con-

sortium, Inc.

DeRose, S., Maler, E., and Orchard, D. (2001). XML Link-

ing Language (XLink) Version 1.0. World Wide Web

Consortium. http://www.w3.org/TR/xlink/.

EPCglobal Inc. (2007). EPC Information Services (EPCIS)

Version 1.0.1 Speciﬁcation.

Fielding, R. T. (2000). Architectural Styles and the Design

of Network-based Software Architectures. PhD thesis,

University of California, Irvine.

Fielding, R. T., Gettys, J., Mogul, J., Frystyk, H., Masinter,

L., Leach, P., and Berners-Lee, T. (1999). RFC2616:

Hypertext Transfer Protocol – HTTP 1.1.

Heilig, P., Stuhec, G., Pemberton, M., and Minakawa, G.

(2006). XML Naming and Design Rules, Version 2.0

of 17 February 2006. UN/CEFACT.

Internationals Standards Organisation (2007). Stationary

equipment for agriculture – Data communications

network for livestock farming.

Jacobs, I. and Walsh, N. (2004). Architecture of the World

Wide Web, Volume One. World Wide Web Consor-

tium. http://www.w3.org/TR/webarch/.

Kilov, H. (1990). From semantic to object-oriented data

modeling. In Proceedings of the First International

Conference on Systems Integration.

Klyne, G. and Carroll, J. J. (2004). Resource De-

scription Framework (RDF): Concepts and Ab-

stract Syntax. World Wide Web Consortium.

http://www.w3.org/TR/rdf-concepts/.

McGuinness, D. L. and van Harmelen, F. (2004). OWL

Web Ontology Language: Overview. World Wide Web

Consortium. http://www.w3.org/TR/owl-features/.

Turing, A. M. (1936). On computable numbers, with an

application to the entscheidungsproblem. In Proceed-

ings of the London Mathematical Society, volume 2

42.

KEOD 2009 - International Conference on Knowledge Engineering and Ontology Development

292