Software for Data and Knowledge Management

in Winemaking Fermentations

Pascal Neveu

, Virginie Rossard

, Anne Tireau

, Evelyne Aguera

, Marc Perez

Christian Picou

and Jean-Marie Sablayrolles

INRA/SupAgro, UMR 729 MISTEA, F-34060 Montpellier, France

INRA/SupAgro, UR 50 LBE, F-11100 Narbonne, France

INRA/SupAgro, UE 999 Pech-Rouge, F-11430 Gruissan, France

INRA/SupAgro, UMR 1083 SPO, F-34060 Montpellier, France

Keywords:

Knowledge Management, Ontology, Fermentation, Winemaking.

Abstract:

An increasing amount of data is generated by the on-line monitoring of biotechnological processes. Classical

data management solutions, which have proved effective in many application domains, are not efﬁcient at deal-

ing with scientiﬁc data in life science. We describe a management software of data from wine fermentations

and associated knowledge. The information dealt with in this framework relates to the knowledge of real-time

events occurring during fermentations. The data have been entered into a database and we propose an organi-

sation of this knowledge to improve efﬁciency, based on the use of methods and tools from the Semantic Web.

A speciﬁc ontology of events (faults or enological operations) is used to automatically identify wrong on-line

measurements which clearly improved data quality and understanding.

1 INTRODUCTION

Alcoholic fermentation, a key stage in winemaking,

is likely to change considerably in the future. Indeed

winemakers are increasingly of the opinion that tools

for controlling fermentation according to the type of

wine desired is becoming essential and the use of sen-

sors, for real time monitoring, is becoming increas-

ingly feasible.

Instrumentation can be used to automate the mon-

itoring and allows the development of Information

Systems suitable for the storage of data, kinetics and

all types of information and facilitating the use of

this knowledge to develop more ambitious treatments.

Progressive semantic annotation of data (Hignette

et al., 2007) and source descriptions are of particu-

lar interest for food processes (Sall et al., 2009). Such

tools should prove very powerful in the long term, not

only for reconstructing a history of events (traceabil-

ity) but also providing extensive information to im-

prove management practices (decision support).

This paper deals with a ﬁrst step in this direction,

focusing in particular on the validation of measure-

ments, taking into account knowledge about faults

and operations of particular interest in winemaking.

It is original in that (i) it makes use of a database of

results from the on-line monitoring around 4500 fer-

mentations and (ii) it implements techniques (based

on ontologies and reasoning) novel for this type of

application. This study was carried out in the context

of a European project, CAFE (Computer-aided food

processes for control engineering). All of the devel-

opments are implemented in an Information System

called Alﬁs

2 MATERIALS AND METHODS

2.1 Fermentations

Very few industrial-sized tanks are currently equipped

with an on-line fermentation monitoring system and

there is therefore no corresponding database. How-

ever, we have many data acquired from pilot studies,

or laboratory-scale studies. Indeed, fermentations are

monitored by automatic measurements of CO

release

and the temperature of the fermenters is controlled.

In pilot scale, CO

output was measured with a mass

ﬂowmeter and the temperature of the tanks is con-

trolled by opening or closing solenoid valves in either

http://alﬁs.supagro.inra.fr/

369

Neveu P., Rossard V., Tireau A., Aguera E., Perez M., Picou C. and Sablayrolles J..

Software for Data and Knowledge Management in Winemaking Fermentations.

DOI: 10.5220/0004138803690372

In Proceedings of the International Conference on Knowledge Engineering and Ontology Development (KEOD-2012), pages 369-372

ISBN: 978-989-8565-30-3

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

a hot water circuit or a circuit connected to a refriger-

ation unit.

production and temperature were monitored

on-line, with control and supervision software writ-

ten in Labview language. This software can be used

to add symbolic data and metadata describing eno-

logical operations such as the addition of nutrients or

cap punching, observations and faults. Factors other

than CO

and temperature were monitored manually,

off-line and the data obtained were also input into the

database. Complex data were recorded such as the

principal by-products of fermentation, as determined

by chromatographies (liquid or gas), histograms of

the distribution of yeast populations into size classes.

2.2 The Information System

Each site is characterised in terms of its methods of

measurement, equipment, organisation, etc. The In-

formation System combines the acquisition of (i) on-

line data from sensors controlled by supervision soft-

ware, (ii) off-line data from biological analyses of

fermentation by-products (see 2.1) and (iii) symbolic

data (expert opinions, operations, faults, etc.). These

data and metadata are formalised in XML and or-

ganised into a generic form. It provide a ﬂexible,

generic method for managing heterogeneous, multi-

source data.

Data from the various sites are available from

the Web via a HTTP server (Apache) and a DBMS

(MySQL). The database contains the measurements

and various data. However, knowledge, such as pro-

fessional know-how, cannot easily be used with a re-

lational DBMS. Indeed we needed (i) searches re-

quiring reasoning such as subsumption (ii) providing

active hypertext links and semantic (e.g. to identify

the speciﬁc fermentations used in an article and vice

versa), and (iii) dynamic management of the data and

knowledge relating to professional know-how (con-

sistency sequence of events).

We used ontologies to formalise, make use of pro-

fessional know how and for the ”intelligent” manage-

ment of all the faults, operations and decisions com-

ing into play during fermentation. These ontologies

are written in Ontology Web Language (OWL) and

semantic annotations with the Resource Description

Framework (RDF) syntax. These ontologies were de-

signed with the ”Ontology Development 101” (Noy

and McGuinness, 2001) method. In our software ap-

plications, the ontologies and the RDF annotations are

stored in the database then exploited by the semantic

engine CORESE (Corby et al., 2004) which enables

to infer and process SPARQL queries on RDF annota-

tions and OWL ontologies. It also provides the prop-

erty paths functionality. The Jena API is used for the

RDF annotation management.

Data and Treatment. The incorporation of data

and knowledge into databases makes it possible to

standardise access to this information. We have de-

veloped applications in PHP, JAVA and R languages,

through the R-ODBC package. Web applications gen-

erate a lot of scientiﬁc graphs (Highcharts library or

by R) that often need be enriched. Statistical treat-

ments, written in R, can access ad hoc data via a sim-

ple SQL request and are controlled by JAVA modules

using the ontologies. The use of Semantic Web tools

and methods makes it possible to separate knowledge

from the control mechanisms of the computer lan-

guages routinely used. Classical approaches could be

used to write an application, but it would be difﬁcult

to control the complexity of the effects of successive

changes.

3 FERMENTATION KINETICS

In the database, fermentation kinetics is expressed as

the rate of CO

production (dCO

/dt). Several fac-

tors have a signiﬁcant effect on fermentation kinet-

ics and hence on the quality of wines. This knowl-

edge had been formalized in OWL Lite and this makes

possible data cleaning and analysis (curve clustering,

data mining, etc) on a large number of kinetics.

Effect of Enological Operations. Various opera-

tions may signiﬁcantly modify the shape of the curves

of CO

production rate. Nitrogen supplements di-

ammonium phosphate (DAP) in most cases are rou-

tinely added to cause a rapid and sustained increase in

dCO

/dt. These additions are usually combined with

oxygenation, which entails opening of the tank for a

few minutes. During red wine making, the solid mat-

ter rises to the surface and forms a cap. This cap is

immersed several times during the course of fermen-

tation. It increases dCO

/dt, following the resuspen-

sion of yeasts trapped in the solid matter. Other less

common processes are carried out during the produc-

tion of particular types of wine. One such process

included in this database is the partial reduction of

alcohol content (Aguera et al., 2010) by vacuum dis-

tillation or CO

stripping.

Anomalous Points and Malfunctions. The graphs

of the CO

production rate may include anomalous

points that do not correspond to the true rate of

KEOD2012-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment

370

Figure 1: Effect of enological operations or faults. 1: com-

bined addition of oxygen and di-ammonium phosphate, 2:

cap punching, 3: breakdown.

production (Figure 1). Such points may re-

sult from the momentary opening of the tank dur-

ing an enological operation, decreasing dCO

/dt or

completely stopping CO

release. Other operations,

such as sampling, may also disrupt dCO

/dt mea-

surements. Anomalous measurements may also be

obtained for temperature, generally coinciding with

anomalous dCO

/dt readings.

The occasional anomalous results must be distin-

guished from disruptions caused by faults and fail-

ures, which may have several causes sensor or tem-

perature control system for example and which may

affect only one tank or the whole installation. These

faults or failures may signiﬁcantly affect the monitor-

ing of fermentation (e.g. ﬂowmeter fault), its progress

(e.g. bad temperature control) or both. Detection is

essential. Certain malfunctions can be detected only

with numerical data, whereas the detection of others

requires symbolic data or metadata.

4 RESULTS AND DISCUSSION

Database and Ontologies. A database was created

and was improved by the design of two highly tar-

geted ontologies. Then, applications were devel-

oped or deﬁned for data validation and construc-

tion of complex queries over winemaking fermen-

tations. The database provides access to measure-

ments taken on-line, uni- and multidimensional mea-

surements taken off-line, symbolic data and meta-

data. These data are organised according to ontolo-

gies of operations and faults (Figure 2). Since its ini-

tial establishment in 2004, a complete inventory of

this database has been carried out. It contains about

4500 fermentations, 52 of which were found to have

been stored incorrectly. Minor manual corrections

were made to render the data for these fermentations

accessible. No information was lost.

The number, type and nature of events occurring

during the course of a fermentation generate a com-

Figure 2: Extract of the ontology of local faults.

Figure 3: Example of semantic inference to infer new rela-

tions (in bold) between concepts from primer acquisitions.

bined analysis that is difﬁcult for applications to man-

age. The ontologies concern the rationale underly-

ing the hierarchy used for complex queries and the

validation of on-line measurements. Here ontologies

supply real beneﬁsts. Indeed, the integration of new

knowledge or knowledge management is easier than

in classical applications and software modules are less

sensitive to technical evolutions and modiﬁcations of

experiments. In this case, ontologies provide an ef-

ﬁcient access and understanding of data for different

end-users and to formalize technical itinerary. For ex-

ample, by using subsumption, it is possible to ask for

all general faults (Figure 2) without the user knowing

all the different types. This set-up makes it possible

to reply to queries such as ”which fermentations have

one alcohol content reduction process and no control

system faults?”, without knowing all the types of con-

trol system fault or the methods of alcohol content

reduction. Using transitive property such as ”must-

BePreceed” has improved consistency and complete-

ness of the information about enological operations

(Figure 3). The application thus has the know-how

required to deal with complex queries of this type.

The perfection of this knowledge was an important

outcome of our approach.

Software Application for Measurement Validation.

One of the major concerns of users is that only data

providing an accurate picture of the state of the sys-

tem should be stored. In our case, measurements for

temperature and gas release may be disrupted by eno-

logical processes as well as faults. Moreover, the ﬂow

SoftwareforDataandKnowledgeManagementinWinemakingFermentations

371

rate is affected differently by opening the tank or a

ﬂowmeter fault. Using ontologies and the R language,

we have been able to go beyond the limits of the sys-

tems generally used.

The component developed uses an ontology for

symbolic data and the R language to ﬁt the curves on

measurements using a non parametric method (spline

estimator) under constraints. The curve shape depend

on the event sequence. To ensure a good ﬁt, the num-

ber of nodes and the constraints are generated by us-

ing the event ontology and a new dedicated one on-

tology for impacts. This ontology of application was

designed to classify impacts of events on kinetics and

to be able to automatically generate a speciﬁc R treat-

ment. So, this component uses the ﬁtted curve, the

type of event (operation or fault) and comments from

experts (Neveu et al., 2008) to validate or invalidate

the measurements. To quantify the efﬁciency of the

application, we analysed 20 fermentations with many

operations and malfunctions. A detailed analysis of

the dCO

/dt curves by experts pointed out 242 wrong

or litigious measurements (among more than 20 000).

Using the application, we found very similar results,

with only 21 misdiagnoses, i.e. less than 9% differ-

ence. This result indicates a very good rate of data

invalidation and should prove a good behavior of our

ontologies.

Perspectives: Determination of the Usefulness of

Fermentations. Another concern of users is the

conﬁdence they can have in the fermentation and

whether the data can be used in other contexts for an-

alytical and diagnostic purposes. Certain faults (de-

fault of temperature regulation, for example) happen

at key moments and are therefore disastrous, prevent-

ing the subsequent use of the fermentation data. A de-

scription of this knowledge as a function of the type

and time of the fault is essential, to enable the in-

formation system to determine whether a data set is

valid. This requires taking the investigation further

and seeing whether the whole data set can be inter-

preted.

5 CONCLUSIONS

Advances in alcoholic fermentation monitoring and

control necessitate new tools for data management.

Databases need to be associated with tools for the

representation and management of knowledge. This

study shows an example of a reliable and user-

friendly application based on ontologies, with the

aid of Semantic Web tools and technologies (JENA,

CORESE, PROTEGE, etc.). These languages and

softwares, such as RDF(S) and OWL provide reason-

ing and they are of great interest for the management

of complex data.

The applications developed automatically detect

the inconsistencies caused by the principal malfunc-

tions during the fermentations and distinguishes be-

tween those of little importance and those that might

disrupt the process. Disruptions related to enological

operations (such as cap punching, pumping over) are

also detected and analysed. The application was vali-

dated using a database containing about 4500 fermen-

tations. It might represent a powerful tool to improve

fermentation management and control in the near fu-

ture.

The method of formalization that separates the

expert knowledge allows to take into account eas-

ily changing business practices. It allows to easily

adapted the developed software for other cases in food

process (cheese making, brewinb, etc).

REFERENCES

Aguera, E., Bes, M., Roy, A., Camarasa, C., and Sablay-

rolles, J.-M. (2010). Partial removal of ethanol during

fermentation to obtain reduced-alcohol wines. Ameri-

can Journal of Enology and Viticulture, 61 (1):53–60.

Corby, O., Dieng-Kuntz, R., and Faron-Zucker, C. (2004).

Querying the semantic web with corese search engine.

Proceedings of European Conference on Artiﬁcial In-

telligence (ECAI) Valencia, Spain, pages 705–709.

Hignette, G., Buche, P., Dibie-Barth

elemy, J., and Haem-

merl

e, O. (2007). Semantic annotation of data tables

using a domain ontology. In Discovery Science, pages

253–258. Springer.

Neveu, P., Rossard, V., Aguera, E., Perez, M., Picou, C.,

and Sablayrolles, J.-M. (2008). Gestion de donn

ees

et de connaissances pour les bioproc

es. Extrac-

tion et Gestion des Connaissances, Atelier : fouille

de donn

ees temporelles:21–28.

Noy, N. and McGuinness, D. (2001). Ontology develop-

ment 101. Knowledge Systems Laboratory, Stanford

University.

Sall, O., Lo, M., Gandon, F., Niang, C., and Diop, I. (2009).

Using xml data integration and ontology reuse to share

agricultural data. International Journal of Metadata,

Semantics and Ontologies, 4:93–105.

KEOD2012-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment

372