Datapipe: A Configurable Oil & Gas Automated Data Processor

Florent Bourgeois 1,2 and Pierre Arlaud 1

1 Actimage GmbH, Hafenstraße 3, 77694 Kehl, Germany
2 Université de Haute Alsace, MIPS EA 2332, 12 rue des Frères Lumière, F-68093 Mulhouse, France
{florent.bourgeois, pierre.arlaud}@actimage.com, florent.bourgeois@uha.fr
Abstract. Exploration and Production companies need to know where the oil and gas reservoirs are, how much they hold, and whether they can profitably produce oil and gas. Data collection, management and analysis are therefore central to the industry. As in most application areas, raw data are processed, implying several tools and expert interactions. Nevertheless, the data processes of the oil and gas sector involve data of unusual scale, multimodal and long-lived, alongside complex analysis. DataPipe is a research project funded by the Eurostars Programme of the European Commission whose purpose is to develop a platform, toolkit and pipeline for the intelligent, rule-based selection, management, analysis, publishing and display of heterogeneous multimodal data in the oil and gas sector. This paper describes Actinote 4.0, a flexible web-based platform developed to respond to the specified Datapipe context and dedicated to the creation of domain-specific process assistant applications that are certified by expert systems.
1 Introduction
Oil and gas (O&G) are primary resources worldwide. Exploration companies perform seismic surveys to locate reservoirs and interpret the physical properties of the rock. These surveys generate data which then have to be analysed in order to determine the profitability of a reservoir. They can be complemented by aerial or satellite photography and gravitational measurements. In addition, test boreholes can be drilled to provide positional, radioactivity, temperature, porosity, resistivity and other measurements that enhance the geological model. Surveys have been carried out for decades, over which the technology has evolved; the retrieved data have changed accordingly, and so has their storage. Improvements in analytic and drilling techniques as well as shifts in the global economy [1–4] can change decisions, so survey data have a long shelf life. A deposit that was once uneconomic may require a new analysis and interpretation thirty years later. There are problems of maintaining, tracking and accessing very large data volumes for decades, finding and mining old data from different storages, and reading and interpreting different media formats and file types. The data are thus multimodal, long-lived and difficult to manage. The expected valuable lifetime of seismic data ranges from four to twenty years - up to the point where newer seismic technologies make it cheaper to re-survey
than to reprocess the data. As a result, data of many different types, formats and locations have to be found, managed and processed.
Over the past 20 years, geologists have built well databases, geophysicists have developed ways to handle seismic data and reservoir engineers have managed production data. There has been no common approach, and the separate initiatives have created solutions with dedicated tools and procedures. Nevertheless, the current trend is to move from unstructured collections of physical and digital data toward structured sets of digital exploration, drilling and production data.
Big data and High Performance Analysis are now terms commonly employed to refer to O&G data management in industry white papers [5, 6] and scientific communities [7]. This is due to the massive amount of data involved, the complexity of the algorithms employed and the variety of data formats.
Within this context, Actimage, Dalim Software GmbH, Ovation Data Services Inc. and Root6 Ltd initiated a collaboration around the DataPipe project, which has been selected and funded under the European programme EUROSTARS. DataPipe aims to propose a solution for easily creating automated data processors in the O&G domain that can answer these different prerequisites. The project goal is to develop a platform, toolkit and pipeline for the intelligent, rule-based selection, management, analysis, publishing and display of heterogeneous multimodal data in the oil and gas sector. It will create a flexible system to provide web-based visualisation and decision support based on the analysis of extremely large datasets. The platform will be extensible to big data mining, analysis and display in a wide range of industrial sectors.
1.1 Project Goals
DataPipe will create a new approach to multimodal data management, data mining and
presentation, based on process modelling and metadata-based process automation. A
new methodology has to be implemented, in order to manage the enormous data vol-
umes, the range of asset types and the processes applied to them in the oil and gas
exploration sector. Based on research advances in a number of domains, the DataPipe
platform comprises the following elements:
– Intelligent data workflow tools to control the selection and flow of data from multiple sources, their processing and publication on multiple platforms.
– A data selection and management framework, which will deal with the connection to multiple data stores across different APIs and ETL systems.
– A DataAgent processing platform, to connect to archiving systems and tape robots, with a Hierarchical Storage Management (HSM) system to bring files back from storage just in time for processing.
– Access control and security systems to protect the integrity of data from unauthorized personnel or attacks on the data store.
– A cross-platform information display system to receive information from the framework and instructions from the job ticket, to tailor the publication.
Implementing all of these goals is unsustainable for any single project partner: they require expertise and research specific to too many different domains. This is why each firm in the project handles a specific part according to its affinities with these themes.
1.2 Collaborating Partners
As a firm offering safe and reliable storage systems, Ovation Data Services Inc. has specialised in data management for the full life cycle of exploration data in the exploration and production industry. Their services cover all forms of seismic and well log data transcription and migration, plus format conversion, recovery, remediation and de-multiplexing. Their role in the project is to define the domain-specific expertise, the existing processes and their results, and to provide expert feedback. Alongside this, they have to provide web services giving access to their services in order to handle the data at all its stages (acquisition, processing, storage, archive, deep storage, destruction) and to specify their access requirements.
Root6 Ltd ContentAgent systems address workflow problems in the (digital) film, video and advertising industries. They are designed to support multi-format video media encoding, faster-than-real-time transcoding and streaming transcoding. Their expertise in multi-format video transcoding processes, automated and parallelised through the creation of complex configurable workflows, is currently limited to local processing. Through this project, they aim to extend the system to Cloud-based operations and adapt it to the multiple data formats found in the O&G domain, providing faster and automated data processing based on parallel computing agents.
Dalim Software GmbH specialises in software systems implementing JDF for print, in which XML elements describe all the production processes and material types, regardless of the individual workflow. Their solution enables the creation of processes combining user steps (approval), technical steps and stage steps (milestones), with relevant results stored in databases accessible anywhere through a web interface based on the user's identified role. They bring to the project their expertise in process designs combining automated steps and user interactions, enabling complex processes that require human supervision and decisions.
Actimage has been involved in many complex online projects involving cloud-based architecture and has wide experience of combining different mobile technologies. The existing Actinote system is designed to gather and deliver user- and context-centric information for mobile professionals. The system employs a scalable Cloud architecture interacting with multiple data sources (ERP, PIM, etc.) and a blackboard multi-agent system to handle data and provide security, delivery and publishing of user-related content. Our systems provide solid expertise for the delivery, security, multi-platform and intelligent interface aspects of DataPipe. Based on this, plus our expertise in mobile and recent web technology applications and requirement-based solutions for professionals, our role in the project is to implement user interfaces for the creation of workflows coherent with the specific application domain that the O&G context represents, and mobile device applications for the execution and supervision of decision-support workflows, with means of publishing and displaying the information in formats tailored to the users' needs.
This paper focuses on Actinote 4.0, the solution Actimage developed in the defined context. First it details the goals and requirements specific to the development of a platform for the creation of domain-specific workflow applications. It then presents the Model Driven Engineering (MDE) approach the platform is founded on. This is followed by a presentation of the implementation choices made to integrate the
method into the Actinote 4.0 platform, and a demonstration of the extensions required to implement a reduced O&G data management domain as a specific case to evaluate the solution. Finally, the paper concludes with a review of the project results and perspectives for further work.
2 The Solution Specification
The previous section highlighted the complexity of O&G data management and the need for a solution to standardise its processing, which includes complex and time-consuming computer-assisted activities alongside a few user decisions. This section details the requirements the solution must fulfil from a stakeholder point of view, and then explains how such a domain-specific solution can be abstracted into a multi-domain data processing solution addressing a wider range of applications.
2.1 Requirement Analysis
In order to propose such a solution, it is important to identify the different actors that are likely to interact over the lifetime of the generated applications. We identified four main actors in this context.
Software Development Entity. Actimage, as a software development entity, aims to propose a solution responding to the domain needs. This implies the generation of applications able to handle O&G data by retrieving them from their multiple identified storages and processing them according to data management expertise, in order to then apply the specific analysis processes of O&G experts. These processes then result in bankable data that have to be reported and stored for later use.
Considering the number of experts, processes, data storages and data formats, it is not possible to create one application able to support every process. On the other hand, it is unthinkable to launch a specific development project for every process: it would lead to an unlimited number of projects in order to propose to each user the specific process they want to execute with their specific requirements concerning data storages and expected results. It is then obvious that Actimage requires a solution built around an application editor, based on software engineering, that enables fast and easy development of data processing applications. Thus Actimage will be able to propose software engineering solutions based on the specific activities of users instead of constraining them to a rigid generic process.
End-users. The end-user brings to the application creation process his expertise in his specific domain of activity, his habits and his expectations on the look and feel of the application. The application process must assist the user in his task and thus support his common activities rather than burden them. It is then important to give the software architect tools to answer the user's needs. It is important to note that the end-user's expertise implies a specific business vocabulary; hence a gap between the software architect, who manipulates software artefacts, and the end-user's language naturally
arises. The solution therefore has to reduce this semantic gap by proposing to the software architect artefacts corresponding to the end-user's vocabulary. The end-user describes a data management process. This description contains the specific data he wants to use, which implies their format and location, the different algorithms to apply, the decisions to take according to the data, and the expected output and output format. Such a data processing assistant becomes pertinent on smartphones if the processing can be offloaded to a distant, powerful network of computers and if the application is able to display the pertinent results, enable user decisions and assign tasks to other users. Moreover, if the application offers ways to assign tasks to other users, it becomes possible to create processes in which several end-users interact to combine different kinds of expertise on the data. This requires the solution to handle device-to-device communication for the assignment and data sharing activities.
Data Management and O&G Specialists. It is important to note that a process, if not correctly described, can turn out to be inapplicable in the domain. Indeed, the end-user is an expert in the analysis of specific data. But we mentioned in the previous section that before the data are in the analysis format, they have to be located and retrieved from several data storages, homogenised and made accessible. Due to the disparity of current solutions, O&G experts are used to handling such processes, but it is a burdensome task without any of their expertise's added value. On top of this, the algorithms of O&G expert analysis processes also imply a set of rules and conditions that only specialists know. Besides, the specification of a process can also entail errors.
Therefore, the activities composing a full data management process present constraints such as a specific order (archiving data must be the last activity) or a specific data type as input (only unarchived data are convertible). These domain-specific activities being at the centre of the processes, it is mandatory that the solution proposes mechanisms to handle them correctly. The specific domain knowledge must be embedded in the application creation process in order to assert that the generated processes are possible and manageable considering the limitations of the system and the science of data management and analysis. Consequently, an analysis of the domain semantics must be performed on the process.
Data management and O&G specialists are thus required to define the domain semantics and the set of rules to apply to the system. Their role is the description of a prescriptive framework to be strictly observed. Since the different applications can carry out several types of analysis, depending on their target expertise, it is of prime interest to propose the creation of rule libraries. Each library embeds the tools of a specific expertise domain, so that the elements of that expertise are available during the creation of the required process while unnecessary elements are not presented.
Architect. At this point, the Actimage solution enables the creation of correct applications in the domain of O&G data management and analysis through the use of an application editor. The person in charge of the application creation is called the architect. The architect has to follow the end-user's application description in order to generate an application able to assist him and automate his activities.
The architect goes through two activities. First, he engineers his editor environment, choosing from the set of specialist libraries and importing the necessary ones. This enables the use of specific domain tools according to the end-user's domain and needs. He then performs a process engineering activity, creating the application within the editor according to the description given by the end-user. The solution therefore has to propose tools to compose processes combining end-user-centric and fully automatic steps. Both must present user interfaces, respectively one to enable user interaction and one for supervision, enabling the end-user to oversee the evolution of automatic steps and anticipate upcoming steps.
The interactions between the different actors described above are illustrated in Figure 1.
Fig. 1. Solution actors interactions.
To summarise, the solution has to address the following requirements:
– Definition of a complex process composed of automated and user-centric activities.
– Execution of processes on mobile platforms.
– O&G-based process creation assistance and validation.
– Manipulation of heterogeneous data and data sources.
– Distant process execution for optimisation and balancing of device processing load.
– Role-based task assignment to users.
– Web-based user interface for the creation, assignment and supervision of processes.
2.2 Solution Generalization
So far, we have established that Actimage has to provide tools to create O&G data management processes and to supervise the execution and results of such processes.
It appears from the requirement analysis that the processes are mainly focused on data management and manipulation. Provided that the communicating services are well handled and that the processes are validated against the data management and O&G semantics when created, the format of the data in the application is coherent and the user can manage his work without worrying about data coherence.
This solution is thus based on two specific domains in order to automate and simplify the creation of expert-knowledge-based processes. Since processes are implemented in supervised applications, it is easy to track the process status (created, running, terminated), and it is asserted that, if the application is correctly described by the end-user, no step in the process can be forgotten.
Such process-based applications make sense not only in DataPipe's project context but also in various other domains. In the custom-made industry, the prospecting phase of meeting the client, understanding their wishes, capturing the environment and estimating the cost of the product would widely benefit from applications that assert that the whole process is performed and that simplify the collection of data, either from the firm's store to present the products or from the on-site visit in order to capture the context of the sale. This domain presents the same requirements as O&G. There is heterogeneous data manipulation, such as camera pictures, measurements and notes. Sensors, user interfaces and cloud-based databases are the domain's multiple data sources. Likewise, all kinds of skills are employed, such as knowledge of the products sold, of sensor use, or of price calculation.
It is even imaginable to create applications meant to assist everyday life activities. Indeed, cooking, sport and handiwork are activities that can be represented as processes, requiring user interfaces to guide and assist them, which consume not only data but also materials and create results (meals, health status and manufactured furniture respectively).
It is then possible to define a common metamodel to which the three cited examples conform. This metamodel is illustrated in Figure 2.
Thus, instead of a solution based on the verification of two domains, Actimage proposes to define a solution for the creation of multi-domain heterogeneous data handling processes. The processes are validated through their coherence with the domains' semantics. The creation of the solution hence implies several modules:
– a domain-specific process creation editor;
– heterogeneous data manipulation and presentation interfaces;
– knowledge semantics modelling and verification;
– mobile device application creation and execution.
The following section presents the MDE approach which was developed to design Actinote 4.0, the generic solution implemented to respond to the mentioned requirements.
3 Model Driven Engineering Approach
As explained in the previous section, the challenges of the Datapipe project cannot be resolved by implementing applications for specific use cases; instead, a whole solution targeting the creation of domain-specific data-handling process applications has to be developed.
Fig. 2. Processes models and metamodel: a generic process consumes heterogeneous input and produces an expected result; it is instantiated for O&G (seismic data, drilling data, well logs and exploration data processed through analysis, formatting, decision and data-mining into normalized data, sale information and decision aid) and for cooking (food products, cooking tools and a recipe processed through cutting, warming, baking and waiting into a pie or a meal).
This section describes the structure and production logic of Actinote 4.0, the solution developed by Actimage.
3.1 Background
Model Driven Engineering (MDE) is a software development method that considers models as first-class artefacts, even considering that everything is a model [8, 9]. Its purpose is to rely on models as development entities and then to generate lower-level models or even code, mappings between model abstractions, model evolutions, system behaviours or applications through the use of model transformations [10, 11].
MDE commonly defines a model as a representation of an aspect of the world for a specific purpose. A model never represents the full system, but an abstraction of the system complete enough to represent all the required features for a given use. A metamodel is a representation of a language able to describe lower abstraction level models. All the models described by the language conform to the metamodel. This conformance relation asserts that the model is constrained by the semantics of its metamodel.
A model transformation takes a model as source and produces another model as target. A transformation metamodel is a mapping between the source model metamodel and the target metamodel.
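As an illustration only (the class and function names below are invented for this sketch and are not part of the DataPipe tooling), the following Python fragment shows a source model conforming to a tiny workflow metamodel being mapped, element by element, to a target model conforming to a screen-flow metamodel:

from dataclasses import dataclass
from typing import List

@dataclass
class Activity:            # source metamodel element
    name: str

@dataclass
class WorkflowModel:       # source model: a sequence of activities
    activities: List[Activity]

@dataclass
class Screen:              # target metamodel element
    title: str

@dataclass
class ScreenFlowModel:     # target model produced by the transformation
    screens: List[Screen]

def workflow_to_screens(src: WorkflowModel) -> ScreenFlowModel:
    # The transformation is a mapping between the two metamodels,
    # applied to every element of the source model.
    return ScreenFlowModel(screens=[Screen(title=a.name) for a in src.activities])

print(workflow_to_screens(WorkflowModel([Activity("search"), Activity("store")])))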
Surveys [12, 13] show that MDE, although the method is more than ten years old, is still recognised in the software industry, and several development teams use it to approach the development of complex systems.
OMG’s Model Driven Architecture (MDA) [14] is one MDE initiative with a three
layer structure. A Computational Independent Model (CIM) describes the business
model (e.g. the UML grammar).Then it is transformed into a lower level model through
the use of the language it represents. This generates a Platform Independent Model
(PIM) which is in our example a specific model described with UML. Last, the PIM
model is transformed into a Platform Specific Model (PSM). The generated PSM is the
implementation of the system described by the PIM with technology specific to the tar-
geted environment. In our example can be the android application code. Even though
our approach does not matches exactly the MDA structure, we will use the CIM, PIM
and PSM terms to identify the level of this paper upcoming models.
Using models to specify the system's functionalities and then applying model transformations on them, so that the implementation is generated, simplifies the creation of a group of applications sharing the same description paradigm. It is possible to define a Domain Specific Language (DSL), which is a simple language optimized for a given class of problems [15]. This class of problems is named the domain. A DSL enables an easy description of applications in a specific domain using a reduced set of elements. Since the language proposes a reduced set of elements, the model description and mappings are simplified compared to general programming languages, such as C++.
3.2 The Approach Global Structure
As described before, MDE approaches are based on models and their transformations to describe software behaviour and to automate its implementation based on this description.
Thanks to MDE it is possible to describe a DSL dedicated to the modelling of processes. This DSL is represented by a CIM model. Moreover, the architect's editor is based on the DSL. The architect composes the application description with terms extracted from the DSL and thus creates the application process model. This model describes the functionality of the application without considering implementation specificities; it is therefore a PIM. A model transformation afterwards consumes the process model in order to produce the corresponding PSM application.
Nevertheless, the domain-specific semantics brought by the specialists' knowledge are complex and imply deep modifications of the DSL, with the addition of domain-specific terms for the architect and, more importantly, of semantic constraints that are hard to represent on models.
Indeed, constraints are often added to the modelling language by the addition of files containing the constraints' descriptions in a textual form such as the Object Constraint Language (OCL). This solution presents several downsides. Constraints are placed over objects and object relations; complex constraints are difficult to implement with this approach. This is a problem considering that the domains might be quite complex (e.g. relying on measurement semantics). Also, the constraints live in separate files that have to be updated in parallel with the model evolutions. Besides, the semantics have to be analysed on full processes that are only limited by the end-user's description.
For these reasons, instead of dealing with cumbersome constraint programming, we chose to dedicate the semantic analysis to expert systems, detailed later in this paper. We consider that the semantics of the different domains do not overlap.
To summarise, our MDE approach's general structure is composed of three modules and two model transformations, represented in Figure 3. The editor enables the architect to model his processes with domain-specific elements; the expert system analyses the domain-specific semantics associated with the process so that only coherent processes can be created; and the application is the final distributed product with which the end-user interacts to execute his process.
Fig. 3. MDE approach global structure.
This section now provides more details on the different modules composing the
approach.
3.3 The Domain Specific Language Definition
In order to be able to create, represent and produce process-based editors and applications, we have to define a DSL able to describe all the possible process models.
A standard description commonly used to model data processes is the workflow modelling paradigm. Workflows are defined as the automation of a business process presenting several activities, processing any kind of data and connected through transitions [16]. It is a widely used paradigm based on simple elements (activities connected through transitions) defined as being able to represent any kind of process [17]. A large community works on the normative use of its elements [18, 19]. In our context, this abstraction can be used as the metamodel from which the models representing processes are produced.
Looking at Figure 2, it is possible to draw a direct parallel between workflow activities, their inputs/outputs and our processes' metamodel. It is also possible to consider a user choice as an activity that selects which of two potential future process paths
will be executed. The sole difference is that workflow activities are connected through transitions that can carry conditions. These transition conditions may be associated either with the presence of correct data (resulting from an upstream activity) or with user actions.
Using workflows as a standard representation for our processes' metamodel presents three major advantages. First, it is a simple abstraction that any software architect is used to manipulating, which makes the editor's main elements easy to assimilate. Second, it is possible to propose to the architect complex domain-specific elements as simple activities or interfaces. This reduces the semantic gap between the end-user and the architect during the process modelling phase, and enables the creation of processes with fewer interactions with the end-user to obtain a sufficiently precise description. Finally, many workflow editors already exist; for example, Datapipe partners already propose solutions based on workflows created through editors. Since those editors create models corresponding to the workflow metamodel, it is possible to reuse them as editors in our approach. For this to be possible, the sole requirement is that the editor can be extended to provide the domain-specific elements to the architect and that the corresponding model transformations can be created. Figure 4 illustrates the impact of using several workflow editors on the transformation between the process editor and the expert system.
Fig. 4. Use of multiple editors.
3.4 Expert Systems
An expert system, or Knowledge-Based System (KBS), is an AI system (input, transformations, output) composed of several blocks, which understands expert knowledge and infers
behaviours to solve a problem in a specific task domain. Expert systems were already in use in 1986 [20]. They have matured over the years and are still used today, especially with the rising domain of ontologies.
There are two types of knowledge [21]:
– Factual Knowledge: deductions that an expert system should handle as is, similar to the concept of axioms. This knowledge is widely shared and typically found in textbooks or journals.
– Heuristic Knowledge: the knowledge of good practice, good judgment and plausible reasoning in the field. It is the knowledge that underlies the art of good guessing.
It is usually said that knowledge-based systems consist of two parts: a knowledge base and an engine. Therefore, as shown in Figure 5, the two basic generic blocks of an expert system respectively have these two responsibilities.
Fig. 5. An expert system.
Thanks to this approach, we are able to ensure that the semantics of the domains will be respected. Such a system, being able to check one domain, is also able to do so with several domains. We did not find examples of overlapping domains, hence we do not consider the issue of interactions between different domains.
3.5 Application
Since our system is meant to generate a family of applications for dedicated domains, it is expected that an end-user requests several applications for different processes. Moreover, the process of publishing applications on stores is cumbersome and offers no control over who can access them. The platform must provide a solution to ensure the privacy of the end-user's intellectual property, preventing any unauthorized access to the application, while ensuring the delivery of the application through a simple system.
With this MDE approach, instead of creating a new application for each process
model, we propose to translate the model into a PSM which describes the expected
behavior of the application and the different interfaces (graphical or to services) that it
will use. Then, a unique application will handle any of the descriptions and, based on
an interpreter technology, will execute a behaviour corresponding to the descriptions. The descriptions can be sent to the application through standard push methods on connected mobile devices, automatically providing the new process to the application once it has been created and verified. Combined with a login logic, this allows us to propose a unique application to group and execute any of the processes the end-user requests.
The interpreter executes the application's behaviour according to the workflow activities. Each activity is a milestone in the execution that either starts a process on data or requests a choice from the user. The application then only needs to be able to read the workflow and compose interfaces according to the descriptions made in the models. That is why the application must know the editor's different elements in order to be able to interpret them at execution time.
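To make the interpreter idea concrete, here is a minimal Python sketch (the description format, activity names and helper functions are invented for this illustration and do not reflect the actual Actinote 4.0 models): each activity either runs an automated action on shared data or asks the user for a decision, then follows a flow link to the next activity.

process_description = {
    "start": "search",
    "activities": {
        "search": {"kind": "action", "action": "search_area", "next": "approve"},
        "approve": {"kind": "user_choice", "question": "Keep this data?",
                    "yes": "store", "no": None},
        "store": {"kind": "action", "action": "store_data", "next": None},
    },
}

actions = {  # automated steps known to the application
    "search_area": lambda ctx: ctx.update(data="raw seismic block"),
    "store_data": lambda ctx: print("storing:", ctx.get("data")),
}

def run(description: dict) -> None:
    # Walk the workflow description, executing actions and asking the user.
    context: dict = {}
    current = description["start"]
    while current is not None:
        activity = description["activities"][current]
        if activity["kind"] == "action":
            actions[activity["action"]](context)
            current = activity["next"]
        else:  # user-centric step: present a choice and wait for the decision
            answer = input(activity["question"] + " (yes/no) ").strip().lower()
            current = activity["yes"] if answer == "yes" else activity["no"]

run(process_description)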
Hence, we can propose one application for each of the existing mobile platforms, which can then handle any of the process models. This makes the approach able to target multi-platform mobile devices.
In this section, we presented a Model Driven Engineering approach which enables the creation of a family of multi-domain, well-founded process applications. The next section presents an overview of Actinote 4.0, Actimage's implementation of this approach.
4 Actinote 4.0 Implementation
The previous section presented an MDE approach which answers the complex requirements of the Datapipe project. This section presents the Actimage solution, Actinote 4.0, implemented following the presented approach. Several details are considered to be out of the scope of this paper due to the industrial nature of the solution; this especially encompasses the different model transformations, which will thus not be described.
4.1 The Editor
The editor is the module meant to be used by the architect to model the process executed by the application. We stated that such a process implies graphical interfaces, workflow activities and data management. The lack of standard processes has encouraged our project team to give users a sense of intuitiveness in the way they can model their activities. The transfer of their operational process into the Actinote 4.0 platform is made accessible with a graphical approach: the architect captures the flow of the end-user's process in a nodal diagram (which looks similar to a finite state graph) together with the description of its constraints.
This normalization and meta-modelling ensures the reliability of the data stores. Not only will the homogeneity of this assemblage facilitate the computational discovery of patterns in the inputs, but it will also allow the utilisation of safeguards based on the specific domains. Since all inputs need to be specified and enumerated, there is in fact no way for the mechanism to be semantically ambiguous. Any incoherence can be spotted beforehand, ensuring the integrity of the business knowledge.
The basic metamodel representing the editor DSL is illustrated in Figure 6.
Fig. 6. Actinote 4.0 DSLs metamodel (Process, Activity, Flow, View, Widget, Action, Variable and Document Model elements).
With the language represented by this metamodel it is possible to describe any process. The workflow is modelled as activities chained together through flow links. Each activity presents a graphical user interface composed of several widgets. A widget is a graphical interface element that gives the end-user access to data (display and manipulation) or lets him make decisions. Variables are abstractions of the data manipulated by the process. The document model groups all the data that must be retrieved as the process result in order to automatically generate a report. An action is a domain-specific process to apply to a set of data. Some common or complex actions, such as the retrieval of multi-storage data in the O&G context, are added to the editor when a specific domain is imported. The architect can also implement specific actions with a nodal diagram dedicated to data processing. Actions are started either by an activity or by user interactions on widgets.
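As a rough illustration of these elements (field names and types below are assumptions made for this sketch, not the real DSL definition), the Figure 6 metamodel could be captured with plain Python data classes as follows:

from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Variable:               # abstraction of data manipulated by the process
    name: str
    var_type: str             # e.g. "string", "number"

@dataclass
class Action:                 # domain-specific processing applied to variables
    name: str
    run: Callable[[dict], None]

@dataclass
class Widget:                 # GUI element giving access to data or decisions
    label: str
    bound_variable: Optional[Variable] = None
    on_interaction: Optional[Action] = None

@dataclass
class Activity:               # one workflow step, with its own view
    name: str
    widgets: List[Widget] = field(default_factory=list)
    on_enter: Optional[Action] = None

@dataclass
class Flow:                   # transition chaining two activities
    source: Activity
    target: Activity

@dataclass
class Process:                # the full model: workflow, data and report content
    activities: List[Activity]
    flows: List[Flow]
    variables: List[Variable]
    document_model: List[Variable] = field(default_factory=list)  # report data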
During our test activities, we observed that the semantic gap reduction brought by the use of our DSL and by the abstraction of complex processes as importable actions not only helped the architect understand the end-user's specific domain vocabulary, but also enabled the end-users to edit their own workflows. Thus our solution, as long as it provides the complex operations of the process as elements of the editor, enables the end-user to model his process himself.
Therefore Actinote 4.0 is a good fit for the industry because it focuses primarily on the design of forms and the web visualisation of analysis results. The whole idea behind this work resides in the opportunity for an expert to be relieved of the time-consuming task load that converting data into a generic form can represent. Thanks to this effort, geologists, geophysicists and engineers can use the DataPipe platform and toolkit to publish and display heterogeneous multimodal data in their realm of expertise. The principle of this responsibility decoupling is that we separate the business logic of the process into three parts: the orchestration of its flow with the activities, the algorithmic aspect of each of its steps with actions, and the design of the display that provides the users with mobile access to the process with the widgets. It thus becomes
possible to partition the effort between different employees with different qualifications. Not only will domain-specific experts have the ability to engineer processes for virtualizing and structuring production data without requiring any particular skills in software development, but technical operators and decision makers will be able to run the predefined scenarios independently.
4.2 Workflow Validation
We mentioned in the previous section that the process models have to be validated in order to implement them into applications. This implementation approach adds several user- and platform-based validation requirements. A workflow validation therefore goes through multiple checks:
– Permission and access rights, which may require verifying the coherence of the rights.
– Semantic analysis of field use and field types.
– Semantic analysis of domain specificities with the expert system.
– Vacuity and halting tests (the workflow must have steps reaching an end).
– Responsiveness aberration tests for small displays.
– Consistency checks of the actions graph (which is a set of algorithmic nodes).
– Syntactic analysis of the actions graph.
– Syntactic analysis of the activities graph.
– Check for unused elements (whether they are variable types, resources or event graph parts).
Most of these requirements are resolved by the editor's language, with its typing of actions, variables and widgets, and are out of the scope of this paper. We will only detail the expert system validation process.
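As an aside, the structural checks in the list above are classic graph analyses. For instance, the vacuity and halting test can be sketched as a reachability computation over the activity graph; the representation and activity names below are invented for this illustration:

from typing import Dict, List, Set

def reaches_end(flows: Dict[str, List[str]], ends: Set[str]) -> Set[str]:
    # Return the set of activities from which an end activity is reachable.
    ok = set(ends)
    changed = True
    while changed:                      # fixed point over the activity graph
        changed = False
        for activity, successors in flows.items():
            if activity not in ok and any(s in ok for s in successors):
                ok.add(activity)
                changed = True
    return ok

flows = {"search": ["format"], "format": ["store"], "store": [], "orphan": ["orphan"]}
stuck = set(flows) - reaches_end(flows, ends={"store"})
print("activities that can never reach an end:", stuck)   # {'orphan'}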
Rules Engine Implementation of an Expert System. The Actinote 4.0 expert system encodes knowledge in first-order predicate logic and uses the Prolog language to reason about that knowledge. It hence uses a rules engine, which is the most common implementation of an expert system. The knowledge is represented with a set of production rules, and the data are matched against the patterns described by these rules with algorithms such as the Rete algorithm.
The solving entity is thus an inference engine (a.k.a. production rules system), which uses either forward or backward chaining to infer conclusions from the data. It is worth noting that the conclusions are inferred rather than merely implied, which shows that we indeed deal with artificial intelligence: the system draws conclusions as a human expert would.
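For readers unfamiliar with the mechanism, the toy Python sketch below illustrates only the forward-chaining principle (the real system relies on Prolog and Rete-style matching; the facts and the single rule are invented for the example): rules fire on matching facts and add derived facts until a fixed point is reached.

facts = {("compressed", "var1"), ("flows_to", "var1", "store1")}

def rule_store_ok(known):
    # IF a compressed variable flows into a store THEN that store is valid.
    derived = set()
    for fact in known:
        if fact[0] == "flows_to" and ("compressed", fact[1]) in known:
            derived.add(("store_valid", fact[2]))
    return derived

rules = [rule_store_ok]

changed = True
while changed:                      # forward chaining: apply rules until stable
    changed = False
    for rule in rules:
        new_facts = rule(facts) - facts
        if new_facts:
            facts |= new_facts
            changed = True

print(("store_valid", "store1") in facts)   # True: the conclusion was inferred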
Figure 7 shows what the expert system thus becomes. Notice that the knowledge base is not explicitly shown on the diagram. The conflation of a knowledge base with the way it is represented in our system is usually made on purpose: in a rules engine, we call the set of production rules the knowledge base, rather than the actual knowledge that the experts hold in their heads.
Fig. 7. An expert system based on forward chaining.
4.3 Data Management
The first target for Actinote 4.0 is to answer the aforementioned O&G data management complexity by proposing a front-end to manage the enormous amount of data and its heterogeneous nature.
Big-data Choice. Actinote 4.0 has native support for MongoDB sets of databases. MongoDB is the most widely used document database and puts the emphasis on multi-datacenter scalability, resulting in big-data model flexibility and performance. Big-data mining, analysis and display in a wide range of industrial sectors is made possible with this choice.
The data locality of MongoDB instances is an appropriate answer to the data needs of O&G data management companies. Not only does MongoDB handle billions of documents, but it also sustains hundreds of thousands of atomic database operations per second, making it a suitable system for analysing data. Since it is also multi-platform, MongoDB instances can be scattered all across the globe to unite important seismic statistics and pieces of information.
Horizontal Scaling. All the unstructured collections of physical and digital data of the O&G data management companies may be dispatched into structured sets of exploration, drilling and production data. The data can then be split into different shards, meaning there will be different MongoDB servers for different ranges of data. For instance, one may divide the stores geographically and have non-overlapping immutable chunks for each predefined "location" field corresponding to each area. Considering the built-in geospatial indexes in the MongoDB querying system, exploring the results of decades of tapes and other captured data is ensured to remain performant.
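A hedged sketch of this setup with pymongo is given below; the database, collection and field names are invented for the example, and the sharding commands assume an already-running sharded MongoDB cluster reachable at the given URI:

from pymongo import MongoClient, GEOSPHERE

client = MongoClient("mongodb://localhost:27017")
surveys = client["datapipe"]["surveys"]

# Geospatial index so decades of survey records can be queried by area.
surveys.create_index([("geometry", GEOSPHERE)])

# Range-shard the collection on a location prefix so each area can live
# on its own shard (only works against a sharded cluster).
client.admin.command("enableSharding", "datapipe")
client.admin.command("shardCollection", "datapipe.surveys",
                     key={"location": 1, "acquired": 1})

# Example query: all surveys whose footprint intersects a given block.
block = {"type": "Polygon",
         "coordinates": [[[2.0, 57.0], [3.0, 57.0], [3.0, 58.0],
                          [2.0, 58.0], [2.0, 57.0]]]}
for doc in surveys.find({"geometry": {"$geoIntersects": {"$geometry": block}}}):
    print(doc["_id"])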
Theoretically, there is always the possibility to include a Hadoop framework to solve storage and processing problems in a distributed way. Computer clusters can thus be used to run complex analytics and data processing with Map-Reduce jobs.
4.4 Application
Given the diversity of media supported by the Actinote 4.0 software, it is also worth mentioning that on mobile devices it consists of a Qt application, which ensures a good homogeneity of the resulting behaviours on all platforms. The support of many features such as the camera, the contact list or network connections is handled in the same way on all platforms, and compatibility with most devices (iOS and Android, but also BlackBerry 10 for instance) remains assured. Another positive consequence of this choice is the integrated ergonomics of the OS: the Qt framework adapts to the operating system it is running on so that it can use the standard approach for each graphical component. By doing so, the operators who run the scenarios can keep the devices they are used to working on, and we do not have to handle resistance to change.
Network of Stores. The structured sets of data are organized in a web of servers and services which are all put together with the cloud computing provided by Actinote 4.0. The uniformity contract of the sets at our disposal can be made practicable by including converters and aggregators of data, or more generally ETL systems, all with the purpose that they are reunified in the aforementioned big-data schemes. In practice, one will first design a process, with the benefit of webservice and ETL invocation. Secondly, the recorded knowledge will need to be digitized when no digital copy exists in computer-understandable formats. Last but not least, this restructured aggregation will be merged and redistributed by means of sharding. This data-mining process will maintain the sporadic existence of data, but with two main differences: the data will be normalized, and therefore by construction rather easy to browse, and the interface between all stores will be specified to ensure every piece of information is obtainable on the network. The ontological approach to metrology subjects performed by Actimage [22] is a good starting point for interpreting the production and exploration data.
4.5 Simple Oil & Gas Implementation
We now present a simple example of the module modifications implied by the use of a specific domain knowledge. Let us imagine that O&G processes are composed, instead of complex data management activities, of simply four different data manipulation operations: search, format, compress and store.
Search is based on a webservice which returns the data from a selected world area. It can return either number or string data depending on the world area (emulating different data storages).
Format converts a data item into the requested format. If the selected data is already in the correct format, it simply does nothing.
Compress is able to compress numeric data.
Store is used to create a uniform storage. It requires that the data has been compressed.
We will call this domain small O&G in this example.
DSL Additions. In order to handle small O&G processes, the editor needs to propose to the user elements derived from our specification.
The data manipulation operations are actions. A specific action for each of the operations is added to the DSL. Since these actions are clearly distinct, we also propose to create one type of activity for each operation. These activities call the corresponding actions when the user validates their execution.
The Search activity lets the architect enter the name of the world area to search the data for and returns the result.
Format enables the architect to choose the variable to format, the expected output format (string or number) and the variable in which to store it. It requires the output format and the variable types to be identical.
Compress lets the architect choose the variable he wants to apply compression on. It requires the variable to be a number.
Store enables the architect to select the variable to store. It also allows him to choose a storage database to target. This parameter is shared between all stores.
The language also gets two types of variables, strings and numbers, according to the types manipulated by the small O&G operations. Compressed data is not a distinct data type because we consider the compression to be a non-destructive operation.
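Purely as an illustration, such an extension could be packaged and registered in the editor roughly as follows (the registration API and dictionary layout are invented for this sketch; the paper does not describe the real extension mechanism):

SMALL_OG_EXTENSION = {
    "variable_types": ["string", "number"],
    "actions": {
        "search":   {"params": ["area"], "returns": "string|number"},
        "format":   {"params": ["source", "target_format", "target"]},
        "compress": {"params": ["variable"], "requires_type": "number"},
        "store":    {"params": ["variable", "database"],
                     "requires": "compressed upstream"},
    },
    "activities": {   # one activity type per operation, calling its action
        "Search": "search", "Format": "format",
        "Compress": "compress", "Store": "store",
    },
}

def register_domain(editor_palette: dict, extension: dict) -> None:
    # Add the domain-specific elements to the architect's editor palette.
    editor_palette.setdefault("types", []).extend(extension["variable_types"])
    editor_palette.setdefault("actions", {}).update(extension["actions"])
    editor_palette.setdefault("activities", {}).update(extension["activities"])

palette: dict = {}
register_domain(palette, SMALL_OG_EXTENSION)
print(sorted(palette["activities"]))   # ['Compress', 'Format', 'Search', 'Store']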
The Expert System. Our expert system is fairly simple because the typing of variables and activities validates most of the constraints brought by the domain semantics. However, the compressed status of data is not inferred in the process model, so it is the expert system's role to handle it. Also, the type of the data returned by a search action has to be modelled in the knowledge base in order to assert that search actions store data in a variable of the corresponding type.
The knowledge base is then composed of facts concerning the search areas and of rules verifying that search and store activities are correct according to the expert knowledge, this knowledge being: a store activity must always be preceded by a compress activity, and a search activity's variable type must correspond to the type of data returned by the searched area. Listing 1 shows the Prolog knowledge base. The transformation from process model to specific domain model consumes a small O&G process model and generates a knowledge base extension which contains facts for the different activities, actions and variables of the model.
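As an illustration of what such a generated extension might look like (this sketch is not from the paper: the process model representation and the storeActivity predicate are assumptions, while the other fact names follow the predicates used in Listing 1), the transformation can be pictured as a walk over the process model that emits Prolog facts:

process_model = {
    "variables":  [("v1", "string")],
    "activities": [("a1", "search",   {"area": "europe", "var": "v1"}),
                   ("a2", "compress", {"var": "v1"}),
                   ("a3", "store",    {"var": "v1", "db": "local"})],
    "flows":      [("a1", "a2"), ("a2", "a3")],
}

def to_facts(model: dict) -> str:
    # Generate the knowledge base extension as Prolog facts.
    facts = [f"variable({vid}, {vtype})." for vid, vtype in model["variables"]]
    for aid, kind, params in model["activities"]:
        if kind == "search":
            facts.append(f"searchActivity({aid}, {params['area']}, {params['var']}).")
        elif kind == "compress":
            facts.append(f"compressActivity({aid}, {params['var']}).")
        elif kind == "store":
            facts.append(f"storeActivity({aid}, {params['var']}, {params['db']}).")
    facts += [f"activityFlow({src}, {dst})." for src, dst in model["flows"]]
    return "\n".join(facts)

print(to_facts(process_model))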
Application Modifications. In order to be able to execute small O&G processes, the application must be upgraded. It is mandatory to provide the application with the code to execute when actions are run. The GUI widgets composing the small O&G activities are standard validation and data display widgets, hence there is no further development required to adapt the application to the new specific domain.
Listing 1. Small O&G knowledge base.
% Data type returned by each searchable world area.
dataArea(string, asia).
dataArea(string, europe).
dataArea(number, america).
dataArea(string, africa).
% A search activity is valid if its target variable type
% matches the data type of the searched area.
searchVerification(ID) :- searchActivity(ID, Area, VarID),
    variable(VarID, Type),
    dataArea(Type, Area).
% A store activity is valid if a compress activity appears
% somewhere upstream in the activity flow.
storeVerification(StoreID) :- activityFlow(SourceID, StoreID),
    compressActivity(SourceID, _).
storeVerification(StoreID) :- activityFlow(SourceID, StoreID),
    storeVerification(SourceID).
Transformations. Both the process-model-to-specific-domain-model and the process-model-to-application-model transformations have to be modified. Indeed, the source and target models of the mappings have changed; hence they have to be augmented with the new actions, activities and variables.
Results. It is now possible to use the small O&G editor to create processes, verify them and implement their behaviour in a multi-platform application. Figure 8 illustrates two processes modelled with an editor. The first one presents an error because there is no compression activity before the store activity. The second one is the corrected version of the first process, which is validated by the expert system. Please note that Figure 8 is an illustration of the process model; it is not produced with the current Actinote 4.0 editor.
This example shows that the addition of a specific domain to the platform induces modifications in all the modules of the MDE approach. But once these modifications have been made, it is possible to generate as many processes based on this specific domain as needed, asserting that they will be correct by construction. Moreover, as stated before, the resulting editor becomes easy enough to let the end-user model his processes himself.
5 Conclusion and Further Works
This paper presented an established fact about data in the O&G sector: its interpretation relies heavily on human skill and experience, as seismic data can be huge (up to hundreds of petabytes) and full of noise that needs to be manually cleaned. This justifies the goal of the DataPipe platform, met by cooperating with a variety of specialists in a European project context: to alleviate work that would otherwise be performed by human professionals. After a review of the different project stakeholders' requirements, this paper presented Actimage's model driven engineering approach to fulfil them. The paper then presented an overview of the solution created to apply the approach, Actinote 4.0, and how it is able to create, validate and implement domain-specific, model-based data management processes.
Fig. 8. Small O&G process models: (A) a model with a missing compress activity; (B) a validated model.
Many products have been designed to solve the issues the oil and gas industry is facing. The adaptivity of a product being a factor in user acceptance, it seems clear that filling the gap between the users and the architect, as much as the gap between the noisy content and the normalized format, is essential. Heterogeneity of formats having been a major subject in the oil and gas field for the past decades, it has been settled in the DataPipe project that giving control to industry specialists was the best approach to counter this environmental disparateness. The interactive Actinote 4.0 platform is the result of cloud-based engineering in that it uses adaptive behaviours to lower expectation differences between individuals and their devices. This brings a flexibility which can be perceived as a catalyst for the support of diverse digital intelligence media. Besides, the seismic stores of contents are arranged in a ubiquitous manner, hence an improvement of the adjustability of data for both data-mining and analysis purposes. The multi-platform aspects of the UI, also mentioned in this article, play an important role in the business logic adaptation one can observe using DataPipe software on mobile instruments or displays.
The Datapipe project brought Actimage knowledge in the implementation and use of expert systems. The size, context and complexity of the project proved to be a perfect opportunity to explore the MDE domain and to apply it to the creation of not only generic but also domain-specific, user-described processes.
The interactions with the project partners allowed Actimage to acquire deep knowledge of specific domains such as Oil & Gas data management and the print and film industries. The collaboration in a multicultural context like Datapipe's brought ideas and solutions that would never have arisen otherwise.
Further work implies a deeper relationship between the partners to enhance the current O&G knowledge base and then confirm the scalability of the presented approach to industrialisation. Another ongoing experiment is the implementation of other specific domain knowledge: Actimage currently works on a metrology-based variant of Datapipe, and we also expect to test the combination of the two domains.
References
1. J. Bielak and D. Steeb, "Abstraction of multiple-format geological and geophysical data for oil and gas exploration and production analysis," Feb. 16, 1999. US Patent 5,873,049.
2. R. D. Miller, J. H. Bradford, K. Holliger, and R. B. Latimer, eds., Advances in near-surface seismology and ground-penetrating radar. No. 15 in Geophysical developments series, Tulsa, Okla.: Society of Exploration Geophysicists; Washington, D.C.: American Geophysical Union; Denver, Colo.: Environmental and Engineering Geophysical Society, 2010.
3. Ö. Yilmaz and S. M. Doherty, Seismic data analysis: processing, inversion, and interpretation of seismic data. No. 10 in Investigations in geophysics, Tulsa, OK: Society of Exploration Geophysicists, 2nd ed., 2001.
4. N. Krichene, "World crude oil and natural gas: a demand and supply model," Energy Economics, vol. 24, no. 6, pp. 557–576, 2002.
5. CDW, "High-performance computing in oil and gas," tech. rep., CDW, 2014.
6. HP, "HP workstations for seismic interpretation," tech. rep., Hewlett-Packard Development Company, 2013.
7. M. Arenaz, J. Dominguez, and A. Crespo, "Democratization of HPC in the oil & gas industry through automatic parallelization with Parallware," in 2015 Rice Oil & Gas HPC Workshop, 2015.
8. S. Mellor, A. Clark, and T. Futagami, "Model-driven development - Guest editor's introduction," IEEE Software, vol. 20, pp. 14–18, Sept. 2003.
9. Y. Lamo, X. Wang, F. Mantz, Ø. Bech, A. Sandven, and A. Rutle, "DPF Workbench: a multi-level language workbench for MDE," Proceedings of the Estonian Academy of Sciences, vol. 62, no. 1, p. 3, 2013.
10. K. Czarnecki and S. Helsen, "Feature-based survey of model transformation approaches," IBM Systems Journal, vol. 45, no. 3, pp. 621–645, 2006.
11. A. Metzger, "A systematic look at model transformations," in Model-driven Software Development, pp. 19–33, Springer, 2005.
12. J. Hutchinson, J. Whittle, and M. Rouncefield, "Model-driven engineering practices in industry: Social, organizational and managerial factors that lead to success or failure," Science of Computer Programming, vol. 89, pp. 144–161, Sept. 2014.
13. M. Torchiano, F. Tomassetti, F. Ricca, A. Tiso, and G. Reggio, "Relevance, benefits, and problems of software modelling and model driven techniques—A survey in the Italian industry," Journal of Systems and Software, vol. 86, pp. 2110–2126, Aug. 2013.
14. OMG, "MDA Guide 1.0.1," 2003.
15. M. Voelter and S. Benz, eds., DSL engineering: designing, implementing and using domain-specific languages. Lexington, KY: CreateSpace Independent Publishing Platform, 2013.
16. WFMC, "Terminology and glossary," Tech. Rep. WFMC-TC-1011, Issue 3.0, Workflow Management Coalition, Feb. 1999.
17. C. Morley, Processus métiers et systèmes d'information : évaluation, modélisation, mise en oeuvre. Paris: Dunod, 2007.
18. W. Aalst, A. Barros, A. Hofstede, and B. Kiepuszewski, "Advanced workflow patterns," in Cooperative Information Systems (P. Scheuermann and O. Etzion, eds.), vol. 1901 of Lecture Notes in Computer Science, pp. 18–29, Springer Berlin Heidelberg, 2000.
19. W. Aalst, A. Hofstede, B. Kiepuszewski, and A. Barros, "Workflow patterns," Distributed and Parallel Databases, vol. 14, no. 1, pp. 5–51, 2003.
20. P. Jackson, Introduction to expert systems. Reading, MA: Addison-Wesley Pub. Co., Jan. 1986.
21. A. Eardley and L. Uden, eds., Innovative knowledge management: concepts for organizational creativity and collaborative design. Hershey, PA: Information Science Reference, 2011.
22. F. Bourgeois, P. Studer, B. Thirion, and J.-M. Perronne, "A Domain Specific Platform for Engineering Well Founded Measurement Applications," in Proceedings of the 10th International Conference on Software Engineering and Applications, pp. 309–318, 2015.