sequently their reports are delivered in different
formats, both in content and in form. Therefore,
conceptual modeling, one of the most important steps
in database design (Elmasri and Navathe, 2011), should
be performed for each item in the Databook, because in
this step requirements are analyzed to define the
system structure.
This modeling is done manually by a professional,
based on their experience in the domain of the system
and on certain rules. However, this task becomes
difficult when large systems are developed. The
present work proposes a system that automatically
generates a conceptual model from textual descriptions
of the inspection elements, making this task simpler.
This is done by developing an ontology that describes
the standardized knowledge on the inspection of
different types of components of the structure of
works for oil platforms, for the validation of
inspection reports and quality certificates against
criteria of completeness and compliance. The ontology
makes it possible to deal with the diversity of
component and document types and to allocate data
correctly even under different structures. In this
way, it is possible to resolve the matching between
the different data schemas associated with each of the
inspected components and structures. Through this
approach, data can be inserted into the same database.
This work is part of a larger project that aims to
use pre-processing techniques and data science for the
automatic verification of completeness, compliance
and bad faith in records and documents of purchase,
construction and assembly in a Big Data context. The
project involves the development of technologies for
registration, retrieval and inference over large
databases using data lake and microservice
technologies.
2 FUNDAMENTALS
2.1 Component Inspection for Oil and
Gas Industry
Within the scope of the oil industry in Brazil, the
document that governs the guidelines related to the
inspection of the manufacture of equipment for oil
and gas is the ABC of the Inspection of Manufacture
(Petróleo-Brasileiro-S.A, 2017).
According to this document, manufacturing inspection
"is the activity carried out for planning and
execution purposes aiming to verify, at the premises
of the supplier and/or sub-suppliers involved, the
conformity of the equipment or materials manufactured
with the contractual documents". An asset is defined
as any system, equipment, product or material that the
Supplier must deliver to the customer in accordance
with the contract.
2.2 Ontology
In the area of Computer Science, ontologies are used
for modeling, both in database-based systems and for
knowledge representation (Almeida, 2014). An ontology
allows the definition of concepts (entities, objects,
events, processes), emphasizing their properties,
relationships and restrictions, enabling the sharing
of knowledge about a given domain, which is
represented by a vocabulary (de Freitas, 2017).
According to Almeida (2014), two main definitions of
ontology in Computer Science are found in the
literature. The first refers to the use of ontological
principles to model reality, that is, to provide a
description of what exists and to characterize
entities (Wand et al., 1999). The second refers to
the representation of a domain in a logical language
for computational purposes. In this case, an ontology
consists of a set of statements expressed in a
representation language and processable by inference
mechanisms (Staab and Studer, 2010).
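To make the second definition concrete, the following is a minimal
sketch, not the representation used in this work. It assumes that
statements are stored as (subject, predicate, object) triples and
that the only inference rule is class inheritance. The names Valve,
Equipment, Asset and valve_01 are hypothetical and serve only as
illustration.

```python
def infer_types(statements):
    """Forward-chain one rule until nothing new is derived:
    if (x, "type", C) and (C, "subClassOf", D), then (x, "type", D)."""
    facts = set(statements)
    while True:
        derived = {
            (x, "type", parent)
            for (x, p, cls) in facts if p == "type"
            for (child, q, parent) in facts
            if q == "subClassOf" and child == cls
        }
        if derived <= facts:  # fixed point reached
            return facts
        facts |= derived


# Hypothetical statements (assumed for illustration only).
statements = {
    ("Valve", "subClassOf", "Equipment"),
    ("Equipment", "subClassOf", "Asset"),
    ("valve_01", "type", "Valve"),
}

facts = infer_types(statements)
print(("valve_01", "type", "Asset") in facts)
```

The statement that valve_01 is an Asset is never asserted directly;
it is obtained by the inference step, which is the sense in which
the statements are processable by an inference mechanism.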
2.3 Schema Matching
Schema matching is the problem of generating
correspondences between elements of two schemas
(Bernstein et al., 2011). A match is a relationship
between one or more elements of one schema and one or
more elements of the other.
Figure 1 shows a simplified representation of two
schemas and the matches (or correspondences) found,
drawn as dotted lines. The problem is challenging
because the schemas may contain elements that are
described differently (for example by an acronym) and
should still be matched (PurchaseOrder and PO),
elements whose semantics match only partially (Name
representing the full name and FName only the first
name), elements with similar names that represent
different concepts in the schemas (Address
representing the billing address and ShipAddress
representing the delivery address) and elements
without a correspondent in the other schema (Product,
BillTo and Customer).
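The difficulties listed above can be illustrated with a purely
name-based matcher. This is a minimal sketch and not the matching
method of this work: the acronym rule, the 0.6 threshold and the
assignment of element names to the two schemas are assumptions made
for illustration, since Figure 1 is not reproduced here.

```python
from difflib import SequenceMatcher


def acronym(name):
    """Upper-case initials of a CamelCase name, e.g. PurchaseOrder -> PO."""
    return "".join(ch for ch in name if ch.isupper())


def similarity(a, b):
    """Score in [0, 1]: 1.0 for an acronym match, else a string ratio."""
    if a == acronym(b) or b == acronym(a):
        return 1.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_schemas(left, right, threshold=0.6):
    """For each left element, keep its best right candidate if the
    score reaches the (assumed) threshold."""
    matches = []
    for l in left:
        best = max(right, key=lambda r: similarity(l, r))
        score = similarity(l, best)
        if score >= threshold:
            matches.append((l, best, round(score, 2)))
    return matches


# Hypothetical assignment of the element names to two schemas.
schema_a = ["PurchaseOrder", "Name", "Address", "Product"]
schema_b = ["PO", "FName", "ShipAddress", "Customer"]

for left, right, score in match_schemas(schema_a, schema_b):
    print(left, "->", right, score)
```

Under these assumptions the sketch pairs PurchaseOrder with PO
through the acronym rule and leaves Product unmatched, but it also
proposes Address and ShipAddress as a match although they represent
different concepts. Name similarity alone is therefore not enough,
and semantic knowledge about the domain is needed.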
2.4 Generating a Database Schema from
an OWL File
Panawong et al. (2016) present a method to
automatically generate a database schema from an
OWL (Web Ontology Language) file. In addition,
ICINCO 2021 - 18th International Conference on Informatics in Control, Automation and Robotics