design decisions. Several replications have been car-
ried out in this way, contributing to a growing body
of knowledge about ML engineering techniques. By
employing the graph architecture, KGs are capable of
modeling a range of relationship types (edges) and
entities (nodes) (Chen et al., 2020). In contrast to
plain graph or non-relational databases, KGs comprise
an additional embedded layer, designated a reasoner
(or inference engine), which enables them to derive
implicit information from existing explicit concepts.
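The role of the reasoner can be illustrated with a minimal sketch. It assumes that facts are stored as (subject, predicate, object) triples and that two hand-written rules are applied until no new fact appears. The predicate names, the rules, and the example facts are hypothetical and are not taken from any specific KG or inference engine.

```python
# Hypothetical mini-reasoner: derives implicit facts from explicit triples.
# Predicate names ("is_a", "subclass_of") and both rules are illustrative assumptions.

def infer(triples):
    """Apply two rules repeatedly until no new triple is produced (forward chaining)."""
    facts = set(triples)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c:
                    continue
                # Rule 1: subclass_of is transitive.
                if p == "subclass_of" and q == "subclass_of":
                    new.add((a, "subclass_of", d))
                # Rule 2: an instance of a class is also an instance of its superclasses.
                if p == "is_a" and q == "subclass_of":
                    new.add((a, "is_a", d))
        if new <= facts:
            return facts
        facts |= new

explicit = {
    ("random_forest", "is_a", "ensemble_method"),
    ("ensemble_method", "subclass_of", "supervised_method"),
    ("supervised_method", "subclass_of", "ml_technique"),
}

# Facts that were never stated explicitly but follow from the rules.
implicit = infer(explicit) - explicit
print(sorted(implicit))
```

A plain graph store would return only the three explicit triples; the additional facts exist only because a rule layer sits on top of the data.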
The most well-known examples of knowledge graphs
(KGs) – DBpedia, Freebase, Wikidata, YAGO, and so
forth – encompass a diverse array of domains and are
either derived from Wikipedia or created by volunteer
communities (Heist et al., 2020). The Google Knowl-
edge Graph is one of the largest and most comprehen-
sive KGs in existence, aiming to model and link all
structured information found on the internet, includ-
ing persons, organizations, skills, events, products,
and more. This is arguably one of the reasons why the
Google search engine is so effective. A graph-based
knowledge representation and reasoning formalism derived
from conceptual graphs has been formalized as finite
bipartite graphs, as outlined in (Mugnier and Chein,
1992). In this formalism, the set of nodes is divided
into concept and conceptual relation nodes. In such
a graph, concept nodes represent classes of individu-
als, and conceptual relation nodes illustrate the rela-
tionships between the aforementioned concept nodes.
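As a minimal sketch of this bipartite structure, the node set can be split into two disjoint sets, with every edge linking one conceptual relation node to one concept node. The concept and relation names below are hypothetical, and the encoding is only one possible reading of the formalism.

```python
# Hypothetical bipartite encoding of a conceptual graph: every edge links
# one conceptual relation node to one concept node, never two nodes of the same kind.

concepts = {"Engineer", "Model", "Dataset"}
relations = {"trains", "uses"}

# Each relation node lists its ordered concept arguments.
arguments = {
    "trains": ("Engineer", "Model"),
    "uses": ("Model", "Dataset"),
}

def edges(arguments):
    """Yield (relation, concept, position) edges of the bipartite graph."""
    for rel, args in arguments.items():
        for pos, concept in enumerate(args, start=1):
            yield (rel, concept, pos)

def is_bipartite(concepts, relations, arguments):
    """Check that the two node sets are disjoint and that edges only cross between them."""
    if concepts & relations:
        return False
    return all(rel in relations and c in concepts
               for rel, c, _ in edges(arguments))

print(list(edges(arguments)))
print(is_bipartite(concepts, relations, arguments))
```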
This partition into concept and conceptual relation
nodes is in accordance with (Sowa, 1976). As outlined
in (Ehrlinger and Wöß, 2016),
a KG acquires information and integrates it into an
ontology, subsequently applying a reasoner to derive
new knowledge. Furthermore, in accordance with
the definition provided by (Ji et al., 2021), KGs are
“structured representations of a fact, consisting of entities,
relations, and semantics.” Entities may be either
real-world objects or abstract concepts. Relations
represent the links between entities, and semantic
descriptions of entities and their relations
contain types and properties with defined semantics.
Property graphs (or attribute graphs), in which nodes
and relations possess properties or attributes, are
extensively employed. All of these facets rely on
knowledge inference over knowledge graphs, which
represents one of the core technologies in the design of our
ML engineering BoK. The Semantic Web community
has reached a consensus on the use of RDF to represent
a knowledge graph. The RDF model also allows
for more expressive semantics of the modeled
data, which can be used for knowledge inference. As a
result, a KG is a set of interconnected information on
a specific set of facts that combines characteristics of
several data management paradigms:
• Database: Structured queries can be used to ex-
plore data in a database.
• Graph: KGs can be analyzed in the same way that
any other network data structure can be.
• Knowledge Base: Formal semantics are encoded
in KGs, which can be used to understand data and
infer new facts.
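The three views listed above can be sketched over a single set of triples. This is a minimal illustration under stated assumptions: the triples, the predicate names, and the three helper functions (`match`, `reachable`, `symmetric_closure`) are hypothetical and do not correspond to an RDF library or to SPARQL.

```python
from collections import deque

# Illustrative triples; all identifiers are hypothetical.
triples = {
    ("doc_a", "describes", "data_validation"),
    ("doc_b", "describes", "data_validation"),
    ("data_validation", "part_of", "data_engineering"),
    ("doc_a", "related_to", "doc_c"),
}

def match(triples, s=None, p=None, o=None):
    """Database view: return the triples matching a pattern; None acts as a wildcard."""
    return sorted(t for t in triples
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))

def reachable(triples, start):
    """Graph view: breadth-first search along outgoing edges from `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for s, _, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen - {start}

def symmetric_closure(triples, predicate):
    """Knowledge-base view: treat `predicate` as symmetric and add the inverse facts."""
    return triples | {(o, p, s) for s, p, o in triples if p == predicate}

print(match(triples, p="describes"))
print(sorted(reachable(triples, "doc_a")))
print(sorted(symmetric_closure(triples, "related_to") - triples))
```

The same data answers a structured query, supports a network traversal, and yields a new fact once a semantic property of a predicate is declared.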
3.6 Step 5: Knowledge Query
In this context, a body of knowledge (BoK) is con-
ceptualized as a graph of knowledge, as proposed
by (Mattioli et al., 2022). Ultimately, the utility of the
ingested, transformed, integrated and stored knowl-
edge is contingent upon the efficiency with which an-
swers can be retrieved by users in an intuitive man-
ner. At the present time, keyword queries and spe-
cialized query languages (e.g. SQL and SPARQL)
represent the prevailing approaches to information
retrieval. However, to facilitate the search for
specific ML engineering knowledge by querying
the KG and selecting the set of relevant engineering
views for performing specific ML engineering activities,
it is necessary to identify similarities
between Confiance.ai documents by searching
for isomorphisms between the graphs representing
the knowledge extracted from the text. A number
of algorithms implementing subgraph isomorphism
have been defined; however, the subgraph
isomorphism problem is NP-complete. The
first component is a generic subgraph matching
mechanism that works in conjunction with fusion
strategies. It ensures that the merged information
remains structurally consistent with the structures of
the initial documents throughout the fusion process.
The fusion strategies are defined by similarity and
compatibility functions applied to the elements of the
graphs to be fused. By choosing these strategies, the
generic fusion algorithm can be adapted to the context
in which it is used. The knowledge graph fusion method offers
two additional operations, contingent upon the fusion
strategies employed. Information synthesis collects
and organizes data on a subject: the information is
assembled into a network, and any repetitions are
removed. Fusion strategies are used to combine
information about the same entity even when it is
expressed in different forms. When different sources of
information are used to build a representation of something,
inconsistencies may appear. Information query finds all
the information in a network that follows a specific
pattern; the structure of the query graph must match
that of the data graph. To find the information query
KMIS 2024 - 16th International Conference on Knowledge Management and Information Systems