Parallelism in the Generation of Concepts Through the Formal Context
Object Partitioning Using the In-Close 4 Algorithm
Andr
´
e Alves, Luis Zarate, Henrique Freitas and Mark Song
Instituto de Ci
ˆ
encias Exatas e Inform
´
atica - Pontif
´
ıcia Universidade Cat
´
olica de Minas Gerais, Brazil
Keywords:
FCA, Parallelism, Concept Generation, In-Close 4.
Abstract:
Formal Concept Analysis (FCA) usage for information extraction has been increasingly recurrent, using al-
gorithms already developed for this purpose. However, the computational effort involved in analyzing and
extracting information can drastically increase in the case of dense and high-dimensionality databases. The
main goal of this work is to present a parallel approach to solving this problem. We propose parallelizing
In-Close 4 with C++ and OpenMP to generate formal concepts in multiple threads. Our approach is based on
the subdivision of formal contexts into subcontexts grouped by a single set of objects. In addition, we propose
a set of operations to obtain the original formal concepts from the quasi-concepts, concepts generated from the
subcontexts. Our results show a speedup of up to 1.4116x with an efficiency of 70.48% for 100,000 objects,
50 attributes, and a density of 50%.
1 INTRODUCTION
The quest for knowledge has become more accessible
as technology improves over time. As a result, new
information search mechanisms have emerged with
increasing frequency, and existing ones are becoming
more efficient. The internet is the biggest provider
of this data, which has generated the ambition of the
scientific community to facilitate the acquisition of
knowledge on this platform (Godinho et al., 2022).
Since there are huge data repositories on the inter-
net and it would be unfeasible for a human to process
them due to the need for prior knowledge to define
the relevance of the information, several data min-
ing techniques have been proposed to obtain knowl-
edge (Domingos-Silva and Vieira, 2008). The For-
mal Concept Analysis (FCA) stands out among these
techniques, focusing on the relationships between the
elements (Carpineto and Romano, 2004).
FCA is a branch of mathematics created for data
analysis through associations and dependencies of ob-
jects and attributes formally described from a real, or
a synthetic dataset (Ganter and Wille, 2012).
To represent these data, the Formal Context con-
sists of a set of attributes and objects and a set of Inci-
dences, which consist of the pair between a subset of
objects and attributes.
Through the contexts, it is possible to obtain the
formal concepts and organize them hierarchically in
formal lattices to discover knowledge by providing
an explicit hierarchy between the concepts in the
database.
FCA for data analysis is widely used in several
fields of knowledge, such as psychology, artificial in-
telligence, biology, and linguistics (Priss, 2006; Jindal
et al., 2020). The reason is due to FCAs characteris-
tic of hierarchically representing the relationships be-
tween the datasets to be analyzed.
However, the computational effort involved in an-
alyzing and extracting information can drastically in-
crease in the case of dense and high-dimensionality
databases. Therefore, the extraction of the concepts
for those scenarios becomes unfeasible.
In this way, several techniques have emerged to
reduce the complexity of the generation of these con-
cepts. An example is parallelism, such as hybrid ar-
chitectures for performance improvements in algo-
rithms that use FCA approaches (de Moraes et al.,
2016; Chunduri and Cherukuri, 2019; Novais et al.,
2021).
A possible alternative to generating formal con-
cepts is based on the subdivision of the set of ob-
jects. Then, smaller subsets can be processed in par-
allel threads.
Thus, the main goal of this work is to present an
approach focused on reducing the In-Close 4 algo-
rithm execution time. Our proposal was developed
in C++ and OpenMP to generate formal concepts in
Alves, A., Zarate, L., Freitas, H. and Song, M.
Parallelism in the Generation of Concepts Through the Formal Context Object Partitioning Using the In-Close 4 Algorithm.
DOI: 10.5220/0011991500003467
In Proceedings of the 25th International Conference on Enterprise Information Systems (ICEIS 2023) - Volume 2, pages 195-202
ISBN: 978-989-758-648-4; ISSN: 2184-4992
Copyright
c
2023 by SCITEPRESS Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
195
parallel from the subdivision of formal contexts into
subcontexts. Each subcontext is a set of objects and
represents a thread workload. As part of the solution,
a set of operations was also proposed to make it pos-
sible to obtain the original formal concepts from the
quasi-concepts, i.e., concepts generated from the sub-
contexts.
The execution times of the In-Close 4 algorithm in
its serial version and our parallel proposal were eval-
uated. For instance, we achieved a speedup of up to
1.4116x with an efficiency of 70.48% for 100,000 ob-
jects, 50 attributes, and a density of 50%.
Thus, our main contribution to state-of-art is a
new approach for generating formal concepts us-
ing quasi-concepts, targeting parallelism exploitation.
We also contribute with a parallel code using C++ and
OpenMP, which is available for download
1
.
This paper is organized as follows: Section 2
shows a background with some important concepts;
Section 3 describes the methodology; Section 4
presents and discusses the experiments and results;
Section 5 shows related work, and Section 6 discusses
some threats to validity and limitations. Finally, Sec-
tion 7 presents the conclusions and future works.
2 BACKGROUND
2.1 Formal Concept Analysis
Formal concept analysis is the branch of applied
mathematics that studies the mathematization of con-
cepts and their representation in a hierarchical way,
called the conceptual lattice.
The first known appearance was in 1982. The
author proposed considering lattice elements as con-
cepts to establish a common language in the lattice
area (Ganter et al., 2005).
In its most classical approach, better known as
dyadics, a formal context is expressed as a triple
K = (G, M, I), which G represents a group of objects,
M represents a group of attributes and I represents a
binary relationship between G and M, indicating that
an object g G has an attribute m M. Table 1 rep-
resents a dyadic context with |G| = 8, |M| = 9 and
| I| = 33.
A formal concept of a formal context, expressed
by K = (G, M, I), is defined by a pair (A, B), where
A G, B M. From this, we have that the pair A, B
follows the conditions in which A = B
0
and B = A
0
,
with (
0
) representing a derivation operator, defined as:
1
https://github.com/alvesandre/ConceptsFinder
Table 1: Formal dyadic context.
needs water to live
lives in the water
lives on land
needs chlorophyll
dicotyledon
monocot
can move
has limbs
breastfeed
leech X X X
bream X X X X
frog X X X X X
dog X X X X X
weeds X X X X
reed X X X X X
beans X X X X
corn X X X X
A
0
= {b M | aIb a G} (1)
B
0
= {a G | aIb b M} (2)
The set of objects A is defined as the extension
of a formal concept - all objects present in A contain
all attributes present in B. The set of attributes B is
defined as the intention of a formal concept - all at-
tributes present in B that belong to the set of objects
A.
As an example from Table 1, we can visualize the
pair (A, B) defined by {”frog”, ”reed”}, {”needs wa-
ter to live”, ”lives in the water”, ”lives on land”}) as
a formal concept, since A
0
= B and B
0
= A. Table 2
shows some formal concepts related to the data in Ta-
ble 1.
Table 2: Formal concepts related to Table 1.
Extension Intent
Concept 1 {weeds, reeds, corn} {needs water to live,
needs chlorophyll,
monocots}
Concept 2 {frog, reeds} {needs water to live,
lives in the water,
lives on land}
Concept 3 {leech, bream, frog,
dog, weed, reed,
bean, corn}
{needs water to live}
Concept 4 {bream, frog} {needs water to live,
lives in the water, can
move, has limbs}
Concept 5 {bream, frog, dog} {needs water to live,
can move, has limbs}
ICEIS 2023 - 25th International Conference on Enterprise Information Systems
196
2.2 Synthetic Formal Contexts
This work used several synthetic formal contexts to
guarantee a better exploration of the algorithm func-
tionality.
We use the SCGaz, a tool proposed by Rimsa et
al. (Rimsa et al., 2013) to create synthetic formal con-
texts. The tool is capable of generating random con-
texts with unique objects and custom sizes, dimen-
sions, and densities.
2.3 Concept Extraction Algorithms
There are currently several algorithms in the litera-
ture to extract the concepts from the generated for-
mal contexts. Among them, two algorithms stand out:
In-Close(Andrews, 2009) and Data-Peeler(Cerf et al.,
2008).
In-Close(Andrews, 2009) uses incremental clo-
sures and matrix searches to extract all formal con-
cepts that adhere to a formal context. It used an im-
plicit search lexicographic approach to avoid the com-
putational overhead for repeated closures. Closing is
completed incrementally and only once per concept as
it walks through the attributes. This processing con-
tinues, adding a new attribute and pruning branches
until the concept is closed (Andrews and Orphanides,
2010).
Data-Peeler(Cerf et al., 2008) is an algorithm that
uses in-depth searches. It works using two parame-
ters: the current node (N) and its related stack (P).
The algorithm starts from the root node and uses an
empty stack. After that, the proximity property is
checked, as well as a monotonic constant C defined
by the user. If both conditions are met, the n-set U
will be output as long as there are no more elements
to enumerate or if the enumeration process continues
by splitting the current node N into two new nodes.
Each node N in the enumeration tree is a pair (U,
V ) where U and V are two n-sets. N represents all n-
sets, which contain all the elements of U and a subset
of the elements of V . In other words, this is the n-set
search space. The root node represents all possible
n-sets.
2.4 Parallel Computing
Parallel computing (Pacheco, 2011; Robey and
Zamora, 2021) is a form of computation in which sev-
eral calculations are performed simultaneously, acting
on the premise that large problems can usually be di-
vided into smaller ones so that they can be solved in
the same amount of time (in parallel). This technique
can be used in image processing and climate model-
ing, among other areas (Quinn, 2003; Linden, 2008).
Therefore, using processing cores to execute mul-
tiple tasks can give us higher performance. In addi-
tion, parallel computing tools can improve compu-
tational resources, dividing a larger task into several
sub-tasks and executing them simultaneously in sev-
eral cores.
The OpenMP API (Mattson et al., 2019; Van
Der Pas et al., 2017) is an alternative to exploit par-
allelism on shared-memory machines. Besides, it is
possible to use OpenMP for heterogeneous comput-
ing, which includes Central Processing Unit (CPU)
and, e.g., Graphics Processing Unit (GPU) and Intel
Xeon Phi.
3 METHODOLOGY
The context used to execute some simulations pre-
sented in this work is available with the executable
of the In-Close algorithm
2
. SCGaz tool was also used
to create synthetic contexts used in the tests (Rimsa
et al., 2013). It is also important to point out that the
algorithm chosen for extracting formal concepts was
In-Close, in version 4, due to the ease of handling and
parameterization since it allows the customization of
a minimum number of attributes and objects.
The language used to develop the algorithm used
in the tests was the C++ language, with some wrap-
pers to execute tests automatically. The OpenMP API
was also used to exploit multithreading parallelism.
The algorithm is available on the GitHub website
3
.
3.1 Thread Parallelization
For parallelization, the OpenMP directives were used
to execute threads in parallel. The number of threads
was defined by the number of divisions for each con-
text. OpenMP was also used to generate synthetic for-
mal contexts in parallel.
3.2 Generation of Quasi-Concepts and
Formal Concepts
A formal context of example present in the selected
algorithm for the analysis was used to generate the
formal concepts used in this work. After that, the for-
mal concepts of this context were extracted using the
In-Close 4 algorithm, using the following parameters:
Minimum number of attributes: 0
2
https://upriss.github.io/fca/examples.html
3
https://github.com/alvesandre/ConceptsFinder
Parallelism in the Generation of Concepts Through the Formal Context Object Partitioning Using the In-Close 4 Algorithm
197
Minimum number of objects: 0
Output format: JSON file (JavaScript Object No-
tation)
The minimum number of attributes and objects
was chosen as the minimum possible for the algo-
rithm to extract as many formal concepts as possible.
The output format chosen was JSON due to the conve-
nience of handling these types of files. These formal
concepts were generated to validate the formal con-
cepts generated from the quasi-concepts.
Then, the formal context was partitioned to subdi-
vide it into subcontexts - dividing them into subsets of
objects and keeping the attributes and their respective
incidences.
After that, concepts were extracted from each for-
mal subcontext. Each concept obtained is called a
quasi-concept. This may or may not become a con-
cept when considering the formal context as a whole.
Therefore, a new thread was used for each subcontext
(a subdivision of formal context) to generate quasi-
concepts executed in parallel.
It is important to emphasize that the operations
will always be performed between two groups of
quasi-concepts and whenever we execute the opera-
tions where the number of quasi-concept groups is
greater than two, the first operation will be performed
between the first two groups, but the next ones will be
made between the previous result and the next group.
3.3 Operations
Consider K(G, M, I) a context subdivided in subcon-
texts K
i
(G
i
, M, I
i
), 1 i n, where G
p
G
q
=
/
0
p 6= q and G
1
G
2
... G
n
= G. Note that G
i
is a
partition of G.
Assuming that (A
p
, B
p
) and (A
q
, B
q
) are quasi-
concepts obtained from subcontexts K
p
and K
q
.
Based on the quasi-concepts generated from each
subcontext K
i
, it is necessary to perform the following
operations to obtain the final concepts:
1. If B
p
B
q
=
/
0 then the quasi-concepts are
concepts.
Proof: As B
p
B
q
=
/
0 then B
0
p
= A
p
and A
0
p
=
B
p
for the whole context K. The same idea applies to
any (A
q
, B
q
).
Table 3 shows an example for context K. To sim-
plify it, we use two subcontexts K
1
and K
2
, but the
same approach is easily extended to n subcontexts.
Table 3: Example of quasi-concepts that are concepts.
a b c
K
1
1 X X
2 X
3 X X
K
2
4 X
5 X
6 X
It is possible to see that ({1, 2, 3}, {a}) and ({1,
3}, {a, b}) are quasi-concepts obtained from subcon-
text K
1
. Also, ({4, 5, 6}, {c}) is a quasi-concept ob-
tained from subcontext K
2
.
As {a} {c} =
/
0, then ({1, 2, 3}, {a}) is also a
concept for the whole context K since {a} = {1, 2,
3}, and {1, 2, 3}’ = {a} considering all context K.
The same idea applies for ({1, 3}, {a, b}). Since
{a, b} {c} =
/
0, then ({1, 3}, {a, b}) is also a concept
for the whole context K since {a, b} = {1, 3}, and {1,
3}’ = {a, b} considering all context K.
Thus, ({4, 5, 6}, {c}) is also a concept for the
same reason.
2. If B
p
B
q
6=
/
0 then the set of attributes B
p
B
q
is common to all set of objects A
p
and A
q
, so there is
a formal concept with objects A
p
A
q
and attributes
B
p
B
q
.
Proof: Consider X = A
p
A
q
and Y = B
p
B
q
. If
B
p
B
q
6=
/
0 = Y
0
= X. But X
0
= Y, so (A
p
A
q
,
B
p
B
q
) is certainly a concept of K.
Table 4 shows an example of 2 subcontexts of a
context K. The same approach is easily extended to n
subcontexts.
Table 4: Example for intersection.
a b c
K
1
1 X X X
2 X X
3 X X
K
2
4 X X X
5 X X
6 X
In the example, it can be seen that from the sub-
context K
1
, the quasi-concept ({1}, {a, b, c}) is ob-
tained. From the second subcontext K
2
, it is possible
to obtain the quasi-concept ({4}, {a, b, c}).
We can infer a concept from these two quasi-
concepts, since both share the set of attributes {a,b,c}.
Therefore, it is possible to obtain the formal concept
({1, 4}, {a, b, c}), since the intersection of the sets of
ICEIS 2023 - 25th International Conference on Enterprise Information Systems
198
intentions resulted in {a, b, c} and the union of the ex-
tension set {1} {4} is equal to {1, 4}.
It is important to note that, from the subcontext
K
1
, the quasi-concept ({1, 2, 3}, {a, b}) is obtained.
From the second subcontext K
2
, it is possible to obtain
the quasi-concept ({4, 5, 6}, {a}).
We can infer a concept from these two quasi-
concepts since both share the set of attributes {a}.
Therefore, it is possible to obtain the formal concept
({1, 2, 3, 4, 5, 6}, {a}), since the intersection of the
sets of intentions resulted in {a} and the union of
the extension set {1, 2, 3} {4, 5, 6} is equal to
{1, 2, 3, 4, 5, 6}.
3. If B
p
B
q
then the quasi-concepts are subcon-
cepts of a concept in context K. To perform this ac-
tion, check if one of the attribute sets is a subset and
if the set of objects of a quasi-concept is a subset of
objects of an extracted concept. Table 5 shows an ex-
ample.
Table 5: Example for quasi-concepts that are not concepts.
a b c
K
1
1 X X
2 X X X
3 X X
K
2
4 X X X
5 X X
6 X X
It can be observed that in the first subcontext K
1
there is a quasi-concept ({1, 2, 3}, {a, c}) and in the
second subcontext K
2
there is another quasi-concept
({4, 5, 6}, {a, c}). However, both are not concepts,
as they share the intention {a, c} that is included in a
concept ({1, 2, 3, 4, 5, 6, }, {a, c}) obtained from rule
2.
4. Supremum and Infimum
To calculate the supremum and infimum, when
they are not generated as quasi-concepts, it is neces-
sary to define the infimum as the set of objects that
share all attributes and the supremum as the set of at-
tributes that share all objects:
If there is no infimum set already defined, then In-
fimum = (
/
0, {b
1
... b
n
}), where b
1
.. b
n
are all
attributes of context K.
If there is no supremum set already defined, then
Supremum = ({a
1
... a
n
},
/
0), where a
1
.. a
n
are
all objects of context K.
4 RESULTS AND DISCUSSION
Before executing the In-Close 4 algorithm, a normal-
ization of values was performed so that its execution
was possible.
Normalization consists of removing the words ob-
ject and attribute and placing the corresponding letter
or index. For objects, we assign a set of numbers,
and for attributes, a set of letters. Ex: ob ject[0] 1
attribute[5] F
To perform the tests, 100 synthetic contexts were
generated using SCGaz, varying the database with
10,000, 25,000, 50,000, and 100,000 objects, 50 at-
tributes, and 50% of density.
Once the formal contexts were generated, they
were subdivided into 2, 4 and 8 subgroups, to affirm
that the entire generation of formal concepts is possi-
ble. Finally, ve different formal contexts were gen-
erated for each dataset to guarantee full coverage of
the tests.
Our tests were executed on a machine with an 8-
core Mac M1 processor and 8 GB of main memory.
The operating system used is a MacOS Mojave. In all
tests performed, the formal concepts generated from
the quasi-concepts, separating them into subsets, were
equivalent to the original set of formal concepts.
The performance evaluation is based on the exe-
cution times of the serial version of In-Close 4 and
our parallel proposal in C++/OpenMP.
Table 6 shows the results of a sub-division of
the formal context (10,000x50) into subcontexts, i.e.,
the number of threads. The highest speedup
4
was
1.4246x for two threads, representing an efficiency
5
of the parallel version of 71.23% in comparison to the
serial In-Close 4.
It is important to notice that when we increase the
dataset size, this execution time increases due to the
large number of incidences to be analyzed. Another
point to consider is that there was a better result when
subdivided into two groups (two threads).
Table 6: Results for 10,000 objects and 50 attributes.
# Concepts
In-Close 4
Time (s)
Parallel approach
# Threads Time (s)
282,115 0.4781
2 0.3356
4 0.3374
8 0.359
Table 7 presents a formal concept of the same
number of attributes, but the number of objects pro-
vided increased to 25,000. For this context, the high-
est speedup (two threads) was 1.4250x with an effi-
ciency of 71.25%.
In Table 8, although the number of objects in-
4
Speedup = serial time / parallel time
5
Efficiency = speedup / number of cores
Parallelism in the Generation of Concepts Through the Formal Context Object Partitioning Using the In-Close 4 Algorithm
199
Table 7: Results for 25,000 objects and 50 attributes.
# Concepts
In-Close 4
Time (s)
Parallel approach
# Threads Time (s)
298,512 1.7656
2 1.239
4 1.251
8 1.323
creased to 50,000, the highest speedup can be ob-
served with two divisions (two threads), which was
1.4943x. The efficiency represents 74.71%.
Table 8: 50,000 objects and 50 attributes.
# Concepts
In-Close 4
Time (s)
Parallel approach
# Threads Time (s)
428,215 9.2264
2 6.17445
4 6.89105
8 7.41605
In Table 9, the number of objects was increased to
100,000 with higher performance compared to serial
In-Close 4. The highest speedup for this scenario (two
threads) was 1.4116x with an efficiency of 70.48%.
Table 9: 100,000 objects and 50 attributes.
# Concepts
In-Close 4
Time (s)
Parallel approach
# Threads Time (s)
582,219 21.7264
2 15.39105
4 16.4178
8 17.07075
In Figures 1 to 4, we see charts representing
the performance as the number of divisions (threads)
increased. The x-axis is the number of divisions
(threads), and the y-axis is the execution time for both
algorithms (serial In-Close 4 with 1 thread and our
parallel version with 2, 4, and 8 threads). These fig-
ures show that we have scalability when the number
of objects increases. However, more than two threads
do not scale well for the same workload.
Figure 1: Execution time for 10,000 objects, 50 attributes
and 0.5 of density.
5 RELATED WORK
Nilander (de Moraes et al., 2016) proposed a paral-
lel implementation of the NextClosure algorithm for
its optimization, resulting in a performance gain by
reducing the execution time. Nilander’s idea was
Figure 2: Execution time for 25,000 objects, 50 attributes
and 0.5 of density.
Figure 3: Execution time for 50,000 objects, 50 attributes
and 0.5 of density.
Figure 4: Execution time for 100,000 objects, 50 attributes
and 0.5 of density.
proposed due to the large execution time demanded
when executing the algorithm using high-dimensional
databases. As a result, Nilander states that a consid-
erable speedup gain was obtained from the algorithm
but limited to the number of 16 concurrent threads.
Kodagoda et al. (Kodagoda et al., 2017) proposed
a parallel version of the In-Close algorithm to reduce
the running time of version 3 of the algorithm, us-
ing the OpenMP API. The authors claim that algo-
ICEIS 2023 - 25th International Conference on Enterprise Information Systems
200
rithms based on CBO (Colliding Bodies Optimiza-
tion) that are recursive by nature can easily be paral-
lelized, which has become one of the pillars of mak-
ing the project viable. A point to emphasize is using
the OpenMP, which can be exploited to parallelize
other algorithms. As a point of improvement, the au-
thors propose that more comparisons be made with
other parallel algorithms.
Novais et al. (Novais et al., 2021) proposed an al-
gorithm for extracting formal concepts based on the
OpenCL algorithm to obtain better performance for
extracting them. The algorithm uses a parallel brute
force approach, using a heterogeneous architecture
(CPU+GPU and CPU+FPGA
6
). According to Novais
et al., the approach obtained a performance 18x bet-
ter, in the best case, than the processing carried out by
the Data-Peeler algorithm in the same context. They
also highlighted that there were results in better en-
ergy efficiency since, in the best case, they obtained
a gain of 1.79 operations performed by energy con-
sumption compared to other algorithms in the differ-
ent architectures the project covers.
Fu and Nguifo (Fu and Nguifo, 2004) also pro-
posed an algorithm for obtaining concepts from
search spaces in parallel. In such a proposal, the re-
dundant attributes are removed from the formal in-
put context before the partition generation step to re-
duce the algorithm’s workload in extracting formal
concepts. Regarding another important feature of the
algorithm proposed by the author, it is based on the
NextClosure algorithm to obtain the collection of con-
cepts effectively. Furthermore, the author’s approach
does not consider balancing the workload, which is
one of the limiting factors of the adopted strategy. In
this work, the author concludes that parallel proposals
can handle high-dimensional input contexts. But, due
to unbalanced workloads, the algorithm may present
sudden performance variations according to the se-
lected load factor.
Zhou et al. (Zhou et al., 2021) proposed an im-
proved version of In-Close 4, the so-called In-Close5.
The new algorithm stores a concept’s context and ex-
tension as a vertical bit matrix. Within the In-Close 4
algorithm, the context is stored just as a horizontal bit
array, which is very slow in finding the intersection
of two span sets. According to the authors, the ex-
perimental results show that the proposed algorithm
is much more effective than the In-Close 4 algorithm,
working even in contexts where In-Close 4 does not
produce results.
In summary, Kodagoda et al. (Kodagoda et al.,
2017) created a parallel version of In-Close 3. Zhou
et al. (Zhou et al., 2021) work also modified the orig-
6
Field-Programmable Gate Array (FPGA)
inal algorithm, creating a new version called In-Close
5. Nilander, Novais et al., Fu and Nguifo (de Moraes
et al., 2016; Novais et al., 2021; Fu and Nguifo, 2004)
used different algorithms and architectures, such as
GPU and FPGA. Our work differs from the others
since we developed an approach that uses a wrapper
to run In-Close 4 algorithm.
6 THREATS TO VALIDITY AND
LIMITATIONS
We discussed the potential for parallelization in gen-
erating formal concepts by dividing their context. The
experimental results showed that parallelization could
be achieved in concept generation.
However, this paper focuses on working with con-
trolled synthetic datasets with the same number of
attributes (50) and density (50%), but with different
numbers of objects from 10,000 to 100,000. New re-
sults with a greater diversity of densities and attributes
are already being analyzed.
We observed possible overheads in the execution
of the proposed operations since several loops are
made to verify objects and attributes. In addition,
quasi-concepts must be joined, representing an ad-
ditional computation. So, we are evaluating how to
reduce overheads to increase performance and scala-
bility.
7 CONCLUSIONS
This work successfully achieved its objective of par-
allelizing the generation of formal concepts by parti-
tioning the formal context. We achieved this by devel-
oping a mathematical formalization and experimenta-
tion, improving speedup, e.g., of up to 1.4116x with
an efficiency of 70.48% for 100,000 objects and 50
attributes with a density of 50%.
However, this work only addressed the division of
formal contexts by groups of objects, requiring an-
other analysis and adaptation to support the division
by attributes.
For future work, we intend to execute more tests
in environments with different architectures and con-
figurations, such as measuring whether there was a
speedup for different densities, attributes, and objects.
We also consider evaluating energy consumption and
the usage of heterogeneous architectures and other re-
sources of parallelization.
Parallelism in the Generation of Concepts Through the Formal Context Object Partitioning Using the In-Close 4 Algorithm
201
ACKNOWLEDGEMENTS
The present work was carried out with the support of
the Coordenac¸
˜
ao de Aperfeic¸oamento de Pessoal de
N
´
ıvel Superior - Brazil (CAPES) - Financing Code
001 and Fundac¸
˜
ao de Amparo
`
a Pesquisa do Estado
de Minas Gerais (FAPEMIG) - APQ-01929-22. The
authors also thank CNPq and PUC Minas for their
partial support in the execution of this work.
REFERENCES
Andrews, S. (2009). In-close, a fast algorithm for com-
puting formal concepts. International Conference on
Conceptual Structures.
Andrews, S. and Orphanides, C. (2010). Analysis of large
data sets using formal concept lattices. 7th Interna-
tional Conference on Concept Lattices and Their Ap-
plications, pages 104–115.
Carpineto, C. and Romano, G. (2004). Concept data anal-
ysis: Theory and applications. John Wiley & Sons.
Cerf, L., Besson, J., Robardet, C., and Boulicaut, J.-F.
(2008). Data-peeler: Constraint-based closed pattern
mining in n-ary relations. In proceedings of the 2008
SIAM International conference on Data Mining, pages
37–48. SIAM.
Chunduri, R. K. and Cherukuri, A. K. (2019). Scalable for-
mal concept analysis algorithms for large datasets us-
ing spark. Journal of Ambient Intelligence and Hu-
manized Computing, 10(11):4283–4303.
de Moraes, N. R., Dias, S. M., Freitas, H. C., and Z
´
arate,
L. E. (2016). Parallelization of the nextclosure algo-
rithm for generating the minimum set of implication
rules. Artif. Intell. Research, 5(2):40–54.
Domingos-Silva, J. P. and Vieira, N. J. (2008). A classifi-
cation algorithm based on concept similarity. In Re-
search and Development in Intelligent Systems XXIV:
Proceedings of AI-2007, the Twenty-seventh SGAI In-
ternational Conference on Innovative Techniques and
Applications of Artificial Intelligence, pages 281–291.
Springer.
Fu, H. and Nguifo, E. M. (2004). A parallel algorithm to
generate formal concepts for large data. In Concept
Lattices: Second International Conference on Formal
Concept Analysis, ICFCA 2004, Sydney, Australia,
February 23-26, 2004. Proceedings 2, pages 394–401.
Springer.
Ganter, B., Stumme, G., and Wille, R. (2005). Formal con-
cept analysis: foundations and applications, volume
3626. Springer.
Ganter, B. and Wille, R. (2012). Formal concept analysis:
mathematical foundations. Springer Science & Busi-
ness Media.
Godinho, J., Gomide Gomes, J., Malheiro, R., and San-
tana, L. (2022). Hydrological forecast in maca
´
e river
basin with neural networks. Revista Brasileira de
Computac¸
˜
ao Aplicada, 14(1):70–80.
Jindal, R., Seeja, K., and Jain, S. (2020). Construction of
domain ontology utilizing formal concept analysis and
social media analytics. International Journal of Cog-
nitive Computing in Engineering, 1:62–69.
Kodagoda, N., Andrews, S., and Pulasinghe, K. (2017). A
parallel version of the in-close algorithm. In 2017 6th
National Conference on Technology and Management
(NCTM), pages 1–5. IEEE.
Linden, R. (2008). Algoritmos gen
´
eticos (2a edic¸ao). Bras-
port.
Mattson, T. G., He, Y., and Koniges, A. E. (2019). The
OpenMP Common Core Making OpenMP Simple
Again. The MIT Press.
Novais, J. P., Maciel, L. A., Souza, M. A., Song, M. A., and
Freitas, H. C. (2021). An open computing language-
based parallel brute force algorithm for formal con-
cept analysis on heterogeneous architectures. Con-
currency and Computation: Practice and Experience,
page e6220.
Pacheco, P. (2011). An introduction to parallel program-
ming. Elsevier.
Priss, U. (2006). Formal concept analysis in information
science. Annual review of information science and
technology, 40(1):521–543.
Quinn, M. J. (2003). Parallel programming. TMH CSE,
526:105.
Rimsa, A., Song, M. A., and Z
´
arate, L. E. (2013). Scgaz-a
synthetic formal context generator with density con-
trol for test and evaluation of fca algorithms. In 2013
IEEE International Conference on Systems, Man, and
Cybernetics, pages 3464–3470. IEEE.
Robey, R. and Zamora, Y. (2021). Parallel and high perfor-
mance computing. Simon and Schuster.
Van Der Pas, R., Stotzer, E., and Terboven, C. (2017). Using
OpenMP-The Next Step: Affinity, Accelerators, Task-
ing, and SIMD. The MIT Press.
Zhou, J., Yang, S., Wang, X., and Liu, W. (2021). A new
algorithm based on extent bit-array for computing for-
mal concepts. arXiv preprint arXiv:2111.00003.
ICEIS 2023 - 25th International Conference on Enterprise Information Systems
202