Temporal Analysis of Brazilian Presidential Election on Twitter Based on Formal Concept Analysis

Daniel Pereira, Julio Neves, Wladmir Brandão and Mark Song

Universidade Católica de Minas Gerais, Computer Science Department, Minas Gerais, Brazil

Keywords: Topic Evolution, Twitter, Formal Concept Analysis, Social Network Analysis.

Abstract: Social networks have become an environment where users express their feelings and share news in real time. However, analyzing the content produced by users is not a simple task, given the volume of posts. It is important to comprehend the expressions made by users to gain insights into politicians, public figures, and news. The state of the art lacks studies of how the topics discussed by social network users change over time. In this context, this work measures how topics discussed on Twitter vary over time. Formal Concept Analysis was used to measure this variation through the support and confidence metrics. Our solution was tested on tweets related to the Brazilian presidential election. The results confirm that it is possible to comprehend what Twitter users were discussing and how these topics changed over time. Our work is beneficial for politicians seeking to analyze the discussions about them among users. Our analysis of 3,634 tweets revealed several significant patterns, such as the association between political figures and topics like fake news and election fraud. These findings demonstrate how social media discussions evolve during key political events, providing insights that can assist political campaigns in real time.

1 INTRODUCTION

The Internet is no longer just a repository for documents to be shared; it is now a hybrid space for different media and applications that reach a large audience (Zhang et al., 2012). Some of these applications are social networks, which allow their users to generate a large amount of content that exemplifies their impressions and experiences. A specific social network that stands out for forcing its users to express themselves concisely is Twitter. On Twitter, users express themselves through tweets, which consist of text content with a maximum length of 280 characters.

Because a tweet is a short text, users can quickly report what they are experiencing at the moment of posting, unlike a journalist, for example, who needs to ensure the quality of a story before publishing it. Since Twitter users report their experiences without worrying about their writing or who will read their text, Twitter is probably the fastest means of disseminating information in the world (Cataldi et al., 2010).

https://orcid.org/0009-0005-2276-2334

https://orcid.org/0000-0002-0520-9976

https://orcid.org/0000-0002-1523-1616

https://orcid.org/0000-0001-5053-5490

With this large amount of information provided, it

is hard to extract knowledge from a group of tweets.

This task is relevant for politicians, for example, to

check the opinions that users are expressing about

them. Therefore, it would be relevant to develop a tool

that is capable of analyzing and extracting knowledge

from a group of tweets.

One way to address this challenge is to combine Natural Language Processing (NLP) and Formal Concept Analysis (FCA). The objective of our work is to use NLP to find recurrent groups of words in tweets and then analyze how these groups of words relate to each other. The relation between these terms is measured with FCA, using the support and confidence metrics. We also analyze how these terms change over time.

While social media platforms like Twitter offer

rich datasets for sentiment analysis, existing methods

often fall short in tracking topic evolution over time.

Traditional NLP techniques primarily focus on static

analysis, which limits their ability to capture the dynamic nature of online discussions. This work aims to address these gaps by employing Formal Concept Analysis (FCA) to observe how political topics evolve during critical events, such as elections.

Pereira, D., Neves, J., Brandão, W. and Song, M.

Temporal Analysis of Brazilian Presidential Election on Twitter Based on Formal Concept Analysis.

DOI: 10.5220/0012869300003825

In Proceedings of the 20th International Conference on Web Information Systems and Technologies (WEBIST 2024), pages 167-174

ISBN: 978-989-758-718-4; ISSN: 2184-3252


A case study was used to evaluate the approach. The case study consists of analyzing tweets that discuss the Brazilian presidential election, checking which terms are related to each candidate and how they evolve over time.

The remainder of the paper is organized as fol-

lows: the background is outlined in Section 2. The

Literature Review is described in Section 3. Section

4 presents the deﬁned Methodology. Results are dis-

cussed in Section 5. The conclusion and further re-

search are in Section 6.

2 BACKGROUND

2.1 Formal Concept Analysis

FCA is a technique based on formalizing the notion of concept and structuring concepts in a conceptual hierarchy. FCA relies on lattice theory to structure formal concepts and enable data analysis. The capability to hierarchize concepts extracted from data makes FCA an interesting tool for dependency analysis. With the growth of social networks and the large amount of data generated by their users, the study and improvement of techniques to extract knowledge are increasingly justified. FCA also permits data analysis through the associations and dependencies between the formally described attributes and objects of a dataset.

Formally, a formal context is formed by a triple

(G, M, I), where G is a set of objects (rows), M is a set

of attributes (columns) and I is deﬁned as the binary

relationship (incidence relation) between objects and

their attributes where I ⊆ G × M.

Table 1 exemplifies a formal context. In this example, objects correspond to tweets, attributes are the characteristics (terms), and the incidence relation represents whether or not the tweet has that characteristic. An X is present in the table if the tweet possesses the corresponding characteristic.

Table 1: Formal Context Example.

            Lula   Bolsonaro   Fake News   Elections
Tweet 1     X      -           -           -
Tweet 2     -      X           -           X
Tweet 3     X      X           -           X
Tweet 4     -      -           X           -
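As a minimal sketch, the formal context of Table 1 could be held in Python as a mapping from each tweet to its set of terms. The variable names are illustrative, and the incidence shown is the one implied by the concepts of Table 2 and the rules of Table 4.

```python
# Formal context (G, M, I) for the four example tweets.
# Each object (tweet) maps to the set of attributes (terms) it has.
context = {
    "Tweet 1": {"Lula"},
    "Tweet 2": {"Bolsonaro", "Elections"},
    "Tweet 3": {"Lula", "Bolsonaro", "Elections"},
    "Tweet 4": {"Fake News"},
}

objects = set(context)                        # G
attributes = set().union(*context.values())   # M
# Incidence relation I as a set of (object, attribute) pairs, I ⊆ G × M.
incidence = {(g, m) for g, terms in context.items() for m in terms}
```

Storing one set per object keeps the later derivation operators down to plain set intersections.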

2.2 Formal Concepts

Let (G, M, I) be a formal context, A ⊆ G a subset of objects and B ⊆ M a subset of attributes. A formal concept is a pair (A, B), where A is called the extent and B is called the intent. This pair must satisfy the conditions A = B′ and B = A′ (Ganter and Wille, 1999). The relation is defined by the derivation operator (′):

A′ = { m ∈ M | ∀ g ∈ A, (g, m) ∈ I }

B′ = { g ∈ G | ∀ m ∈ B, (g, m) ∈ I }

If A ⊆ G, then A′ is the set of attributes common to the objects of A. The derivation operator (′) can be reapplied to A′, resulting in a set of objects again (A′′). Intuitively, A′′ returns the set of all objects that have in common the attributes of A′; note that A ⊆ A′′. The operator is similarly defined for the attribute set. If B ⊆ M, then B′ returns the set of objects that have the attributes of B in common. Thus, B′′ returns the set of attributes common to all objects that have the attributes of B in common; consequently, B ⊆ B′′.

As an example, using Table 1, the object set A = {Tweet 2, Tweet 3}, when submitted to the operator described above, results in A′ = {Bolsonaro, Elections}. Applying the operator again gives A′′ = {Tweet 2, Tweet 3} = A, so ({Tweet 2, Tweet 3}, {Bolsonaro, Elections}) is a concept. All concepts found from Table 1 are displayed in Table 2.
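The derivation operators and the concept condition can be sketched in a few lines of Python. The function names `common_attributes`, `common_objects` and `is_concept` are hypothetical helpers, not part of any FCA library, and the context is the one implied by Tables 1 and 2.

```python
context = {
    "Tweet 1": {"Lula"},
    "Tweet 2": {"Bolsonaro", "Elections"},
    "Tweet 3": {"Lula", "Bolsonaro", "Elections"},
    "Tweet 4": {"Fake News"},
}

def common_attributes(context, objs):
    """A': attributes shared by every object in objs (all of M if objs is empty)."""
    result = set().union(*context.values())
    for g in objs:
        result &= context[g]
    return result

def common_objects(context, attrs):
    """B': objects that have every attribute in attrs."""
    return {g for g, terms in context.items() if set(attrs) <= terms}

def is_concept(context, objs, attrs):
    """(A, B) is a formal concept when A' = B and B' = A."""
    return (common_attributes(context, objs) == set(attrs)
            and common_objects(context, attrs) == set(objs))

A = {"Tweet 2", "Tweet 3"}
A_prime = common_attributes(context, A)             # shared terms of A
A_double_prime = common_objects(context, A_prime)   # closure of A
```

Here `A_double_prime` equals `A`, which is the condition that makes the pair a concept.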

Table 2: Existing concepts in the formal context of Table 1.

Objects                                 Attributes
{Tweet 1, Tweet 2, Tweet 3, Tweet 4}    {}
{Tweet 4}                               {Fake News}
{Tweet 1, Tweet 3}                      {Lula}
{Tweet 2, Tweet 3}                      {Bolsonaro, Elections}
{Tweet 3}                               {Lula, Bolsonaro, Elections}
{}                                      {Lula, Bolsonaro, Fake News, Elections}

In Table 2 there is a concept with an empty attribute set and a concept with an empty object set. They are called supremum and infimum, respectively.

2.3 Triadic Concept Analysis

Triadic Concept Analysis (TCA) was introduced by Lehmann and Wille (Lehmann and Wille, 1995) as an extension of FCA that adds a third dimension (Wille, 1995).

Formally, a triadic context is given by the quadruple (K₁, K₂, K₃, Y), where K₁, K₂ and K₃ are sets and Y is a ternary relation between them, i.e., Y ⊆ K₁ × K₂ × K₃. The elements of K₁, K₂, and K₃ are called (formal) objects, attributes, and conditions, respectively, and (g, m, b) ∈ Y is read: the object g has the attribute m under the condition b. An example of a triadic context is represented in Table 3. This example shows the dataset with 3 dimensions: Users, Months


and Terms. We have the Users/ID {1, 2, 3, 4} as objects, Terms {Lula, Bolsonaro, Fake News, Elections} as attributes and Months {July, August} as conditions.
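A triadic context can be sketched as a set of triples that is projected into one ordinary (dyadic) context per month. The triples below are hypothetical and are not the entries of Table 3; the function name `slice_by_month` is also an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical triples (user_id, term, month); NOT the entries of Table 3.
Y = {
    (1, "Lula", "July"),
    (1, "Elections", "August"),
    (2, "Bolsonaro", "July"),
    (2, "Fake News", "August"),
    (3, "Lula", "August"),
}

def slice_by_month(triples):
    """Project the ternary relation into one dyadic context per month."""
    contexts = defaultdict(lambda: defaultdict(set))
    for user, term, month in triples:
        contexts[month][user].add(term)
    # Convert nested defaultdicts into plain dicts for easier inspection.
    return {month: dict(users) for month, users in contexts.items()}

dyadic = slice_by_month(Y)
```

Each slice can then be analyzed with the dyadic operators of Section 2.2, which is one simple way to compare the same terms across time.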

Implications are dependencies between elements of a set obtained from a formal context. Given the context (G, M, I), implication rules have the form B → C, where B, C ⊆ M, and the rule holds if and only if B′ ⊆ C′ (Ganter et al., 2005). That is, an implication rule B → C is considered valid if and only if every object that has the attributes of B also has the attributes of C.

More generally, we can define rules of the form r : A → B (s, c), where A, B ⊆ M and A ∩ B = ∅. The support of a rule is defined by

s = supp(r) = |A′ ∩ B′| / |G|

and the confidence of a rule is defined by

c = conf(r) = |A′ ∩ B′| / |A′|

(Agrawal and Srikant, 1994).

Table 4 shows two existing rules in the context of Table 1. The rule Bolsonaro → Elections has 50% support because it holds in 2 tweets out of a total of 4. Its confidence is 100%, since whenever a tweet has Bolsonaro it also has Elections. When a rule has 100% confidence, such as the rule Bolsonaro → Elections, it is called an implication.
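The two metrics can be checked against the rules of Table 4 with a short Python sketch. The helper names `extent`, `support` and `confidence` are illustrative, and the context is the one implied by Tables 1 and 2.

```python
context = {
    "Tweet 1": {"Lula"},
    "Tweet 2": {"Bolsonaro", "Elections"},
    "Tweet 3": {"Lula", "Bolsonaro", "Elections"},
    "Tweet 4": {"Fake News"},
}

def extent(context, attrs):
    """Objects that have every attribute in attrs (the derivation attrs')."""
    return {g for g, terms in context.items() if set(attrs) <= terms}

def support(context, antecedent, consequent):
    """|A' ∩ B'| / |G|"""
    both = extent(context, antecedent) & extent(context, consequent)
    return len(both) / len(context)

def confidence(context, antecedent, consequent):
    """|A' ∩ B'| / |A'|, defined here as 0.0 when no object has the antecedent."""
    covered = extent(context, antecedent)
    if not covered:
        return 0.0
    return len(covered & extent(context, consequent)) / len(covered)

s1 = support(context, {"Bolsonaro"}, {"Elections"})
c1 = confidence(context, {"Bolsonaro"}, {"Elections"})
s2 = support(context, {"Lula"}, {"Bolsonaro", "Elections"})
c2 = confidence(context, {"Lula"}, {"Bolsonaro", "Elections"})
```

Returning 0.0 for an antecedent that never occurs is a convention chosen for this sketch; the source does not specify that case.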

2.4 Database Processing

Textual databases need to be pre-processed before be-

ing analyzed. The steps performed in this work are

the following: N-Gram, stop word removal, and Reg-

ular Expression.

• N-Gram: is a contiguous sequence of n items from

a given sample of text. The items can be letters or

words that are in sequence on a text sample;

• Stop Word Removal: consists of removing words

such as articles and prepositions, as these words

are not signiﬁcant for textual analysis;

• Regular Expression: a technique to determine a

pattern in a text sample. It is used to ﬁnd a group

of words that need to be replaced or deleted.

The steps described above were applied through the Python package Natural Language Toolkit (NLTK). The NLTK package provides a list of stop words, such as “the”, “a”, “an”, and “in”; the words in this list are removed from the database being preprocessed, as they are not meaningful to the analysis. As a result, the preprocessed database is smaller, which also reduces the analysis time (Contreras et al., 2018).
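The stop word removal and regular expression steps could look roughly like the sketch below. It uses only the standard library: the small `STOP_WORDS` set is a stand-in for NLTK's stop word list, and the two patterns (URLs and user mentions, then punctuation) are assumptions about what needs to be deleted, since the source does not list the expressions actually used.

```python
import re

# Stand-in for NLTK's stop word list (assumption: a tiny hand-written subset).
STOP_WORDS = {"the", "a", "an", "in", "of", "and", "to", "is"}

# Assumed cleaning patterns; the paper does not specify its expressions.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
NON_WORD = re.compile(r"[^\w\s]")

def clean_tweet(text):
    """Lowercase, strip URLs, mentions and punctuation, then drop stop words."""
    text = URL_OR_MENTION.sub(" ", text.lower())
    text = NON_WORD.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]

tokens = clean_tweet("The debate in Brazil: @user shares a link https://example.com!")
```

For Portuguese tweets the stop list would need to be the Portuguese one; the English words here only mirror the examples given in the text.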

An n-gram of size 1 is referred to as a unigram and does not consider other words that are in sequence. An n-gram of size 2 is a bigram, and one of size 3 is a trigram, meaning that a group of three words is in sequence in a text sample (Roark et al., 2007). Table 5 shows an example of bigrams and trigrams found in a text sample.
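Word n-grams can be produced with a sliding window over the token list. The sketch below is plain Python with a hypothetical helper `ngrams`, applied to the text sample of Table 5.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "Topics change over time".split()
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

A text with fewer than n tokens simply yields an empty list.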

3 LITERATURE REVIEW

Several works are relevant to the context of this study.

These include works related to topic detection in so-

cial networks, topic evolution, and the classiﬁcation

of textual content. These works will be described in

the following paragraphs.

Zhang et al. (Zhang et al., 2012) describe why detecting topics on the Internet is a challenge: the information produced there is succinct and does not adequately describe the real context being addressed. To address this limitation, the authors used pseudo-relevance feedback, a technique that consists of adding information to the data being analyzed.

With this strategy, the authors were able to enhance the quality of information available on the Internet, refining the context with which this information is associated. Consequently, they identified the trends within this information that are likely to become more prevalent on the Internet in the future. Our research also aims to detect topics within the content produced on Twitter. However, we did not apply pseudo-relevance feedback, because we analyzed tweets related to a single context: the Brazilian presidential election.

Cataldi et al. (Cataldi et al., 2010) used topic detection to identify emerging topics in the Twitter community. The authors characterized a topic as emerging if it occurs frequently in the present and was rare in the past. To strengthen this strategy, they analyzed the authors of these emerging topics with the PageRank algorithm, to ensure that an emerging topic is not present only in some bubble of the Twitter community. Finally, they created a graph that connects the emerging topic with other related topics, which therefore have a greater chance of becoming emerging topics as well. Unlike that work, our research uses topic detection to analyze how topics change over time.

Our results align with ﬁndings from previous stud-

ies, such as Cataldi et al. (Cataldi et al., 2010) on topic

detection in social media, but provide a unique con-

tribution by focusing on temporal evolution. In con-

trast to studies like Zhang et al. (Zhang et al., 2012),

which used pseudo-relevance feedback, our approach focused on the context of the Brazilian presidential election, providing more targeted insights.

Table 3: Triadic formal context sample.

July August

ID Lula Bolsonaro Fake News Elections Lula Bolsonaro Fake News Elections

1 X X X

2 X X X

3 X X

4 X X X X

Table 4: Example of rules with support and confidence.

Rule                            Support   Confidence
Bolsonaro → Elections           50%       100%
Lula → Bolsonaro, Elections     25%       50%

Dragoş et al. (SM. et al., 2017) present an approach that investigates the behavior of users of a learning platform using FCA. The log generated by the platform records the actions that each student performs, which allows student profiles to be identified.

Dragoş et al. use FCA to take into account the instant of time at which students perform each action. Profiling students in this way makes it possible to understand whether they perform actions late, early or on time. Therefore, FCA can be considered an alternative for studying temporal events.

Cigarrán et al. (Cigarrán et al., 2016) used FCA to group tweets according to the topics found. By using FCA, the work also obtains a concept lattice of the topics found, which gives a hierarchical view of the topics and distinguishes the approach from other techniques. The proposal was among the best results of the RepLab 2013 forum, demonstrating the effectiveness of FCA for the topic detection challenge.

Arca et al. (Arca et al., 2020) propose an approach

to suggest tags (meaningful human-friendly words)

for videos that consider hot-trend subjects, ensuring

the video receives more visibility by being related to

a trending subject. The original tags are inserted man-

ually, and these tags serve as input for the algorithm,

which matches them with hot-trend subjects. Our proposed method also identifies meaningful words; the difference is that our input consists of tweets and that we analyze how these words vary over time.

4 METHODOLOGY

This section presents our methodology to achieve the

proposed objectives. For this, the steps presented in

the sections below were performed.

4.1 Creation of the Dataset of Tweets

About the Presidential Election

To create the dataset, a Python script ran daily and collected the most relevant tweets of the day, using the filter provided by the Twitter API. This filter ensures that only tweets with significant reach within the social network are returned. The script collected 3,634 tweets during the period from July 23, 2022, to September 8, 2022. The extended collection period was a differentiator in achieving better results.

The dataset was created using the Twitter API

with ﬁlters applied to retrieve tweets in Portuguese,

speciﬁcally from Brazilian users. Only tweets con-

taining keywords Lula, Bolsonaro, and Elections

were considered. Additionally, the collection focused

on tweets with signiﬁcant reach, such as those with a

minimum of 100 retweets or likes.

The challenges encountered during this process

include limitations imposed by the Twitter API, re-

stricting the number of requests that can be made

by each user. Currently, these limitations are even

greater, as it is not possible to use the Twitter API for

free, which hinders the reproducibility of this work.

After collection, the database had to be preprocessed, transforming the textual content of tweets into a list of words to be analyzed. To perform this task, the techniques described in the Database Processing section (Section 2.4) were used. The final result is a database that includes a tweet identifier, the topic extracted from the tweet, and the timestamp indicating when the tweet was published.

4.2 Reducing Formal Context

After the preprocessing stage, it is necessary to reduce

the formal context to eliminate topics that are not sig-

niﬁcant because they are associated with only a few

tweets. Thus, in addition to removing attributes from

the formal context, this stage helps reduce the number

of concepts found, facilitating the ﬁnal analysis of the

work, which is the validation of the concepts found.

To select topics that are present in the largest pos-

sible number of tweets, a Python script was utilized.

This script counted how many tweets each term was

present in and sorted these topics, listing ﬁrst those

with the highest number of appearances. The topics with the highest number of appearances in tweets were then chosen to compose the formal context.

Table 5: Example of bigram and trigram.

Text sample               Bigrams                                       Trigrams
Topics change over time   {Topics change}, {change over}, {over time}   {Topics change over}, {change over time}
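The term selection described in this section can be sketched with `collections.Counter`. The function names `top_terms` and `reduce_context`, the cutoff `k`, and the sample data are all hypothetical; the source does not state how many topics were kept or how ties were handled.

```python
from collections import Counter

def top_terms(tweet_terms, k):
    """Rank terms by the number of tweets they appear in and keep the top k."""
    counts = Counter()
    for terms in tweet_terms.values():
        counts.update(set(terms))   # count each term at most once per tweet
    return [term for term, _ in counts.most_common(k)]

def reduce_context(tweet_terms, keep):
    """Drop every attribute that is not in the selected term list."""
    keep = set(keep)
    return {tweet_id: set(terms) & keep for tweet_id, terms in tweet_terms.items()}

# Hypothetical preprocessed tweets: tweet id -> extracted terms.
tweet_terms = {
    1: ["lula", "bolsonaro", "fake news"],
    2: ["lula", "bolsonaro"],
    3: ["lula"],
    4: ["first round"],
}
selected = top_terms(tweet_terms, 2)
reduced = reduce_context(tweet_terms, selected)
```

Tweets left with an empty attribute set after the reduction, such as tweet 4 here, could either be kept or discarded; that choice is not covered by the source.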

4.3 Lattice Miner

With the topics already defined, it is necessary to prepare the files that will be used as input for the Lattice Miner tool. This tool, which was developed at the University of Québec (Missaoui and Kwuida, 2011), is used for constructing, visualizing, and manipulating contexts. The tool reads files in JSON format to generate the formal context. To accomplish this, a Python script was created to generate these files in the format expected by the tool.

Once the formal context is constructed, it becomes

possible to extract implication rules for analysis, thus

producing the expected results. The implication rules

are provided in XML ﬁles and were analyzed manu-

ally, as few rules were generated due to the low sup-

port they showed. In practical terms, a search was

conducted for rules that exemplify events or senti-

ments of users towards a politician.

5 RESULTS

To analyze the presidential election, the tweets were collected between 07/23/2022 and 09/08/2022. The tweets were obtained through the Twitter API using the keywords Lula, Bolsonaro, and Elections. The Twitter API filter was applied to retrieve only popular tweets, which reduced the total number of tweets. However, these tweets generated significant engagement on the social media platform.

After applying NLP techniques and selecting meaningful N-Grams, the following attributes were obtained for the formal context: Alexandre de Moraes, good versus evil, elections, Bolsonaro, Lula, democracy, electoral research, Senator Rogério Carvalho, secret budget, Bolsonaro no Flow, armed forces, fake news, first round, Guilherme de Pádua, genocidal, former convict, Jornal Nacional, corruption, vote, president, interview, cash, purchased properties, nursing salary floor, Fachin, and suspends arms decree.

The condition of the formal context created is the

period in which the tweet was published. The period

used was 4 days, resulting in a total of 12 conditions.

As an example, Table 6 displays a sample of the gen-

erated formal context.
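Assigning each tweet to its 4-day condition is a small date computation. The sketch below assumes that the windows are aligned to the first collection day, which matches the period boundaries shown in Tables 6 and 7; the names `START`, `END`, `WINDOW_DAYS` and `window_index` are illustrative.

```python
from datetime import date

START = date(2022, 7, 23)   # first collection day
END = date(2022, 9, 8)      # last collection day
WINDOW_DAYS = 4

def window_index(day):
    """Zero-based index of the 4-day window that contains the given day."""
    if not START <= day <= END:
        raise ValueError("date outside the collection period")
    return (day - START).days // WINDOW_DAYS

# 48 collection days split into 4-day windows.
n_windows = ((END - START).days + 1) // WINDOW_DAYS
```

With these settings the collection period spans 48 days, so the division yields the 12 conditions mentioned above.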

With the obtained formal context, it was possi-

ble to generate implication rules and analyze how the

generated rules reﬂect the events of the presidential

election in Brazil. Table 7 provides details on the gen-

erated rules.
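How a rule's metrics could be tracked per period is sketched below: tweets are grouped by window, and support and confidence are computed inside each group. This is an illustration with hypothetical records and a hypothetical function name, restricted to single-term rules; it is not the Lattice Miner procedure used in the study.

```python
from collections import defaultdict

# Hypothetical records (tweet_id, term, window_index); not the paper's dataset.
records = [
    (1, "Bolsonaro", 0), (1, "Fake news", 0),
    (2, "Bolsonaro", 0),
    (3, "Elections", 0),
    (4, "Bolsonaro", 1), (4, "Fake news", 1),
    (5, "Lula", 1),
]

def rule_metrics_by_window(records, antecedent, consequent):
    """Return {window: (support, confidence)} for antecedent -> consequent."""
    windows = defaultdict(lambda: defaultdict(set))
    for tweet_id, term, window in records:
        windows[window][tweet_id].add(term)
    metrics = {}
    for window, tweets in sorted(windows.items()):
        with_antecedent = [t for t in tweets.values() if antecedent in t]
        with_both = [t for t in with_antecedent if consequent in t]
        supp = len(with_both) / len(tweets)
        conf = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
        metrics[window] = (supp, conf)
    return metrics

metrics = rule_metrics_by_window(records, "Bolsonaro", "Fake news")
```

Comparing the tuples of consecutive windows gives the kind of temporal variation reported in Table 7.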

The ﬁrst implication rule to be discussed in this

study is composed of the terms “Elections” and “Fake

news”. This rule was observed during a crucial pe-

riod that coincided with the announcement of poten-

tial penalties for candidates spreading fake news dur-

ing the electoral period, between 08/08 and 08/15.

During this time, there was also a signiﬁcant increase

in the scrutiny of the effectiveness of social media

in detecting and removing fake news, aiming to pre-

vent users from being misinformed. It is believed that

these events were reﬂected in Twitter discussions, jus-

tifying their identiﬁcation in our study.

Furthermore, it is important to highlight that the

spread of fake news can have serious consequences

for democracy, such as interference in the electoral

process and manipulation of public opinion. For this

reason, the ﬁght against fake news has become a

global concern, and the analysis of such data can con-

tribute to the understanding and combat of this grow-

ing phenomenon.

Analyzing the conﬁdence metric among rules re-

sulting in fake news with the antecedents “Elections”

and “Bolsonaro”, it is possible to observe an interest-

ing trend. It is noted that the rule with the antecedent

“Bolsonaro” has higher conﬁdence compared to the

rule with the antecedent “Elections”. This difference

in conﬁdence suggests that the name of the presiden-

tial candidate, Jair Bolsonaro, was more strongly as-

sociated with fake news than the electoral context it-

self. This highlights that social media users had a dis-

tinctive perception that candidate Bolsonaro was in-

volved in the dissemination of misleading informa-

tion. The association between the name of a political

candidate and the spread of fake news may have inﬂu-

enced public perception and sparked heated debates

and discussions on social media during the election

period.

Table 6: Formal context sample.

07-23-2022 - 07-26-2022 07-27-2022 - 07-30-2022

ID Lula Bolsonaro Genocidal Former convict Lula Bolsonaro Genocidal Former convict

1 X X

2 X X

3 X X

4 X X

Table 7: Implication rules that varied over time.

Time Period                Antecedent   Consequent                   Support   Confidence
08-08-2022 to 08-11-2022   Elections    Fake news                    4%        13%
08-12-2022 to 08-15-2022   Elections    Fake news                    2%        16%
08-08-2022 to 08-11-2022   Lula         First round                  2%        16%
08-16-2022 to 08-19-2022   Lula         First round                  8%        26%
08-16-2022 to 08-19-2022   Bolsonaro    Fake news                    4%        33%
09-01-2022 to 09-04-2022   Bolsonaro    Fake news                    3%        22%
08-20-2022 to 08-23-2022   Lula         Former convict               4%        25%
08-28-2022 to 08-31-2022   Lula         Former convict               2%        14%
08-24-2022 to 08-27-2022   Lula         Interview                    2%        25%
08-28-2022 to 08-31-2022   Lula         Interview                    2%        7%
08-28-2022 to 08-31-2022   Bolsonaro    Cash, Purchased properties   2%        10%
09-01-2022 to 09-04-2022   Bolsonaro    Cash, Purchased properties   3%        22%

In Table 7, the second identified implication rule is composed of the terms “Lula” and “First round”.

Temporal analysis reveals that this rule occurred in

two distinct phases: the ﬁrst between 08/08 and 08/11

and the second between 08/16 and 08/19. At the time,

the possibility of Lula winning the election in the ﬁrst

round was a recurring topic in the media, as electoral

polls indicated that he was close to reaching 50% of

valid votes.

It is interesting to note how the support for this rule increased significantly, going from 2% to 8%, reflecting the growing support for Lula's candidacy and the public's interest in this electoral scenario. This increase in support coincides with the evolution of electoral polls, which showed the increasing preference of voters for the former president. On August 18th, a poll indicated that Lula reached 51% of valid votes, further fueling the discussion about the possibility of his victory in the first round.

Therefore, it is possible to observe how the sup-

port metric can be useful for evaluating the evolu-

tion of discussions on the social network, and tracking

the rise of topics that become increasingly relevant to

users. This analysis is crucial for understanding the

impact of public opinion on elections and the con-

struction of candidates’ images.
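The variation of each rule between its two observed periods can be tabulated directly from the values of Table 7. The sketch below copies those values; the function name `metric_changes` is hypothetical, and differences are expressed in percentage points.

```python
from collections import defaultdict

# (period, antecedent, consequent, support %, confidence %), copied from Table 7.
rules = [
    ("08-08 to 08-11", "Elections", "Fake news", 4, 13),
    ("08-12 to 08-15", "Elections", "Fake news", 2, 16),
    ("08-08 to 08-11", "Lula", "First round", 2, 16),
    ("08-16 to 08-19", "Lula", "First round", 8, 26),
    ("08-16 to 08-19", "Bolsonaro", "Fake news", 4, 33),
    ("09-01 to 09-04", "Bolsonaro", "Fake news", 3, 22),
    ("08-20 to 08-23", "Lula", "Former convict", 4, 25),
    ("08-28 to 08-31", "Lula", "Former convict", 2, 14),
    ("08-24 to 08-27", "Lula", "Interview", 2, 25),
    ("08-28 to 08-31", "Lula", "Interview", 2, 7),
    ("08-28 to 08-31", "Bolsonaro", "Cash, Purchased properties", 2, 10),
    ("09-01 to 09-04", "Bolsonaro", "Cash, Purchased properties", 3, 22),
]

def metric_changes(rows):
    """Return {(antecedent, consequent): (support delta, confidence delta)}."""
    by_rule = defaultdict(list)
    for period, antecedent, consequent, supp, conf in rows:
        by_rule[(antecedent, consequent)].append((supp, conf))
    # Rows are listed chronologically per rule, so compare first and last.
    return {
        rule: (obs[-1][0] - obs[0][0], obs[-1][1] - obs[0][1])
        for rule, obs in by_rule.items()
    }

changes = metric_changes(rules)
```

Positive deltas mark topics that gained relevance between the two periods, and negative deltas mark topics that faded.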

The third identiﬁed implication rule in Table 7 is

composed of the terms “Bolsonaro” and “Fake news”.

It is interesting to note that this rule appeared in dis-

persed periods, ﬁrst between 08/16 and 08/19 and

then between 09/01 and 09/04. The ﬂuctuation in

the occurrence frequency of this rule can be explained

by two relevant events related to Bolsonaro and fake

news.

The ﬁrst event occurred in mid-August when news

broke that a group of businessmen was allegedly or-

chestrating a coup in favor of President Bolsonaro.

This event generated signiﬁcant media and social me-

dia attention, with Bolsonaro dismissing the news as

fake. This could have inﬂuenced the occurrence of

this rule in the ﬁrst analyzed phase.

The second event that may have driven the oc-

currence of this rule took place in early September

when the Superior Electoral Court (TSE) ﬁned Presi-

dent Bolsonaro for spreading fake news linking Lula

to the First Capital Command (PCC). This false news

circulated widely on social media, sparking debates

and discussions about the spread of misinformation

and its impact on elections. This event could have

contributed to the occurrence of the rule in the second

analyzed phase.

This implication rule illustrates how a particular

theme can emerge on the social network, be discussed

for a period, and then temporarily disappear, only to

resurface at another time. This highlights the volatil-

ity of social media and the dynamic nature of dis-

cussions that take place within them. Understanding

these dynamics is crucial for the analysis of public


opinion and electoral campaign strategies.

The fourth identiﬁed implication rule in Table 7

is particularly interesting as it reveals the strong po-

litical polarization in Brazil. This rule is composed

of the terms “Lula” and “Former convict” and was

observed shortly after a televised debate among presi-

dential candidates in which Jair Bolsonaro referred to

candidate Lula as a “former convict”.

This act sparked a heated discussion on social me-

dia about whether Lula could be called innocent af-

ter events that discredited Operation Car Wash, the

largest corruption investigation in the country’s his-

tory. While Bolsonaro’s supporters applauded the

provocation, Lula’s supporters reacted with indigna-

tion, accusing the current president of attempting to

tarnish the image of the former president and leader

of the Workers’ Party.

The implication rule “Lula” and “Former convict”

illustrates how social media can be used as tools for

the propagation of political discourse and how polit-

ical disagreements can lead to polarized and heated

discussions on the internet.

It is interesting to note that the fourth identified implication rule in Table 7 showed a confidence of 25% between August 20 and 23, one of the highest confidence values found in the analysis. This highlights the strong association between the name of the candidate Lula and the term “former convict”, suggesting that Jair Bolsonaro's strategy was effective in linking Lula to his past as a convicted individual in the eyes of the law.

This result indicates how political rhetoric can in-

ﬂuence discussions on social media and how polariza-

tion can lead to the widespread dissemination of bi-

ased information. Furthermore, this implication rule

emphasizes the importance of sentiment and opinion

analysis on social media to understand how political

rhetoric can inﬂuence public opinion and shape politi-

cal discourse around certain topics and public ﬁgures.

The ﬁfth identiﬁed implication rule in Table 7 is

composed of the terms “Lula” and “Interview”. The

latter term refers to the interview conducted by Jor-

nal Nacional of TV Globo with all presidential can-

didates. It is interesting to note that this interview

took place on August 25, and yet the subject contin-

ued to be discussed on Twitter until August 31. This

fact demonstrates the relevance of the interview for

platform users, who engaged in discussions about the

candidates’ performance in that event.

In the speciﬁc case of the term “Lula”, it can be

inferred that the candidate’s participation in the inter-

view generated even greater interest among users who

discussed his performance and the ideas presented.

This highlights the importance of traditional media

in shaping public opinion and how social media plat-

forms can amplify and prolong the impact of these

events in society. Additionally, the persistence of the

discussion about the interview on Twitter also demon-

strates how social media can be used to monitor and

analyze public engagement regarding signiﬁcant po-

litical events.

It is notable how the conﬁdence metric of this spe-

ciﬁc rule signiﬁcantly decreased in the second ana-

lyzed period. This drop in conﬁdence indicates that,

after a few days from the mentioned interview, new

terms and topics related to candidate Lula started to

gain relevance in the discussion. This highlights the

volatility of the social network, where the discussed

themes and topics can change rapidly within a short

period.

This agile dynamic of the social network reﬂects

the ephemeral nature of conversations and the rapid

dissemination of information. In a matter of days,

other events, statements, or more recent occurrences

can capture the users’ attention and shift the focus to

more recent topics. This observation reinforces the

importance of continuously monitoring discussions

on social media to gain an updated understanding of

the landscape of opinions and topics of interest.

The sixth identiﬁed implication rule in Table 7 is

very relevant for understanding the Brazilian politi-

cal context. It is composed of the terms “Bolsonaro”,

“Cash”, and “Purchased properties” and is related to news articles published by the Brazilian media, which highlighted that many of the properties purchased by the Bolsonaro family were acquired in cash.

This fact raised questions from Jair Bolsonaro’s

opponents, who sought to understand the reason be-

hind the cash purchases and the origin of this money.

The implication of these terms shows how issues re-

lated to ethics in politics are relevant to Twitter users,

who aim to discuss and comprehend the implications

of such actions by politicians.

It is noticeable how the analysis of topics dis-

cussed on Twitter can provide valuable information

for presidential campaigns. By understanding which

topics are generating more interest and discussion

among social media users, campaigns can adjust their

communication strategies and act more effectively to

win over the electorate. Additionally, the analysis

can also help campaigns identify weaknesses in their

strategies and improve them.

In summary, the analysis of topics on Twitter can

be an important tool for electoral campaigns to con-

nect with the electorate and better understand their

concerns and interests.


6 CONCLUSIONS

This paper presents an approach to analyzing topics

discussed on the Twitter social network. To identify

these topics, the Twitter API is used to extract

tweets, and NLP is applied to find topics within the

tweets. Finally, FCA is used to generate implication

rules between these topics, providing metrics

such as support and conﬁdence.
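
As an illustration of the two metrics, the sketch below computes support and confidence for single-topic rules over a small formal context in which tweets are objects and topics are attributes. It is a simplified pairwise count under the usual definitions, not the FCA rule-extraction procedure used in this work. The function `mine_pair_rules`, its thresholds, and the toy context are assumptions made for the example.

```python
from itertools import permutations


def mine_pair_rules(context, min_support=0.3, min_confidence=0.6):
    """Return rules (a, b, support, confidence) meaning {a} -> {b}.

    support    = tweets containing both a and b / all tweets
    confidence = tweets containing both a and b / tweets containing a
    """
    n = len(context)
    topics = sorted(set().union(*context))
    rules = []
    for a, b in permutations(topics, 2):
        n_a = sum(1 for tweet in context if a in tweet)
        n_ab = sum(1 for tweet in context if a in tweet and b in tweet)
        support = n_ab / n
        conf = n_ab / n_a  # n_a >= 1 because a was drawn from the context
        if support >= min_support and conf >= min_confidence:
            rules.append((a, b, support, conf))
    return rules


# Hypothetical formal context: one set of topics per tweet.
context = [
    {"bolsonaro", "fake news"},
    {"bolsonaro", "fake news"},
    {"bolsonaro", "cash"},
    {"lula", "interview"},
    {"lula"},
]

rules = mine_pair_rules(context)
for a, b, support, conf in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={conf:.2f}")
```

With the assumed thresholds, only the rules linking “bolsonaro” and “fake news” survive: both directions have support 0.40, while the confidence is about 0.67 in one direction and 1.00 in the other.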

With the generated implication rules and metrics,

it was possible to compare how Twitter users were

affected by events occurring at the time, such as the

Brazilian presidential election.

The Brazilian presidential election was evaluated

with a focus on the two most relevant candidates, Lula

and Bolsonaro. The analysis covered a month and a

half, long enough to observe topics fading from the

discussion and later resurfacing among users of the

social network, as was the case with the topics

“Bolsonaro” and “Fake news”.

The results demonstrate that it is possible to as-

sess how users are reacting to events related to the

election, information that is relevant for presidential

campaigns. Campaigns need metrics to evaluate the

impact of their actions.

As future work, we plan to automate the entire

process so that data can be extracted from the social

network and analyzed automatically, generating

metrics on the topics while they are still being

discussed. We could also extend this analysis by

incorporating sentiment analysis to better understand

the emotional tone of discussions. While our current

study focuses on topic evolution, adding a sentiment

dimension could reveal more nuanced insights into

public opinion. Future work could also focus on

automating the rule analysis process using machine

learning techniques, which would allow for a more

extensive and unbiased evaluation of the generated

rules and improve the overall efficiency of the method.

ACKNOWLEDGEMENTS

The authors thank the Pontifícia Universidade

Católica de Minas Gerais – PUC-Minas and

Coordenação de Aperfeiçoamento de Pessoal de

Nível Superior – CAPES (CAPES – Grant PROAP

88887.842889/2023-00 – PUC/MG, Grant PDPG

88887.708960/2022-00 – PUC/MG - Informática,

and Finance Code 001). The present work was also

carried out with the support of Fundação de Amparo

à Pesquisa do Estado de Minas Gerais (FAPEMIG)

under grant number APQ-01929-22.

REFERENCES

Agrawal, R. and Srikant, R. (1994). Fast algorithms for

mining association rules in large databases. In Pro-

ceedings of the 20th International Conference on Very

Large Data Bases, VLDB ’94, pages 487–499, San

Francisco, CA, USA. Morgan Kaufmann Publishers

Inc.

Arca, A., Carta, S., Giuliani, A., Stanciu, M., and Refor-

giato Recupero, D. (2020). Automated tag enrichment

by semantically related trends. In Automated Tag En-

richment by Semantically Related Trends, pages 183–

193.

Cataldi, M., Di Caro, L., and Schifanella, C. (2010). Emerg-

ing topic detection on twitter based on temporal and

social terms evaluation. In Proceedings of the Tenth

International Workshop on Multimedia Data Mining,

MDMKDD ’10, New York, NY, USA. Association for

Computing Machinery.

Cigarrán, J., Castellanos, Á., and García-Serrano, A.

(2016). A step forward for topic detection in twitter:

An fca-based approach. Expert Systems with

Applications, 57:21–36.

Contreras, J. O., Hilles, S., and Abubakar, Z. B. (2018).

Automated essay scoring with ontology based on text

mining and nltk tools. In 2018 International Confer-

ence on Smart Computing and Electronic Enterprise

(ICSCEE), pages 1–6.

Ganter, B., Stumme, G., and Wille, R., editors (2005). For-

mal Concept Analysis: Foundations and Applications.

Springer.

Ganter, B. and Wille, R. (1999). Formal concept analy-

sis: mathematical foundations. Springer, Berlin; New

York.

Lehmann, F. and Wille, R. (1995). A triadic approach to

formal concept analysis. Conceptual structures: ap-

plications, implementation and theory, pages 32–43.

Missaoui, R. and Kwuida, L. (2011). Mining triadic asso-

ciation rules from ternary relations. Formal Concept

Analysis, pages 204–218.

Roark, B., Saraclar, M., and Collins, M. (2007).

Discriminative n-gram language modeling. Computer

Speech & Language, 21(2):373–392.

Dragoş, S. M., Săcărea, C., and Şotropa, D. F. (2017). An

investigation of user behavior in educational platforms

using temporal concept analysis. In ICFCA 2017.

Wille, R. (1995). The basic theorem of triadic concept anal-

ysis. Order, pages 149–158.

Zhang, J., Liu, D., Ong, K.-L., Li, Z., and Li, M. (2012).

Detecting topic labels for tweets by matching features

from pseudo-relevance feedback. In Proceedings of

the Tenth Australasian Data Mining Conference - Vol-

ume 134, AusDM ’12, page 9–19, AUS. Australian

Computer Society, Inc.

WEBIST 2024 - 20th International Conference on Web Information Systems and Technologies
