Stance Detection in Twitter Conversations Using Reply Support Classification

Parul Khandelwal (1,a), Preety Singh (1,b), Rajbir Kaur (1,c) and Roshni Chakraborty (2,d)
1 The LNM Institute of Information Technology, Jaipur, India
2 ABV IIITM, Gwalior, India
{parul.khandelwal.y20pg, preety, rajbirkaur}@lnmiit.ac.in, rcrimi08@gmail.com
a https://orcid.org/0000-0003-4963-156X
b https://orcid.org/0000-0002-6885-1376
c https://orcid.org/0000-0003-1433-1065
d https://orcid.org/0000-0003-4476-403X
Keywords:
Stance, Twitter, Reply Support Label, LSTM, BERT.
Abstract:
During a crisis, social media platforms like Twitter play a crucial role in disseminating information and offering
emotional support. Understanding the conversations among people is essential for evaluating the overall impact
of the crisis on the public. In this paper, we focus on classifying the replies to tweets during the “Fall of
Kabul” event into three classes: supporting, unbiased, and opposing. To achieve this goal, we propose two
frameworks: one uses LSTM layers for sentence-level and word-level feature extraction, and the other is a
BERT-based approach in which the texts of the tweet and the reply are concatenated. Our evaluation
on real-world crisis data showed that the BERT-based architecture outperformed the LSTM models, producing
an F1-score of 0.726 for the opposing class, 0.738 for the unbiased class, and 0.729 for the supportive
class. These results highlight the robustness of contextualized embeddings in accurately identifying the stance
of replies within Twitter conversations through tweet-reply pairs.
1 INTRODUCTION
Social media platforms, especially Twitter, can be a
helpful tool for disseminating information and sharing
emotions during a crisis. Tweets often provide
a direct line of communication between the
affected population and concerned authorities (Bukar
et al., 2022). Considerable research has been done
on stance classification in events. Stance has generally
been taken to mean public opinion, mainly in
reference to government policies, social movements,
pandemics, etc. (ALDayel and Magdy, 2021). In
most research, attention has been focused on determining
whether a tweet supports, opposes, or remains
neutral toward a certain topic or entity (Küçük
and Can, 2020). Stance detection has been done for
political discourses (Lai et al., 2019), health misinformation
(Ng and Carley, 2022), rumor identification
(Zheng et al., 2022; Haouari and Elsayed, 2024), crisis
scenarios (Zeng et al., 2016), and other social issues.
However, less attention has been paid to the stance
expressed in individual conversation threads. Classifying
tweet-reply interactions offers a more complete
and nuanced understanding of online conversations.
A reply may build on, challenge, or add context
to the original tweet (Zubiaga et al., 2016), which
makes it indispensable for judging public opinion.
Tweets provide the initial viewpoint or information,
while analyzing the replies captures the full spectrum
of the discourse. Stance classification in conversation
threads can fill this research gap and complement the
study of dialogue dynamics, especially during crisis
management. Categorizing the stance of replies can
reveal whether important information is being amplified
or contested, enabling authorities to evaluate
how people perceive updates, advisories, or policies.
With this additional knowledge, emergency responders
can adapt their communication strategies. It also
helps in understanding the emotional state of the affected
population and in gauging the level of solidarity
and aid being shared (Hardalov et al., 2021).
Reactions to tweets can take the form of support, opposition,
or additional context and can thereby significantly
affect the dynamics of virtual discussions.
The classification of responses can shed more
light on the actual impact of the seed tweets during
crisis situations. This study focuses on automatically
classifying tweet-reply pairs to better comprehend
public engagement and sentiment in conversation
threads. Our contributions are summarized below:
Conversation Stance Classification: We pro-
pose classifying tweet-reply conversations, focus-
ing on both their texts, to assess the support flow
of online social media conversations during a cri-
sis.
Experimentation on Multiple Frameworks:
We present two distinct classification frameworks,
utilizing LSTM-based models (Hochreiter, 1997)
and a BERT-based architecture (Devlin, 2018).
Each framework handles tweet-reply embeddings
and sequence processing distinctly.
Explainable AI: We apply the concept of ex-
plainable AI to interpret the predictions of our
stance classification model.
The paper is organized as follows: Section 2 dis-
cusses existing research on tweet classification and
reply analysis in crisis. Section 3 outlines the method-
ology, including the dataset and description of the
frameworks. Section 4 provides details on the experi-
mental setup. Section 5 discusses the results and anal-
ysis, comparing the performance of each framework.
Section 6 concludes the paper.
2 RELATED WORK
Stance classification has evolved through various
methodologies to analyze public sentiment and dis-
course on social media platforms. (Mohammad et al.,
2016) introduced stance detection by identifying sup-
port, opposition, or neutrality toward specific targets.
(Zubiaga et al., 2016) extended this work to classify
stance in conversational threads using tree structures,
categorizing replies as supporting, denying, querying,
or commenting. (Mutlu et al., 2020) explored public
perceptions using TF-IDF and CNN-based methods,
while (Villa-Cox et al., 2020) focused on replies and
quotes in controversial Twitter conversations. Simi-
larly, (Hamad et al., 2022) developed the StEduCov
dataset to classify online education stances during the
COVID-19 pandemic.
Recent advancements include multi-task frame-
works like (Abulaish et al., 2023), which employed
Graph Attention Networks (GAT) for jointly predict-
ing stance and rumor veracity, and (Bai et al., 2023),
who proposed the Multi-Task Attention Tree Neural
Network (MATNN) for hierarchical classification of
stance and veracity. (Zhang et al., 2024) introduced
DoubleH, leveraging user-tweet bipartite graphs for
stance detection related to the 2020 US presiden-
tial election. While these studies showcase diverse
methodologies, they primarily focus on individual
tweets or structured datasets, overlooking the stance
of replies toward parent tweets in tweet-reply conver-
sations.
3 METHODOLOGY
In this section, we present our methodology to cat-
egorize the support of replies to the parent tweet in
a conversation thread. A high level of support indi-
cates effective communication of the messages. Op-
posing or unbiased replies might suggest misinforma-
tion or confusion among the affected population. Such
insights can empower crisis managers and
policymakers to make appropriate, real-time adjustments
to messaging strategies.
Problem Statement: Consider a dataset $D$ consisting
of $n$ tweet-reply pairs $\{(T_i, R_i)\}_{i=1}^{n}$, where $T_i$ represents
the text of the $i$-th tweet and $R_i$ represents the
corresponding reply text. Each tweet $T_i$ has a tweet
label assigned to it. Tweets providing factual data related
to the crisis are labelled as information. Those
reflecting personal reactions and appeals for support
are labelled as emotion. Tweets stating personal opinions
are termed neutral. Classifying tweets offers
insights into the nature of the discourse being carried
on during the crisis. A tweet can have multiple
replies, forming many tweet-reply pairs. The goal is
to classify the reply support label into one of the following
three categories:
Supporting: The reply demonstrates alignment
with or endorses the parent tweet.
Unbiased: The reply neither supports nor opposes
the parent tweet.
Opposing: The reply expresses opposition to the
parent tweet.
Consider the following example:
Tweet Example (tweet label: information): The
US plans to completely pull all personnel from
its embassy in Kabul over the next 72 hours as
Taliban forces close in on Afghanistan’s capital
https://cnn.it/3CRwPXG
Supporting Reply: Bring them home @POTUS
@JoeBiden !!! Thank you !! You are doing good.
Unbiased Reply: #BidensSaigon
Opposing Reply: Sickens me beyond comprehension.
Biden is not only a liar and self serving
crook, he's a coward. And where is he and
where has he been?? He had all the answers the
last 4 years and now he disappears and is literally
putting the US in harms way.
There were 123 replies on this tweet at the time of
data collection. Of these, 48 were opposing, 17 were
supporting, and 58 were unbiased. Such an analysis
across different conversation threads can provide an
indication of how people are responding to the developments
during the crisis.
Formally, our problem becomes learning a mapping
function $f : D \rightarrow y_i \in \{-1, 0, 1\}$ for the classification
of stances in tweet-reply pairs:

$$
f(T_i, R_i) =
\begin{cases}
+1, & \text{if } R_i \text{ supports } T_i \\
0, & \text{if } R_i \text{ is neutral towards } T_i \\
-1, & \text{if } R_i \text{ opposes } T_i
\end{cases}
$$
The objective is to minimize the classification error
by using the sparse categorical cross-entropy loss
function, computed as:

$$
\mathrm{Loss}(f) = -\frac{1}{n} \sum_{i=1}^{n} \log\bigl(p_{\mathrm{true}}(f(T_i, R_i))\bigr)
$$

where $p_{\mathrm{true}}$ represents the predicted probability
assigned to the true reply support label $y_i$ of the
tweet-reply pair.
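For illustration, a minimal numeric sketch of this loss is given below, assuming the labels $\{-1, 0, +1\}$ are shifted to class indices $\{0, 1, 2\}$ as sparse categorical cross-entropy requires; the probability values are made up purely for the example.

```python
import numpy as np
import tensorflow as tf

# Reply support labels in {-1, 0, +1} mapped to class indices {0, 1, 2},
# since sparse categorical cross-entropy expects non-negative integer classes.
labels = np.array([-1, 0, 1])
class_idx = labels + 1

# Hypothetical predicted class probabilities for three tweet-reply pairs (rows sum to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.2, 0.7]])

loss = tf.keras.losses.SparseCategoricalCrossentropy()(class_idx, probs)
print(float(loss))  # mean of -log(0.7), -log(0.6), -log(0.7), roughly 0.408
```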
3.1 Data Collection, Annotation and
Preprocessing
For the dataset, we focused on a significant geopolit-
ical event of the Fall of Kabul in the year 2021. The
tweets in this dataset are labelled with tweet labels as
Information (I), Emotion (E), Neutral (N), and Irrel-
evant (X). We curated a selection of labelled tweets
from this dataset with at least one reply, ensuring en-
gagement within the conversation. We did not con-
sider the Irrelevant tweets. We extracted only the En-
glish tweets along with their corresponding English
replies. This resulted in a final dataset comprising
495 tweets along with their replies. As a tweet can
have multiple replies, the final dataset contains
2864 tweet-reply pairs. The resulting dataset has 200
emotion class tweets with 760 replies, 163 informa-
tion class tweets with 1180 replies, and 132 neutral
class tweets with 924 replies.
For each tweet-reply pair, annotations were done
by providing a reply support label as supporting,
unbiased, or opposing. The annotation process was
meticulously carried out by two independent human
annotators. To maintain inter-annotator reliability, the
annotated datasets were cross-checked by the annota-
tors for consistency and discrepancies were resolved
through mutual discussion. The labelling resulted in
943 supporting instances, 1101 unbiased instances,
and 820 opposing instances, as shown in Figure 1.

Figure 1: Distribution of supporting, unbiased and opposing replies across the informative, emotional and neutral tweets.
To prepare the data for analysis, we focused exclu-
sively on the textual content of the tweets and replies.
All emojis were converted into their corresponding
textual meanings and hashtags were split into their
component words. The dataset will be made publicly
available after publication of the paper.
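As an illustration of this preprocessing step, the following is a minimal sketch assuming the third-party `emoji` package for converting emojis to text; the camel-case heuristic for hashtag splitting is an assumption, since the exact splitting method is not prescribed here.

```python
import re
import emoji  # third-party package, assumed available: pip install emoji


def split_hashtag(tag: str) -> str:
    """Split a hashtag into words using a simple camel-case heuristic."""
    body = tag.lstrip("#")
    # Insert spaces at lower->upper case boundaries, e.g. "FallOfKabul" -> "Fall Of Kabul".
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", body)


def preprocess(text: str) -> str:
    """Convert emojis to their textual meanings and expand hashtags into words."""
    # Emojis become their textual names, e.g. the folded-hands emoji -> "folded hands".
    text = emoji.demojize(text, delimiters=(" ", " ")).replace("_", " ")
    # Expand hashtags into their component words.
    text = re.sub(r"#\w+", lambda m: split_hashtag(m.group(0)), text)
    return re.sub(r"\s+", " ", text).strip()


print(preprocess("Praying for Kabul 🙏 #FallOfKabul"))
# -> "Praying for Kabul folded hands Fall Of Kabul"
```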
3.2 Framework for Classification of
Reply Support Labels
For the classification of the reply support label, we
explored two frameworks, each designed to optimize
the accuracy of reply support classification.
3.2.1 Sentence/Word-Level LSTM Features for
Classification
In this framework, we utilized LSTM modules for
feature extraction, with two options:
sentence-level and word-level features. For both approaches,
we generated embeddings for the tweet
and the reply using the GloVe technique, which were
passed through two identical LSTM models as shown
in Figure 2a. Each model processed the entire text
sequence as a whole.
For the sentence-level approach, the LSTM models
produced two distinct feature vectors, $z_t$ and $z_r$, corresponding
to the tweet and the reply. The vectors
were concatenated to form a combined vector, $z_{tr}$, and
sent as input to a network of dense layers for the final
classification. This concatenation step allows the
final classification model to consider both the tweet
and the reply simultaneously.
For the word-level feature extraction approach, the
embeddings were passed through separate LSTM
models, which generated feature matrices rather than
single vectors (refer Figure 2a). The resulting feature
matrices, $Z_t$ and $Z_r$, correspond to the tweet and the
reply respectively. $Z_t$ and $Z_r$ were concatenated to
form a combined matrix $Z_{tr}$, which was flattened and
sent as input through dense layers to classify the
stance of the reply.

Figure 2: Frameworks for classification of reply support labels. (a) LSTM-based classification: sentence-level and word-level feature extraction from GloVe embeddings of the tweet and reply. (b) BERT-based classification: feature extraction from the tweet and reply texts using the BERT architecture.
3.2.2 Attention-Based BERT for Classification
In this framework, we leverage the state-of-the-art
BERT (Devlin, 2018) model to perform both em-
bedding generation and classification. This approach
concatenates the tweet and reply text into a single sequence,
generating embeddings and features in a
unified context. The BERT model outputs a single
embedding vector of size 768, which is then fed directly
into a classification layer to predict the reply support
label.
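As a sketch of how such a concatenation can be realized, the snippet below uses the Hugging Face tokenizer's sentence-pair encoding; the `bert-base-uncased` checkpoint and the maximum length are assumptions, as the paper does not specify them.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint

tweet = "The US plans to completely pull all personnel from its embassy in Kabul ..."
reply = "Bring them home @POTUS @JoeBiden !!! Thank you !!"

# Encoding the texts as a pair yields one sequence, [CLS] tweet [SEP] reply [SEP],
# so the model builds features for the reply in the context of its parent tweet.
encoded = tokenizer(tweet, reply, truncation=True, max_length=128,
                    padding="max_length", return_tensors="np")
print(encoded["input_ids"].shape)  # (1, 128)
```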
Table 1: Distribution of Classes in Training, Validation, and Test Sets across emotional (E), informative (I), and neutral (N) tweet classes.

                            Reply Support Label
Split          Tweet Label    -1      0     +1
Training       E             116    205    307
               I             206    464    259
               N             334    212    188
Validation     E              13     19     33
               I              25     71     26
               N              44     20     35
Test           E              13     22     32
               I              20     68     41
               N              49     20     22
4 EXPERIMENTAL SETUP
We used 2864 tweet-reply pairs for our experiments
as explained in Section 3.1. These pairs are labelled
as supporting (+1), unbiased (0), or opposing (-1)
for the reply support label. The dataset was split
into training, validation, and test sets in the ratio of
80 : 10 : 10 using a stratified approach. Table 1 shows
the distribution of the tweet-reply pairs according to
their reply support labels and across the tweet label
classes. We used two frameworks for the classifica-
tion task as presented in Section 3.2. To evaluate how
well our models classify the reply support label, we
used Precision, Recall, F1-score, and Accuracy.
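A minimal sketch of the stratified 80:10:10 split and the per-class evaluation metrics, using scikit-learn; the placeholder data below merely stands in for the curated tweet-reply pairs and their labels.

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Placeholder data standing in for the 2864 curated tweet-reply pairs and their labels.
pairs = [("tweet %d" % i, "reply %d" % i) for i in range(60)]
labels = [-1, 0, 1] * 20  # opposing, unbiased, supporting

# 80:10:10 stratified split: first hold out 20%, then halve it into validation and test.
train_x, rest_x, train_y, rest_y = train_test_split(
    pairs, labels, test_size=0.2, stratify=labels, random_state=42)
val_x, test_x, val_y, test_y = train_test_split(
    rest_x, rest_y, test_size=0.5, stratify=rest_y, random_state=42)

# Per-class precision, recall and F1-score plus overall accuracy on the test set
# (here the "predictions" are just the gold labels, as a stand-in for model output).
print(classification_report(test_y, test_y, digits=3))
```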
4.1 Setup for LSTM-Based
Classification
Two LSTM models, each with 64 units and a dropout
rate of 0.3, were used to process the tweet and the
reply in both sentence-level and word-level feature extraction.
traction. For sentence-level features, the concatenated
feature vector output from the LSTMs served as an
input to a dense layer of 64 units activated by ReLU,
with a batch size of 8. A dropout of 0.2, batch nor-
malization and L2 regularization were applied.
For word-level features, the concatenated output
was sent to three dense layers composed of 2048,
512, and 64 units respectively with ReLU activation.
Batch normalization and a dropout rate of 0.2 were employed.
In both cases, the networks were trained using the
Adam optimizer with a learning rate of 1e-4.
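A Keras sketch of the sentence-level configuration described above is given below; the vocabulary size, sequence length, embedding dimension and L2 strength are assumptions, and loading of the pre-trained GloVe weights is omitted. The word-level variant differs only in that the LSTMs return full sequences (feature matrices), which are flattened before the 2048/512/64 dense stack.

```python
import tensorflow as tf
from tensorflow.keras import Model, layers, regularizers

MAX_LEN, VOCAB, EMB_DIM = 64, 20000, 100   # assumed sizes; GloVe weight loading omitted


def encoder(name):
    """One branch: GloVe-style embedding followed by a 64-unit LSTM with dropout 0.3."""
    inp = layers.Input(shape=(MAX_LEN,), name=name)
    emb = layers.Embedding(VOCAB, EMB_DIM)(inp)   # weights would be initialized from GloVe
    feat = layers.LSTM(64, dropout=0.3)(emb)      # sentence-level feature vector z
    return inp, feat


tweet_in, z_t = encoder("tweet")
reply_in, z_r = encoder("reply")

z_tr = layers.Concatenate()([z_t, z_r])           # combined tweet-reply vector z_tr
x = layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4))(z_tr)  # L2 strength assumed
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.2)(x)
out = layers.Dense(3, activation="softmax")(x)    # opposing / unbiased / supporting

model = Model([tweet_in, reply_in], out)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()   # training would use a batch size of 8, as described above
```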
4.2 Setup for BERT-Based
Classification
The concatenated text served as input to the BERT
model for feature extraction. The output of the BERT
model was sent through a dense layer with 256 units
with ReLU activation. Batch normalization, L2
regularization, and dropout were applied. A dense
layer with softmax activation was used for the final
classification. The model was trained for 150
epochs using the Adam optimizer with a learning rate
of 1e-5 and the sparse categorical cross-entropy loss
function.
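A corresponding Keras sketch of the BERT-based classifier, assuming the Hugging Face `TFBertModel` with the `bert-base-uncased` checkpoint; the maximum sequence length, dropout rate and L2 strength are assumptions where the text above does not specify them.

```python
import tensorflow as tf
from tensorflow.keras import Model, layers, regularizers
from transformers import TFBertModel

MAX_LEN = 128                                    # assumed length of the concatenated pair
bert = TFBertModel.from_pretrained("bert-base-uncased")   # assumed checkpoint

ids = layers.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
mask = layers.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")

# 768-dimensional pooled embedding of the concatenated tweet-reply sequence.
pooled = bert(input_ids=ids, attention_mask=mask).pooler_output

x = layers.Dense(256, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4))(pooled)  # L2 strength assumed
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.2)(x)                       # dropout rate assumed
out = layers.Dense(3, activation="softmax")(x)   # opposing / unbiased / supporting

model = Model([ids, mask], out)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(..., epochs=150)
```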
5 RESULTS AND DISCUSSION
The results for the classification of reply support
label for the tweet-reply pairs are presented in Table
2.
5.1 Results for LSTM Models
The sentence-level LSTM features obtained an
overall accuracy of 64.1%. The model achieved the highest
precision of 0.697 for the opposing class, though the
recall for this class was low. For the unbiased class,
the model achieved a strong recall value of 0.736 and
an F1-score of 0.661. The model achieved the lowest
F1-score of 0.622 for the opposing class. The con-
fusion matrix for the framework is shown in Figure
3a. A few opposing and supporting replies were
misclassified as unbiased replies.
For word-level feature extraction, the overall ac-
curacy was 65.2%. Individual class metrics revealed
that the model’s performance on all metrics was fairly
similar for all classes. The confusion matrix for the
framework is shown in Figure 3b. A few opposing
replies were falsely labelled as supporting or unbiased,
and some supporting replies were identified as
unbiased replies.
5.2 Results for BERT Model
BERT classification significantly outperformed
LSTM models. It achieved an overall accuracy of
73.2%. The opposing class showed a high recall of
0.793. Improvement was also seen in the unbiased
class. The supportive class achieved a high
precision of 0.767 and an F1-score of 0.729, demon-
strating relatively well-balanced classification for all
classes. The confusion matrix for this framework is
given in Figure 3c.
Table 2: Performance Metrics for Sentence-level LSTM Features, Word-level LSTM Features and BERT Features for Classification of Reply Support Label.

Framework                       Support Label   Precision   Recall   F1-score   Accuracy
Sentence-Level LSTM Features         -1           0.697      0.561     0.622      0.641
                                      0           0.600      0.736     0.661
                                     +1           0.663      0.600     0.630
Word-Level LSTM Features             -1           0.675      0.634     0.654      0.652
                                      0           0.640      0.646     0.643
                                     +1           0.647      0.674     0.660
BERT Embeddings                      -1           0.670      0.793     0.726      0.732
                                      0           0.760      0.718     0.738
                                     +1           0.767      0.695     0.729
(a) Sentence-level LSTM Features. (b) Word-level LSTM Features. (c) BERT Embeddings.
Figure 3: Confusion matrices for classification using: (a) Sentence-level features extracted from LSTM (b) Word-level features
extracted from LSTM (c) Embeddings from BERT.
Figure 4: Results of employing LIME on the BERT model for classification of a specific reply to a tweet. The weights
assigned to the words contribute to the decision making process of the model in classifying the reply support label.
Table 3: Stance detection on varied events in current research. Our proposed model works on a more granular level by determining the stance of replies in conversation threads.

Author                 Objective                     Stance                   Dataset                          Models                Results
(Mutlu et al., 2020)   Stance on treatment           neutral, against,        COVID-CQ dataset of              Logistic regression   Accuracy: 0.76
                       for Covid-19                  favor                    14374 tweets
(Hamad et al., 2022)   Stance on online education    agree, disagree,         StEduCov dataset of              BERT                  Accuracy: 68%
                       during Covid-19               neutral                  16,572 tweets
(Zhang et al., 2024)   Stance on US presidential     pro-Biden, pro-Trump     Self-curated dataset of          DoubleH               Accuracy: 0.8579 ± 0.04
                       elections                                              1,123,749 tweets
Proposed (reply        Stance of reply in tweet      supporting, unbiased,    Twitter dataset of the event     Sentence-Level LSTM   F1-score: 0.637
support label)         conversations                 opposing                 “Fall of Kabul”, 2864            Word-Level LSTM       F1-score: 0.652
                                                                              tweet-reply pairs                BERT                  F1-score: 0.731
It shows the robustness of the
model in correctly classifying opposing and unbiased
responses, with moderate misclassification between
unbiased and supportive replies. To summarize,
the results indicate that BERT provides the most
effective classification of the reply support labels,
which can play a crucial role in understanding social
interactions in crisis contexts.
5.3 Results for Explainable BERT
Using LIME
To provide insights into the model’s decision-making
process, we focus on enhancing the interpretability of
our BERT-based classification model by employing
LIME (Local Interpretable Model-agnostic Explana-
tions). Figure 4 demonstrates the results generated by
LIME for the following tweet-reply pair.
Tweet: This is unbelievable, potentially lethal
negligence on the part of the Morrison Government
towards Afghans who have stood with Australia
over a decade. I've been pleading with the
government to bring them to Australia for months.
Where the hell is Dutton. https://t.co/2OrI4HcfTU
Reply: https://theguardian.com/australia-news/2020/nov/19/australian-special-forces-involved-in-of-39-afghan-civilians-war-crimes-report-alleges
Australian elite soldiers recently killed
39 unarmed Afghanistan civilians. Probably more
went unreported. And you expect them to trust
Australia?!?
The visualization demonstrates that the model has
classified the reply as opposing with a high probabil-
ity of 0.89. The probabilities for unbiased and sup-
porting classes are very low, being 0.06 and 0.05 re-
spectively. This suggests that the model is quite con-
fident about its decision. LIME also highlighted that
words such as {trust, expect, negligence, soldiers}
highly influenced the classification of the reply stance
as opposing, being assigned weights of 0.21, 0.20,
0.17 and 0.14 respectively. Words such as {killed, soldiers}
also contributed towards the supporting class,
and {reported} aided the unbiased
class, but their overall influence is minimal.
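For reference, a minimal sketch of how LIME can be applied to such a text classifier is shown below; the `predict_proba` function is a stand-in for the trained BERT model and simply returns uniform probabilities.

```python
import numpy as np
from lime.lime_text import LimeTextExplainer

class_names = ["opposing", "unbiased", "supporting"]
explainer = LimeTextExplainer(class_names=class_names)


def predict_proba(texts):
    """Stand-in for the trained BERT classifier: should return an (n, 3) array of class
    probabilities for each perturbed tweet-reply string that LIME generates."""
    # Replace with tokenization + model.predict(...) of the classifier described above.
    return np.full((len(texts), 3), 1.0 / 3.0)


tweet = "This is unbelievable, potentially lethal negligence ..."
reply = "Australian elite soldiers recently killed 39 unarmed Afghanistan civilians. ..."
exp = explainer.explain_instance(tweet + " " + reply, predict_proba,
                                 num_features=10, num_samples=500)
print(exp.as_list())  # (word, weight) pairs such as ("trust", 0.21)
```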
5.4 Discussion of State-of-Art
Stance detection has usually been carried out at the
level of entire events or topics. A few such works are
shown in Table 3. Our research differs from these
studies, which did not consider replies.
Our framework extends this line of work by analyzing
replies to tweets, deepening the granularity of online
discourse analysis. (Mutlu et al., 2020) and (Hamad et al., 2022)
reported results for stance on singular tweet classi-
fications like COVID-19 treatment opinions or on-
line education. (Zhang et al., 2024) predicted the bi-
nary stance context on political tweets. Our contribution
stands out in that it focuses on tweet-reply
pairs, a rather unexplored area in stance detection.
This conversation analysis can deliver even
more subtle insights during crises, where engagement
in replies significantly shapes the discourse. Our best
framework uses a BERT-based architecture
optimized for reply support label classification and
achieves an F1-score of 0.73, showcasing reasonable
performance in the multi-class setting.
6 CONCLUSIONS
This paper proposed the novel idea of determining
reply support labels in tweet conversations during crises to
understand the conversation dynamics. By evaluating
LSTM and BERT-based models, we demonstrated
reply stance classification in social media conversations,
achieving an F1-score of 0.731, a recall of 0.735,
and an accuracy of 0.732. Analyzing the interaction of
tweets and replies is more informative in terms of public
engagement than focusing solely on the stance
of tweets. This knowledge can contribute towards im-
proved information management and offer actionable
insight for authorities to tailor their strategies in real-
time.
REFERENCES
Abulaish, M., Saraswat, A., and Fazil, M. (2023). A multi-
task learning framework using graph attention net-
work for user stance and rumor veracity prediction.
In Proceedings of the International Conference on Ad-
vances in Social Networks Analysis and Mining, pages
149–153.
ALDayel, A. and Magdy, W. (2021). Stance detection on
social media: State of the art and trends. Inf. Process.
Manage., 58(4).
Bai, N., Meng, F., Rui, X., and Wang, Z. (2023). A multi-
task attention tree neural net for stance classification
and rumor veracity detection. Applied Intelligence,
53(9):10715–10725.
Bukar, U. A., Jabar, M. A., Sidi, F., Nor, R. B., Abdullah,
S., and Ishak, I. (2022). How social media crisis re-
sponse and social interaction is helping people recover
from covid-19: an empirical investigation. Journal of
computational social science, pages 1–29.
Devlin, J. (2018). Bert: Pre-training of deep bidirec-
tional transformers for language understanding. arXiv
preprint arXiv:1810.04805.
Hamad, O., Hamdi, A., Hamdi, S., and Shaban, K. (2022).
Steducov: an explored and benchmarked dataset on
stance detection in tweets towards online education
during covid-19 pandemic. Big Data and Cognitive
Computing, 6(3):88.
Haouari, F. and Elsayed, T. (2024). Are authorities denying
or supporting? detecting stance of authorities towards
rumors in twitter. Social Network Analysis and Min-
ing, 14(1):34.
Hardalov, M., Arora, A., Nakov, P., and Augenstein,
I. (2021). A survey on stance detection for mis-
and disinformation identification. arXiv preprint
arXiv:2103.00242.
Hochreiter, S. (1997). Long short-term memory. Neural
Computation MIT-Press.
Küçük, D. and Can, F. (2020). Stance detection: A survey.
ACM Computing Surveys (CSUR), 53(1):1–37.
Lai, M., Tambuscio, M., Patti, V., Ruffo, G., and Rosso,
P. (2019). Stance polarity in political debates: A di-
achronic perspective of network homophily and con-
versations on twitter. Data & Knowledge Engineering,
124:101738.
Mohammad, S., Kiritchenko, S., Sobhani, P., Zhu, X., and
Cherry, C. (2016). Semeval-2016 task 6: Detecting
stance in tweets. In Proceedings of the 10th inter-
national workshop on semantic evaluation (SemEval-
2016), pages 31–41.
Mutlu, E. C., Oghaz, T., Jasser, J., Tutunculer, E., Rajabi,
A., Tayebi, A., Ozmen, O., and Garibay, I. (2020). A
stance data set on polarized conversations on Twitter
about the efficacy of hydroxychloroquine as a treat-
ment for COVID-19. Data in brief, 33:106401.
Ng, L. H. X. and Carley, K. M. (2022). Pro or anti? a so-
cial influence model of online stance flipping. IEEE
Transactions on Network Science and Engineering,
10(1):3–19.
Villa-Cox, R., Kumar, S., Babcock, M., and Carley, K. M.
(2020). Stance in replies and quotes (srq): A new
dataset for learning stance in twitter conversations.
arXiv preprint arXiv:2006.00691.
Zeng, L., Starbird, K., and Spiro, E. (2016). #Unconfirmed:
Classifying rumor stance in crisis-related social me-
dia messages. In Proceedings of the international
aaai conference on web and social media, volume 10,
pages 747–750.
Zhang, C., Zhou, Z., Peng, X., and Xu, K. (2024). Doubleh:
Twitter user stance detection via bipartite graph neural
networks. In Proceedings of the International AAAI
Conference on Web and Social Media, volume 18,
pages 1766–1778.
Zheng, J., Baheti, A., Naous, T., Xu, W., and Ritter, A.
(2022). Stanceosaurus: Classifying stance towards
multicultural misinformation. In Proceedings of the
2022 conference on empirical methods in natural lan-
guage processing, pages 2132–2151.
Zubiaga, A., Kochkina, E., Liakata, M., Procter, R., and
Lukasik, M. (2016). Stance classification in ru-
mours as a sequential task exploiting the tree struc-
ture of social media conversations. arXiv preprint
arXiv:1609.09028.