
3.1 Data Acquisition Module
The first step in the process is the Data Acquisition
Module, which is responsible for the automatic re-
trieval of textual data from diverse sources, includ-
ing digital platforms (social media, online forums)
and transcriptions of spoken interactions (e.g., parlia-
mentary debates). This module is designed to pro-
cess various types of input data, ensuring scalability
and adaptability to different environments and lan-
guages. By encompassing multiple sources, it allows
the framework to operate on both real-time streams
and historical archives, making it applicable to a wide
range of use cases, from online content moderation to
retrospective analysis of political speeches.
The acquisition process includes mechanisms to
handle noise in the data, such as filtering non-relevant
content and pre-processing steps like tokenization and
removal of stop words. This ensures that only perti-
nent text data are fed into subsequent modules, im-
proving the overall efficiency and accuracy of the sys-
tem.
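The noise-handling steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the URL-stripping rule, the `min_tokens` relevance threshold, and the function names `preprocess`, `is_relevant`, and `acquire` are all assumptions introduced for the example.

```python
import re

# Illustrative stop-word list only (assumption); a real deployment would
# load a complete list for each supported language.
STOP_WORDS = {"el", "la", "los", "las", "de", "y", "que", "en", "a", "un", "una"}

TOKEN_RE = re.compile(r"\w+")          # runs of letters, digits, underscore
URL_RE = re.compile(r"https?://\S+")   # treat links as noise (assumption)


def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase the text, strip URLs, tokenize, and drop stop words."""
    text = URL_RE.sub(" ", text.lower())
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in stop_words]


def is_relevant(tokens, min_tokens=3):
    """Hypothetical relevance filter: keep texts with enough content tokens."""
    return len(tokens) >= min_tokens


def acquire(raw_texts):
    """Yield token lists for the texts that pass the relevance filter."""
    for raw in raw_texts:
        tokens = preprocess(raw)
        if is_relevant(tokens):
            yield tokens
```

Because `acquire` is a generator, the same code can consume a finite archive or an unbounded stream, which matches the real-time and historical use cases mentioned above.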
3.2 Language Unification Module
Given the multilingual nature of modern discourse,
particularly in political settings such as the Valencian
Parliament, the Language Unification Module plays
a crucial role in standardizing text data before fur-
ther processing. This module automatically detects
the language of incoming text and translates it into a
unified language format (in this case, Spanish), en-
abling consistent and reliable analysis.
The challenge of multilingual text is twofold:
lexical variations across languages and the differing
cultural connotations of certain words. The Lan-
guage Unification Module addresses both by lever-
aging state-of-the-art machine translation models that
are sensitive to context. This ensures that translations
maintain the semantic integrity of the original speech,
preserving subtle nuances that are essential for accu-
rate toxicity detection. By addressing linguistic dis-
crepancies at this early stage, the framework is able
to avoid errors that might arise from handling multi-
ple languages simultaneously, ensuring a more coher-
ent and reliable toxicity assessment downstream.
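The detect-then-translate flow can be sketched as below. The paper does not name the detection or translation models, so both are passed in as callables; `toy_detect` and `toy_translate` are hypothetical stand-ins that only make the sketch runnable, and the use of the language code `ca` for Valencian text is likewise an assumption.

```python
from typing import Callable

TARGET_LANG = "es"  # the framework unifies all text into Spanish


def unify(text: str,
          detect: Callable[[str], str],
          translate: Callable[[str, str, str], str],
          target: str = TARGET_LANG) -> str:
    """Return the text in the target language, translating only when needed."""
    source = detect(text)
    if source == target:
        return text
    return translate(text, source, target)


# Toy stand-ins (assumptions) so the sketch runs without external models.
_MARKERS = {"amb": "con", "però": "pero", "aquesta": "esta",
            "molt": "muy", "també": "también"}


def toy_detect(text: str) -> str:
    """Guess 'ca' if any marker word appears, otherwise 'es'."""
    words = set(text.lower().split())
    return "ca" if words & set(_MARKERS) else "es"


def toy_translate(text: str, source: str, target: str) -> str:
    """Word-by-word glossary lookup; a real system would call an MT model."""
    return " ".join(_MARKERS.get(w, w) for w in text.lower().split())
```

Keeping the detector and translator behind plain function signatures means a context-sensitive translation model can be swapped in without touching the rest of the pipeline.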
3.3 Toxic Detection Module
The Toxic Detection Module serves as the core com-
ponent of the framework, responsible for evaluating
the toxicity of the standardized text. This module
uses binary classification models to determine
whether a given piece of text is likely to be toxic.
In this study, we
employed the pre-trained Detoxify model (Hanu and
Unitary, 2023), which has demonstrated strong per-
formance in various toxic speech detection tasks.
Detoxify assigns a probability score to each text
sample, representing the likelihood that the content is
toxic. A score close to 0 indicates non-toxic content,
while a score closer to 1 suggests toxic speech. How-
ever, despite its effectiveness, the module can some-
times produce ambiguous results, especially when the
probability score falls within a certain range where the
classification is not definitively clear. To address this,
we introduce a predefined "confusion zone," typically
ranging between 42% and 58%. Texts with scores
within this zone are neither clearly toxic nor clearly
non-toxic, indicating a need for further analysis to
reach a confident conclusion.
This confusion zone is particularly relevant in the
context of political discourse, where language is of-
ten nuanced and may involve sarcasm, rhetorical de-
vices, or indirect speech. Such complexities can make
it difficult for the model to assign a clear
classification, resulting in borderline cases. The
system therefore flags such texts for additional processing
by the Sentiment Analysis Module, which provides
further contextual understanding to improve classifi-
cation accuracy.
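The triage step on the toxicity score can be sketched as follows. The bounds 0.42 and 0.58 come from the text; treating the bounds as inclusive, and the names `Verdict` and `triage`, are assumptions made for the example. The score itself is taken as given, for instance the toxicity probability returned by the classifier.

```python
from enum import Enum


class Verdict(Enum):
    NON_TOXIC = "non-toxic"
    TOXIC = "toxic"
    AMBIGUOUS = "ambiguous"  # falls in the confusion zone


ZONE_LOW, ZONE_HIGH = 0.42, 0.58  # confusion-zone bounds stated in the text


def triage(score: float, low: float = ZONE_LOW, high: float = ZONE_HIGH) -> Verdict:
    """Map a toxicity probability in [0, 1] to a three-way verdict."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if low <= score <= high:  # inclusive bounds are an assumption
        return Verdict.AMBIGUOUS
    return Verdict.TOXIC if score > high else Verdict.NON_TOXIC
```

Only texts that come back as `Verdict.AMBIGUOUS` would be forwarded to the Sentiment Analysis Module; the other two verdicts are final.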
3.4 Sentiment Analysis Module
To resolve texts whose toxicity score falls in the
confusion zone, we introduce a second layer of
analysis: the Sentiment Analysis Module.
Once a text is flagged by the Toxic Detection Module
as ambiguous, it is redirected to this module for fur-
ther evaluation. The Sentiment Analysis Module per-
forms a deeper analysis of the emotional tone, provid-
ing additional context to aid in the final classification.
By assessing the emotional valence of the text, the
sentiment analysis adds a nuanced layer of interpre-
tation that binary classifiers typically overlook. For
instance, a politically charged statement with a highly
negative sentiment score is more likely to be toxic,
even if the initial classifier was uncertain. On the
other hand, a text with low sentiment intensity may
indicate sarcasm or rhetorical neutrality, reducing the
likelihood of it being classified as toxic.
This dual-layered approach—combining toxicity
classification with sentiment analysis—enables the
system to handle complex and ambiguous language
more effectively. Ambiguous texts with strong negative
sentiment are reclassified as toxic, while those with more
neutral or positive sentiment are deemed non-toxic, thus
significantly reducing false positives and negatives in
the overall classification process.
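The two-layer decision rule can be sketched as below. The paper does not state how "strong negative sentiment" is quantified, so the sentiment scale of [-1, 1], the `negative_cutoff` of -0.5, and the names `classify` and `sentiment_of` are assumptions chosen only to make the logic concrete.

```python
from typing import Callable


def classify(text: str,
             tox_score: float,
             sentiment_of: Callable[[str], float],
             low: float = 0.42,
             high: float = 0.58,
             negative_cutoff: float = -0.5) -> bool:
    """Return True if the text is judged toxic.

    tox_score is the classifier's toxicity probability in [0, 1].
    sentiment_of returns a polarity in [-1, 1] (assumed scale), where
    negative values mean negative sentiment. It is called only for
    texts whose score lies in the confusion zone.
    """
    if tox_score > high:
        return True   # clearly toxic, sentiment not consulted
    if tox_score < low:
        return False  # clearly non-toxic, sentiment not consulted
    # Ambiguous: strongly negative sentiment tips the decision to toxic.
    return sentiment_of(text) <= negative_cutoff
```

Passing the sentiment model as a callable keeps the second layer lazy, so the extra analysis cost is paid only for borderline texts.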