Measuring Context Change to Detect Statements Violating the Overton
Window
Christian Kahmann and Gerhard Heyer
Department for Natural Language Processing, Leipzig University, Augustusplatz 10, Leipzig, Germany
Keywords:
Text Mining, Humanities, Semantic Change.
Abstract:
The so-called Overton window describes the phenomenon that political discourse takes place in a narrow
window of terms that reflect the public consensus of acceptable opinions on some topic. In this paper we
present a novel NLP approach to identify statements in a collection of newspaper articles that shift the borders
of the Overton window at some period of time, and apply it to German newspaper texts to detect extreme
statements about the refugee crisis in Germany.
1 MOTIVATION
A central task in the Digital Humanities is to trans-
form a humanities or social science research ques-
tion into a format where digital data and computa-
tional analyses become key components of the re-
search. In what follows we shall consider the very no-
tion of Overton window in political science, and dis-
cuss how this notion can be applied in the context of
media analyses using suitable text mining methods.
The Overton window is a theory which aims to ex-
plain why at some time the consensus of socially ac-
cepted opinions and statements changes, and what is
behind the dynamics of political discourse (Lehman,
2018). In detail, it describes the phenomenon that po-
litical discourse takes place in a narrow window of
terms that reflect the public consensus of acceptable
opinions on some topic. The Overton window has
recently attracted public attention in connection with
the growth of populism in the USA and Europe notic-
ing that populists have shifted the Overton window of
acceptable public discourse rightwards (Bugaric and
Kuhelj, 2018). In this paper we introduce an approach
to automatically identify newspaper articles, and with
them statements of politicians, that displace the
Overton window for a predefined topic (a set of words
describing a political issue). We solve this task in
several steps, each using a different technique of
Natural Language Processing.
2 REALIZATION
The basic idea of our approach is to identify, for a
set of words and period of time (reference), the stan-
dard context of words. Using this as a reference,
we evaluate new documents with respect to their
co-occurrence behaviour in relation to the learned
reference context. If they contain a high proportion of
words that were never used in the reference, or were
used differently there, they are more likely to reflect
a new (and possibly extreme) opinion and thus serve as
an indicator of a statement shifting the Overton window.
2.1 Data Set
Our approach is based on newspaper texts from the
German daily newspaper taz¹. We use articles published
between 2010 and 2018, which yields a set of 339367
documents. The discourse of interest in this paper is
the refugee crisis in Germany. Hence we limit the set
of documents to those that deal with refugees in any
manner, resulting in a set of 37966 documents.
2.2 Identify Target Words
First, we need to define a set of words that represents
our target topic. To achieve that, we calculate
co-occurrence statistics based on a sentence term matrix
of the reduced corpus. Afterwards we extract synonyms
for the word "flüchtling" (refugee) using the cosine
similarity between the co-occurrence vectors. We use
Dice significance as the co-occurrence measure.

¹ http://www.taz.de/

Kahmann, C. and Heyer, G.
Measuring Context Change to Detect Statements Violating the Overton Window.
DOI: 10.5220/0008191803920396
In Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019), pages 392-396
ISBN: 978-989-758-382-7
Copyright © 2019 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
Table 1: Extracted synonyms (target words) for source word
"flüchtling" and the corresponding cosine similarity.

word             cosine similarity
flüchtling       1
syrer            0.279
geflüchtete      0.2626
asylbewerber     0.275
migrant          0.239
asylsuchender    0.228
ausländer        0.196
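The synonym extraction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes sentence-level co-occurrence counts, the Dice coefficient 2·n_ab / (n_a + n_b) as significance, and sparse vectors stored as dictionaries. The function names and the toy sentences are hypothetical and not taken from the taz corpus.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def dice_cooccurrence(sentences):
    """Map each word to a sparse vector {other_word: Dice significance}."""
    freq = Counter()       # number of sentences containing a word
    pair_freq = Counter()  # number of sentences containing both words
    for tokens in sentences:
        types = sorted(set(tokens))
        freq.update(types)
        pair_freq.update(combinations(types, 2))
    vectors = {w: {} for w in freq}
    for (a, b), n_ab in pair_freq.items():
        dice = 2 * n_ab / (freq[a] + freq[b])
        vectors[a][b] = dice
        vectors[b][a] = dice
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similar_words(vectors, source, top_n=5):
    """Rank all other words by cosine similarity to the source word."""
    sims = {w: cosine(vectors[source], vec)
            for w, vec in vectors.items() if w != source}
    return sorted(sims.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

# Toy sentences, invented for illustration only.
toy = [
    ["flüchtling", "unterkunft", "berlin"],
    ["asylbewerber", "unterkunft", "berlin"],
    ["flüchtling", "antrag", "bundesamt"],
    ["asylbewerber", "antrag", "bundesamt"],
    ["wetter", "regen", "berlin"],
]
vectors = dice_cooccurrence(toy)
ranking = similar_words(vectors, "flüchtling")
```

In the toy data "asylbewerber" shares exactly the contexts of "flüchtling" and is therefore ranked first; on a real corpus the similarities are far lower, as Table 1 shows.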
2.3 Learn Reference
Having done this, we set a reference time range which
serves as our reference set. In more detail, we used
8067 documents concerning refugees from the period
2013-01-01 to 2014-12-31, which resulted in 386632
sentences. We then applied pre-processing (lowercasing;
removal of stopwords, numbers and hyphenation) and used
uni- and bigrams for our analysis. Based on the
pre-processed text we created a sentence term matrix
stm with 106285 features. In order to model the
standard context of multiple words with one variable,
we added a synthetic column representing the set of
target words. An entry in this column is either 0 or 1;
the calculation is shown in Equation 1.
\[
stm_{i,syn} =
\begin{cases}
1 & \text{if } \sum_{k=1}^{m} stm_{(i,k)} \geq 1 \\
0 & \text{otherwise}
\end{cases}
\qquad m = \text{number of words in the target set}
\tag{1}
\]
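The pre-processing and the synthetic column of Equation 1 might look like the sketch below. It rests on several assumptions: the stopword list is a tiny placeholder, hyphenated words are simply split, bigrams are formed after stopword removal (which matches the bigram "flüchtlinge oranienplatz" in Table 3), and each row of the binary sentence term matrix is stored as a set of features. The names `preprocess` and `synthetic_column` are hypothetical.

```python
import re

# Tiny illustrative stopword list; a real run would use a full German list.
STOPWORDS = {"sie", "sind", "vom", "die", "der", "in", "ist"}

def preprocess(sentence):
    """Lowercase, drop numbers, punctuation and stopwords, then add bigrams."""
    tokens = re.findall(r"[^\W\d_]+", sentence.lower())  # runs of letters only
    tokens = [t for t in tokens if t not in STOPWORDS]
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return set(tokens + bigrams)  # one binary row of the sentence term matrix

def synthetic_column(stm_rows, target_words):
    """Equation 1: 1 if the row contains at least one target word, else 0."""
    return [1 if sum(word in row for word in target_words) >= 1 else 0
            for row in stm_rows]

sentences = [
    "Sie sind Flüchtlinge vom Oranienplatz.",
    "Die Polizei ist in Berlin.",
    "2015 kamen Syrer.",
]
stm_rows = [preprocess(s) for s in sentences]
targets = {"flüchtlinge", "syrer"}  # surface forms, chosen for this toy example
syn = synthetic_column(stm_rows, targets)
```

Storing rows as sets keeps the sketch short; for a matrix with 106285 features a sparse matrix library would be the more practical choice.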
Next we calculated the co-occurrence significance for
the synthetic word that represents our target set of
words. The result is a vector whose length equals the
vocabulary size; each entry is the significance value
for the particular word and our target set of words.
The words most likely to occur in the same context as
refugees are shown in Table 2. The calculated Dice
significance values were normalized so that the
distance measure operates on weights between 0 and 1.
Table 2: Top co-occurrences of target word set with their
Dice significance and the normalized reference weight.

word                          Dice significance    weight²
deutschland                   0.061                1
oranienplatz                  0.059                0.976
berlin                        0.043                0.711
leben                         0.043                0.704
schule                        0.041                0.669
senat                         0.037                0.61
unterbringung                 0.034                0.563
hamburg                       0.033                0.549
zahl                          0.033                0.542
italien                       0.032                0.524
syrien                        0.031                0.513
land                          0.031                0.51
migration                     0.027                0.451
millionen                     0.026                0.432
untergebracht                 0.026                0.43
migration flüchtlinge         0.025                0.413
europa                        0.025                0.405
derzeit                       0.024                0.4
stadt                         0.024                0.397
gruppe                        0.024                0.394
aufnahme                      0.024                0.391
aufnehmen                     0.023                0.383
lampedusa                     0.023                0.377
syrische                      0.023                0.37
taz                           0.022                0.363
polizei                       0.022                0.361
bremen                        0.021                0.353
flüchtlinge oranienplatz      0.021                0.353
spd                           0.021                0.35
syrische flüchtlinge          0.021                0.343
bezirk                        0.021                0.341
bleiben                       0.021                0.34
bundesamt                     0.021                0.338
unterkunft                    0.02                 0.33
unterbringung flüchtlingen    0.02                 0.325
wohnungen                     0.019                0.316
bundesamt migration           0.019                0.315
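A possible way to obtain such reference weights is sketched below. It assumes that the Dice significance is computed between each feature and the synthetic target column, that target words themselves are excluded from the features, and that normalization means dividing by the largest value (which is consistent with the weight 1 for "deutschland" in Table 2 but is not spelled out in the text). The function name and the toy rows are hypothetical.

```python
from collections import Counter

def reference_weights(stm_rows, syn_column, exclude=frozenset()):
    """Dice significance of each feature with the synthetic target column,
    divided by the largest value so that the top feature gets weight 1."""
    n_syn = sum(syn_column)   # sentences containing a target word
    feature_freq = Counter()  # sentences containing the feature
    joint_freq = Counter()    # sentences containing feature and target word
    for row, syn in zip(stm_rows, syn_column):
        features = set(row) - exclude
        feature_freq.update(features)
        if syn:
            joint_freq.update(features)
    dice = {w: 2 * joint_freq[w] / (feature_freq[w] + n_syn) for w in joint_freq}
    if not dice:
        return {}
    top = max(dice.values())
    return {w: value / top for w, value in dice.items()}

# Toy rows of a binary sentence term matrix, invented for illustration.
rows = [
    {"deutschland", "flüchtlinge"},
    {"deutschland", "syrer"},
    {"berlin", "flüchtlinge"},
    {"berlin", "wetter"},
]
syn = [1, 1, 1, 0]
weights = reference_weights(rows, syn, exclude={"flüchtlinge", "syrer"})
```

Features that never co-occur with a target word (here "wetter") receive no entry; a later lookup can treat them as weight 0.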
2.4 Apply Distance Measure
We define distance as the amount of unpredictability of
a given sentence. We use a measure close to Context
Volatility (Kahmann et al., 2017) to calculate a score
for every sentence in the test set, quantifying how much
this sentence differs from what was seen in the
reference corpus. In contrast to Context Volatility
(CV), we measure the context change of a set of words
within a single sentence, whereas CV works on a single
word across a set of documents over time.

² Normalized weights, reflecting the relative likelihood
of a word appearing together with the target words
inside a sentence.

The distance for the i-th sentence $s_i$ is calculated
by combining the sum $\phi$ and the mean $\tau$ of the
word distances per sentence:
\[
\tau_{s_i} = \frac{\sum_{j=1}^{n} \left(1 - ref(w_j)\right)}{n}
\tag{2}
\]
\[
\phi_{s_i} = \sum_{j=1}^{n} \left(1 - ref(w_j)\right)
\tag{3}
\]
where $n$ is the number of words $w_j$ in $s_i$ and
$ref(w_j)$ is the normalized reference significance
weight of the j-th word in sentence $s_i$. Using the
mean alone, sentences consisting of only one unknown
word would be ranked highest. Taking only the sum of
the word distances would favour very long sentences.
Therefore, we combine both values using a factor
$\lambda$. To make the combination work, we rescale the
sum of word distances per sentence to the same interval
as the mean values, [0, 1]:
\[
\phi^{norm}_{s_i} = \frac{\phi_{s_i}}{\max_{s}(\phi_{s})}
\tag{4}
\]
Our results are created with $\lambda = 0.90$. We use a
large $\lambda$ because the calculated values of
$\tau_{s_i}$ are very dense (µ = 0.923, σ = 0.059),
whereas the sums show a much higher variation
(µ = 0.186, σ = 0.105).
\[
d_{s_i} = \lambda \, \tau_{s_i} + (1 - \lambda) \, \phi^{norm}_{s_i}
\tag{5}
\]
The range of the results is located between 0 and 1.
We see a maximal distance of 1, when a sentence
with the maximum number of words (in the test set)
only contains words that have never co-occurred with
any one of the target words in the reference corpus.
The distance for a sentence that only uses the word
”Deutschland” together with one of the target words
(reference weight 1) is 0.
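The distance computation of Equations 2 to 5 can be sketched as follows. This is an illustration under stated assumptions rather than the original code: features missing from the reference get weight 0, target words are assumed to be removed from the feature lists beforehand, an empty sentence gets a mean of 0, and the maximum sum is taken over the sentences passed in. The function name and the toy inputs are hypothetical; the three weights are copied from Table 2.

```python
def sentence_distances(sentences, ref, lam=0.9):
    """Return d for each sentence, where a sentence is a list of features."""
    taus, phis = [], []
    for features in sentences:
        dists = [1.0 - ref.get(w, 0.0) for w in features]  # unseen word -> 1
        phi = sum(dists)                                   # Equation 3
        taus.append(phi / len(dists) if dists else 0.0)    # Equation 2
        phis.append(phi)
    phi_max = max(phis, default=0.0)
    return [
        lam * tau + (1 - lam) * (phi / phi_max if phi_max else 0.0)  # Eq. 4, 5
        for tau, phi in zip(taus, phis)
    ]

ref = {
    "deutschland": 1.0,
    "oranienplatz": 0.976,
    "flüchtlinge oranienplatz": 0.353,
}
test_sentences = [
    ["deutschland"],                               # only the top reference word
    ["oranienplatz", "flüchtlinge oranienplatz"],  # features of the Table 3 example
    ["clowns", "quälen", "clowns quälen"],         # nothing seen in the reference
]
d = sentence_distances(test_sentences, ref)
```

The first sentence reproduces the minimum of 0 and the third, being the longest sentence and consisting only of unseen features, reaches the maximum of 1, as described above. The middle value differs from the 0.3325 reported for the real test set because the maximum sum in this toy set is different.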
Table 3: Example sentence: "Sie sind Flüchtlinge vom
Oranienplatz." The missing words are removed during
pre-processing in the stopword removal step. The calculated
distance for this sentence is 0.3325, which is the
shortest distance to the learned reference.

word        oranienplatz    flüchtlinge oranienplatz
distance    0.024           0.647
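As a worked check of Equations 2 and 3 on the example in Table 3, the two word distances give the mean and the sum below. This is plain arithmetic on the table values; the normalized sum needed for Equation 5 depends on the largest sum in the test set, which is not reported, so that term is left symbolic.

\[
\tau_{s_i} = \frac{0.024 + 0.647}{2} = 0.3355, \qquad
\phi_{s_i} = 0.024 + 0.647 = 0.671
\]
\[
d_{s_i} = 0.9 \cdot 0.3355 + 0.1 \cdot \phi^{norm}_{s_i} = 0.30195 + 0.1 \cdot \phi^{norm}_{s_i}
\]

With the reported distance of 0.3325, the sum term would contribute roughly 0.03.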
2.5 Indicators for Statements
Applying the algorithm enables the discovery of
sentences that deviate from the "standard" defined for
a certain topic. In this paper we are especially
interested in statements from politicians or other
prominent agents. We therefore limit our analysis to
sentences that contain an indicator of a statement. As
a filter we use the presence of words like "sagte"
(said) and "behauptet" (asserted), as well as quotation
marks. This reduced our analysis base from 14894
sentences containing at least one of the target words
to 3126 sentences containing both target word(s) and a
statement indicator during the defined test period.
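A filter of this kind could be sketched as below. The indicator list contains only the two verbs named in the text and is certainly incomplete, the set of quotation characters is an assumption, and matching target words by prefix is a crude stand-in for whatever normalization was actually used. All names are hypothetical.

```python
import re

INDICATOR_WORDS = ("sagte", "behauptet")  # examples from the text, not a full list
QUOTE_CHARS = "\"“”„«»"                   # assumed set of quotation marks
TARGET_WORDS = ("flüchtling", "syrer", "geflüchtete", "asylbewerber",
                "migrant", "asylsuchender", "ausländer")  # from Table 1

def tokenize(sentence):
    """Lowercased word tokens of a sentence."""
    return re.findall(r"\w+", sentence.lower())

def has_target_word(sentence):
    """True if any token starts with a target word (crude prefix matching)."""
    return any(tok.startswith(TARGET_WORDS) for tok in tokenize(sentence))

def has_statement_indicator(sentence):
    """True if the sentence contains an indicator verb or a quotation mark."""
    tokens = set(tokenize(sentence))
    has_word = any(word in tokens for word in INDICATOR_WORDS)
    has_quote = any(ch in sentence for ch in QUOTE_CHARS)
    return has_word or has_quote

def is_candidate_statement(sentence):
    """Keep only sentences with both a target word and a statement indicator."""
    return has_target_word(sentence) and has_statement_indicator(sentence)

examples = [
    "Premierminister David Cameron bezeichnete die Migranten als “Menschenschwärme“.",
    "Wie viele Flüchtlinge sind in Deutschland?",
    "Das Wetter ist schön, sagte er.",
]
kept = [s for s in examples if is_candidate_statement(s)]
```

Only the first example is kept: the second has a target word but no indicator, and the third has an indicator but no target word.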
3 RESULTS
We apply the described mechanism to articles published
between 2015-01-01 and 2015-12-31.
Table 4: Result excerpt of sentences with highest and lowest
distance.

sentence s                                      distance d_s

Das Online-Magazin Vice titelt                  0.9909
”Warnschüsse gegen Flüchtlinge :

Flüchtlinge wollen vom Tellerwäscher            0.9909
zum Millionär werden, da ist keine Zeit
für irgendein Ministeramt.

“Oft ging es um Flüchtlinge, die von            0.9907
Schleuserbanden ausgeraubt oder
zusammengeschlagen wurden.

Als Motiv gab der Feuerwehrmann an,             0.9907
keine Flüchtlinge in seinem
Wohnumfeld haben zu wollen.

Clowns quälen Flüchtlinge.                      0.9904

Es gibt immer mehr davon,                       0.9904
Bildungserfolg von Migranten ist doch
nichts Exotisches mehr.

Jetzt wollen diese Schreckensgestalten          0.9903
auch noch bedauernswerte Flüchtlinge
quälen.

”Er hat mit dem Tötungsdelikt an dem            0.9902
Asylbewerber aber nichts zu tun.

...                                             ...

Syrer sollen nicht alle                         0.4812
nach Deutschland kommen.

”Jeder dritte Syrer und jeder fünfte            0.4762
Afrikaner will nach Deutschland”,
sagt er.

Wie viele Flüchtlinge sind in                   0.3966
Deutschland?

Sie sind Flüchtlinge vom Oranienplatz.          0.3325
We are able to find sentences with words that reveal
new contexts. With this basic approach we cannot
guarantee that every sentence found is extreme in the
sense of a statement outside the Overton window (see
Table 4: "Bildungserfolg von Migranten", educational
success of migrants). Nevertheless, we find a large
number of sentences, and especially statements, that
seem to exceed the boundaries of the Overton window in
an extreme manner (see Table 5: "Warnschüsse gegen
Flüchtlinge", warning shots against refugees).

KDIR 2019 - 11th International Conference on Knowledge Discovery and Information Retrieval
3.1 Reduction on Statements
The same holds for the statements found. The statement
filter works: the retained sentences reflect opinions
and statements of different agents, and many of them
are of interest when searching for extreme assertions.
Table 5: Result excerpt after filtering for statements of
sentences with highest and lowest distances.

sentence s                                      distance d_s

Das Online-Magazin Vice titelt                  0.9909
”Warnschüsse gegen Flüchtlinge :

“Oft ging es um Flüchtlinge, die von            0.9907
Schleuserbanden ausgeraubt oder
zusammengeschlagen wurden.

“Er hat mit dem Tötungsdelikt an dem            0.9902
Asylbewerber aber nichts zu tun.

Auf Ausländer schmeißt man Steine               0.9901
“Wir sollten gar nicht hier sein“,
sagt Phillipp.

Diese Migranten sind wie Kakerlaken“,           0.9901
steht schon in der Unterzeile.

“Ich nehm die Asylbewerber mit“,                0.9892
sagt ein Passant.

“Wir können Ausländer klatschen!                0.9891

Premierminister David Cameron                   0.9889
bezeichnete die Migranten als
“Menschenschwärme“.

...                                             ...

”Aber ich habe in den Nachrichten               0.7512
gehört,dass jeder dritte Afrikaner
und jeder fünfte Syrer nach
Deutschland will”.

” Immer mehr Flüchtlinge leben                  0.7258
deshalb illegal im Land“, sagt Beuze.

” Deutschland, Deutschland“,                    0.6533
skandieren die Flüchtlinge.

”Jeder dritte Syrer und jeder fünfte            0.4765
Afrikaner will nach Deutschland“,
sagt er.
4 LIMITATIONS
A major problem that must be dealt with in further work
is that the use of new words does not necessarily
reflect an opinion outside the Overton window.
Co-occurrences often receive a high distance value
simply because the combination was not used in the
reference period. In such cases the non-use cannot be
attributed to the fact that it was politically incorrect
to make a corresponding statement during the reference
period, but rather to random effects. This effect
increases as the amount of text decreases.

So far, only the changes from a reference period to a
test period have been considered. According to the
theory of the Overton window, however, statements
outside the window should also shift future public
opinion in the direction the statement indicates.
Therefore, the next step would be to estimate this
shift. A possible approach is to look at a time window
after a statement that shows a large distance to the
reference. If similar behaviour is found in the
following documents, the statement may have had a high
impact on public opinion and may have shifted the
Overton window, which makes it an even more interesting
statement to extract.

The 2-dimensional orientation of the Overton window,
which models political extremes (left vs. right,
pro-refugee vs. contra-refugee, ...), has not yet been
considered. So far, deviations from the reference have
not been classified into political extremes; only the
deviation itself has been used for further analysis.
Categorizing statements into political positions would
require a better semantic understanding of the
statements. Various approaches are conceivable here
(e.g. SVM classification, semantic embeddings).

Another interesting direction is the diachronic view of
the Overton window: determining from the data what
appears to be socially accepted at what time and what
does not, and identifying when the window changes
strongly and when it remains stable.

In addition, it is imperative to develop a way to
evaluate the described measurements of the Overton
window, as there is no gold standard yet that places
political statements in the dimensions of the Overton
window. In order to check the validity of the
calculations, a synthetic data set could be used which
adequately models the underlying dynamics of the
Overton window. However, the final evaluation will
always require the cooperation and judgment of a domain
expert (e.g. a political scientist).

In summary, working on text data under the basic
assumption of the Overton window opens up many
interesting analyses. These have to be carried out in
future work; nevertheless, the first results are
already promising.
5 FURTHER WORK
One of the next changes to be made is the shift from
co-occurrence statistics to embeddings. At the moment
we depend on the mere presence and identical use of a
number of "learned" terms from the reference period. By
using embeddings in general and sentence embeddings
(Wieting et al., 2016) in particular, we hope to
overcome some of the limitations mentioned above.
Embeddings might, for instance, make it possible to
assign political statements to different camps.
Sentence embeddings would also help with the problem of
differing sentence lengths in the distance calculation,
because a distance measure can be applied to vectors of
the same dimension, no matter how long the sentence is.
6 CONCLUSION
The mechanism uses basic co-occurrence statistics and
nonetheless enables the detection of unusual contexts
around a set of target words. This allows us to locate
statements that may exceed the limits of the Overton
window and, as a consequence, shift the political
discourse in society. More advanced approaches such as
embeddings (Mikolov et al., 2013) might generate even
more reasonable results. Assigning the resulting
sentences to one of the two basic attitudes within a
political discourse is an issue that needs to be
addressed in future work.
REFERENCES
Bugaric, B. and Kuhelj, A. (2018). Varieties of populism in Europe: Is the rule of law in danger? Hague Journal on the Rule of Law, 10(1):21–33.
Kahmann, C., Niekler, A., and Heyer, G. (2017). Detecting and assessing contextual change in diachronic text documents using context volatility. In Fred, A. L. N. and Filipe, J., editors, Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (Volume 1), Funchal, Madeira, Portugal, November 1-3, 2017, pages 135–143. SciTePress.
Lehman, J. (2018). A brief explanation of the Overton window.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. CoRR, abs/1310.4546.
Wieting, J., Bansal, M., Gimpel, K., and Livescu, K. (2016). Towards universal paraphrastic sentence embeddings. In ICLR.