Measuring Context Change to Detect Statements Violating the Overton Window

Christian Kahmann and Gerhard Heyer

Department for Natural Language Processing, Leipzig University, Augustusplatz 10, Leipzig, Germany

Keywords: Text Mining, Humanities, Semantic Change.

Abstract: The so-called Overton window describes the phenomenon that political discourse takes place in a narrow window of terms that reflect the public consensus of acceptable opinions on some topic. In this paper, we present a novel NLP approach to identify statements in a collection of newspaper articles that shift the borders of the Overton window at some period of time, and apply it to German newspaper texts to detect extreme statements about the refugee crisis in Germany.

1 MOTIVATION

A central task in the Digital Humanities is to transform a humanities or social science research question into a format where digital data and computational analyses become key components of the research. In what follows, we consider the notion of the Overton window in political science and discuss how it can be applied to media analysis using suitable text mining methods.

The Overton window is a theory that aims to explain why the consensus of socially accepted opinions and statements changes at some point in time, and what drives the dynamics of political discourse (Lehman, 2018). In detail, it describes the phenomenon that political discourse takes place in a narrow window of terms that reflect the public consensus of acceptable opinions on some topic. The Overton window has recently attracted public attention in connection with the growth of populism in the USA and Europe, where populists are said to have shifted the window of acceptable public discourse rightwards (Bugaric and Kuhelj, 2018). In this paper, we introduce an approach to automatically identify newspaper articles, and with them statements of politicians, that displace the Overton window for a predefined topic (a set of words describing a political issue). To solve this complex task, we combine several steps that use different Natural Language Processing techniques.

2 REALIZATION

The basic idea of our approach is to identify the standard context of a set of words for a given period of time (the reference). Using this reference, we evaluate new documents with respect to their co-occurrence behaviour in relation to the learned reference context. If they contain a high proportion of words that were never used, or were used differently, in the reference, they are more likely to reflect a new (and possibly extreme) opinion. Such documents thus serve as indicators of statements that shift the Overton window.

2.1 Data Set

Our approach is based on newspaper texts from the German daily newspaper taz. We use articles published between 2010 and 2018, which yields a set of 339,367 documents. The discourse of interest in this paper is the refugee crisis in Germany. Hence, we limit the set to those documents that deal with refugees in any manner, resulting in 37,966 documents.

2.2 Identify Target Words

First, we need to define a set of words that represents our target topic. To this end, we calculate co-occurrence statistics (Dice significance) based on a sentence-term matrix of the reduced corpus. Afterwards, we extract synonyms for the word "flüchtling" (refugee) using the cosine similarity between the co-occurrence vectors.

Footnote: taz website, http://www.taz.de/

Kahmann, C. and Heyer, G. Measuring Context Change to Detect Statements Violating the Overton Window. DOI: 10.5220/0008191803920396. In Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2019), pages 392-396. ISBN: 978-989-758-382-7

Table 1: Extracted synonyms (target words) for the source word "flüchtling" and the corresponding cosine similarity.

word | cosine similarity
flüchtling | 1
syrer | 0.279
geflüchtete | 0.2626
asylbewerber | 0.275
migrant | 0.239
asylsuchender | 0.228
ausländer | 0.196
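The synonym extraction described above can be illustrated with a minimal sketch. It assumes that each word is represented by a vector of co-occurrence significance values and that candidates are ranked by cosine similarity to the source word. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equally long numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)

def most_similar(source, cooc, top_n=3):
    """Rank all words by cosine similarity of their co-occurrence
    vectors to the vector of `source` (highest first)."""
    scores = {w: cosine_similarity(cooc[source], vec) for w, vec in cooc.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

# Toy co-occurrence vectors (invented values, for illustration only).
cooc = {
    "flüchtling": [0.9, 0.1, 0.4],
    "asylbewerber": [0.8, 0.2, 0.5],
    "wetter": [0.0, 0.9, 0.1],
}
ranking = most_similar("flüchtling", cooc)
```

As in Table 1, the source word itself ranks first with a similarity of 1, followed by the words whose contexts are most similar.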

2.3 Learn Reference

Next, we set a reference time range that serves as our reference (training) set. In more detail, we used 8,067 documents concerning refugees published between 2013-01-01 and 2014-12-31, which resulted in 386,632 sentences. We then applied pre-processing (lowercasing and removal of stopwords, numbers, and hyphenation) and used uni- and bigrams for our analysis. A sentence-term matrix stm was created from the pre-processed text, resulting in a set of 106,285 features. In order to model the standard context of multiple words with one variable, we created a synthetic column representing the set of target words. Each entry in this column is either 0 or 1. The calculation is shown in formula 1.

$$
stm_{i,syn} =
\begin{cases}
1 & \text{if } \sum_{k=1}^{m} stm_{(i,k)} \geq 1 \\
0 & \text{otherwise}
\end{cases}
\qquad (1)
$$

with $m$ = number of words in the target set.
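Formula 1 can be sketched in a few lines. The sketch assumes a binary sentence-term matrix stored as one set of features per sentence, so that the condition "sum over the target columns is at least 1" becomes "the sentence shares at least one feature with the target set". The helper names and the toy sentences are hypothetical, and the sentences are assumed to be already pre-processed.

```python
def ngrams(tokens, n_max=2):
    """Return all uni- and bigrams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def sentence_term_matrix(sentences):
    """Binary sentence-term matrix: one set of features per sentence."""
    return [set(ngrams(tokens)) for tokens in sentences]

def synthetic_column(stm, target_words):
    """Formula 1: entry i is 1 if sentence i contains at least one
    target word, and 0 otherwise."""
    targets = set(target_words)
    return [1 if row & targets else 0 for row in stm]

# Toy sentences after pre-processing (invented, for illustration only).
sentences = [
    ["syrer", "leben", "berlin"],
    ["senat", "berlin"],
    ["migrant", "asylbewerber", "hamburg"],
]
target_words = ["flüchtling", "syrer", "asylbewerber", "migrant"]

stm = sentence_term_matrix(sentences)
syn = synthetic_column(stm, target_words)
```

The third sentence contains two target words but still receives the value 1, which is what lets a single column stand for the whole target set.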

Next, we calculated the co-occurrence significance for the synthetic word that represents our target set of words. The result is a vector whose length equals the vocabulary size, where each entry is the significance value between the particular word and our target set. The words most likely to occur in the same context as refugees are shown in Table 2. The calculated Dice significance values were normalized to make them suitable as weights for the distance measure.

Table 2: Top co-occurrences of the target word set with their Dice significance and the normalized reference weight.

word | Dice significance | weight
deutschland | 0.061 | 1
oranienplatz | 0.059 | 0.976
berlin | 0.043 | 0.711
leben | 0.043 | 0.704
schule | 0.041 | 0.669
senat | 0.037 | 0.61
unterbringung | 0.034 | 0.563
hamburg | 0.033 | 0.549
zahl | 0.033 | 0.542
italien | 0.032 | 0.524
syrien | 0.031 | 0.513
land | 0.031 | 0.51
migration | 0.027 | 0.451
millionen | 0.026 | 0.432
untergebracht | 0.026 | 0.43
migration flüchtlinge | 0.025 | 0.413
europa | 0.025 | 0.405
derzeit | 0.024 | 0.4
stadt | 0.024 | 0.397
gruppe | 0.024 | 0.394
aufnahme | 0.024 | 0.391
aufnehmen | 0.023 | 0.383
lampedusa | 0.023 | 0.377
syrische | 0.023 | 0.37
taz | 0.022 | 0.363
polizei | 0.022 | 0.361
bremen | 0.021 | 0.353
flüchtlinge oranienplatz | 0.021 | 0.353
spd | 0.021 | 0.35
syrische flüchtlinge | 0.021 | 0.343
bezirk | 0.021 | 0.341
bleiben | 0.021 | 0.34
bundesamt | 0.021 | 0.338
unterkunft | 0.02 | 0.33
unterbringung flüchtlingen | 0.02 | 0.325
wohnungen | 0.019 | 0.316
bundesamt migration | 0.019 | 0.315
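The reference weights can be sketched as follows. This is an illustration under two assumptions that the text does not state explicitly: that "Dice significance" denotes the standard Dice coefficient over sentence counts, and that normalization means dividing by the largest value. The second assumption is consistent with Table 2 up to rounding of the printed values, for example 0.043 / 0.061 ≈ 0.705 for "leben". Function names and toy data are hypothetical.

```python
from collections import Counter

def dice_with_target(stm, syn):
    """Dice coefficient 2 * n_joint / (n_word + n_target) between every
    feature and the synthetic target column.

    stm: list of feature sets, one per sentence
    syn: list of 0/1 flags, 1 if the sentence contains a target word
    """
    n_target = sum(syn)
    n_word = Counter()   # sentences containing the feature
    n_joint = Counter()  # sentences containing feature and a target word
    for row, flag in zip(stm, syn):
        for feature in row:
            n_word[feature] += 1
            if flag:
                n_joint[feature] += 1
    return {f: 2 * n_joint[f] / (n_word[f] + n_target) for f in n_word}

def normalize(dice):
    """Scale all values so that the strongest co-occurrence has weight 1."""
    top = max(dice.values())
    return {f: (v / top if top > 0 else 0.0) for f, v in dice.items()}

# Toy data (invented); target words are left out of the feature sets.
stm = [{"berlin", "leben"}, {"berlin"}, {"hamburg"}, {"berlin", "senat"}]
syn = [1, 0, 1, 1]

dice = dice_with_target(stm, syn)
weights = normalize(dice)
```

In the toy data, "berlin" occurs in two of the three target sentences and once elsewhere, so it gets the highest Dice value and the normalized weight 1.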

2.4 Apply Distance Measure

We define distance as the amount of unpredictability of a given sentence. We use a measure close to Context Volatility (Kahmann et al., 2017) to calculate a score for every sentence in the test set, quantifying how much this sentence differs from what was seen in the reference corpus. In contrast to Context Volatility (CV), we measure the context change for a set of words within a single sentence, whereas CV works on a single word across a set of documents over time. The distance $d_i$ for the $i$-th sentence $s_i$ is calculated by combining the sum $\phi_i$ and the mean $\tau_i$ of the word distances per sentence:

$$\phi_i = \sum_{j=1}^{n} \left(1 - ref(w_j)\right) \qquad (2)$$

$$\tau_i = \frac{1}{n} \sum_{j=1}^{n} \left(1 - ref(w_j)\right) \qquad (3)$$

with $n$ words $w_j$ in $s_i$ and $ref(w_j)$ representing the normalized reference significance weight for the $j$-th word in sentence $s_i$. The normalized weights reflect the relative likelihood of a word appearing together with the target words inside a sentence.

Using the mean alone, sentences consisting of only one unknown word would be ranked highest. Using the sum of word distances alone would lead to a preference for very long sentences. Therefore, we combine both values using a factor $\lambda$. To make the combination work, we need to map the sum of word distances per sentence to the same interval as the mean values, $[0, 1]$:

$$\phi_i^{norm} = \frac{\phi_i}{\max(\phi)} \qquad (4)$$

Our results are created with $\lambda = 0.90$. We use a large $\lambda$ because the calculated values for the mean $\tau$ are very dense ($\mu = 0.923$, $\sigma = 0.059$), whereas the sums reveal a much higher variation ($\mu = 0.186$, $\sigma = 0.105$).

$$d_i = \lambda \cdot \tau_i + (1 - \lambda) \cdot \phi_i^{norm} \qquad (5)$$

The results lie between 0 and 1. We see the maximal distance of 1 when a sentence with the maximum number of words (in the test set) contains only words that have never co-occurred with any of the target words in the reference corpus. The distance for a sentence that only uses the word "Deutschland" (reference weight 1) together with one of the target words is 0.

Table 3: Example sentence: "Sie sind Flüchtlinge vom Oranienplatz." The missing words are removed during pre-processing in the stopword removal step. The calculated distance for this sentence is 0.3325, which is the shortest distance to the learned reference.

word | oranienplatz | flüchtlinge oranienplatz
distance | 0.024 | 0.647
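The distance calculation of formulas 2 to 5 can be illustrated with a short sketch. It assumes that a word without a reference weight is treated as having weight 0, and therefore word distance 1, and that the sum is normalized by the largest sum among the sentences passed in. The function names and the unseen toy words are hypothetical. The weights for "deutschland", "oranienplatz" and "flüchtlinge oranienplatz" are taken from Table 2.

```python
def word_distances(words, ref):
    """Distance 1 - ref(w) for every word; words missing from the
    reference are assumed to have weight 0, i.e. distance 1."""
    return [1.0 - ref.get(w, 0.0) for w in words]

def sentence_distances(sentences, ref, lam=0.9):
    """Combine the mean (tau) and the max-normalized sum (phi) of the
    word distances of each sentence, weighted by lam (formulas 2 to 5)."""
    dists = [word_distances(s, ref) for s in sentences]
    phi = [sum(d) for d in dists]                         # formula 2
    tau = [sum(d) / len(d) if d else 0.0 for d in dists]  # formula 3
    max_phi = max(phi, default=0.0)
    phi_norm = [p / max_phi if max_phi > 0 else 0.0 for p in phi]  # formula 4
    return [lam * t + (1 - lam) * p for t, p in zip(tau, phi_norm)]  # formula 5

ref = {
    "deutschland": 1.0,
    "oranienplatz": 0.976,
    "flüchtlinge oranienplatz": 0.353,
}
sentences = [
    ["deutschland"],                                # only the top reference word
    ["oranienplatz", "flüchtlinge oranienplatz"],   # features from Table 3
    ["unbekannt1", "unbekannt2", "unbekannt3"],     # longest, all unseen
]
distances = sentence_distances(sentences, ref)
```

The first sentence yields distance 0 and the longest all-unseen sentence yields distance 1, matching the two boundary cases described in the text. The value for the Table 3 sentence does not reproduce the reported 0.3325, because the maximum sum of the real test set is not given and the toy maximum is used instead.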

2.5 Indicators for Statements

Applying the algorithm enables the discovery of sentences that deviate from the "standard" defined for a certain topic. In this paper, we are especially interested in statements by politicians or other prominent agents. We therefore limit our analysis to sentences that contain an indicator of a statement. As a filter, we use the presence of words like "sagte" (said) and "behauptet" (asserted) as well as quotation marks. This reduced our analysis base from 14,894 sentences containing at least one of the target words to 3,126 sentences containing both target word(s) and a statement indicator during the defined test period.
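A statement filter of this kind might look like the following sketch. Only the two indicator words named above are included, so the word list is deliberately incomplete. The set of quotation mark characters and the function names are assumptions made for illustration.

```python
import re

# Indicator words named in the text; the full list used is not given.
INDICATOR_WORDS = re.compile(r"\b(sagte|behauptet)\b", re.IGNORECASE)
# Straight and typographic quotation marks (assumed character set).
QUOTATION_MARKS = re.compile(r'["„“”«»]')

def has_statement_indicator(sentence):
    """True if the sentence contains an indicator word or a quotation mark."""
    return bool(INDICATOR_WORDS.search(sentence)
                or QUOTATION_MARKS.search(sentence))

def filter_statements(sentences):
    """Keep only sentences that carry a statement indicator."""
    return [s for s in sentences if has_statement_indicator(s)]

examples = [
    "Der Minister sagte, die Lage sei ernst.",   # invented example
    "„Deutschland, Deutschland“, skandieren die Flüchtlinge.",
    "Wie viele Flüchtlinge sind in Deutschland?",
]
statements = filter_statements(examples)
```

The filter is applied to raw sentences, before pre-processing removes punctuation. Applying it after the distance calculation or before it gives the same set of sentences, since it does not change the scores.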

3 RESULTS

We apply the described mechanism to articles published between 2015-01-01 and 2015-12-31.

Table 4: Result excerpt of sentences with highest and lowest distance.

sentence s_i | distance
Das Online-Magazin Vice titelt "Warnschüsse gegen Flüchtlinge : | 0.9909
Flüchtlinge wollen vom Tellerwäscher zum Millionär werden, da ist keine Zeit für irgendein Ministeramt. | 0.9909
"Oft ging es um Flüchtlinge, die von Schleuserbanden ausgeraubt oder zusammengeschlagen wurden. | 0.9907
Als Motiv gab der Feuerwehrmann an, keine Flüchtlinge in seinem Wohnumfeld haben zu wollen. | 0.9907
Clowns quälen Flüchtlinge. | 0.9904
Es gibt immer mehr davon, Bildungserfolg von Migranten ist doch nichts Exotisches mehr. | 0.9904
Jetzt wollen diese Schreckensgestalten auch noch bedauernswerte Flüchtlinge quälen. | 0.9903
"Er hat mit dem Tötungsdelikt an dem Asylbewerber aber nichts zu tun. | 0.9902
... | ...
Syrer sollen nicht alle nach Deutschland kommen. | 0.4812
"Jeder dritte Syrer und jeder fünfte Afrikaner will nach Deutschland", sagt er. | 0.4762
Wie viele Flüchtlinge sind in Deutschland? | 0.3966
Sie sind Flüchtlinge vom Oranienplatz. | 0.3325

We are able to find sentences with words that reveal new contexts. With this basic approach, we cannot guarantee that every sentence found is extreme in the sense of a statement outside the Overton window (see Table 4: "Bildungserfolg von Migranten", educational success of migrants). Nevertheless, we find a large number of sentences, and especially statements, that seem to exceed the boundaries of the Overton window in an extreme manner (see Table 5: "Warnschüsse gegen Flüchtlinge", warning shots against refugees).

KDIR 2019 - 11th International Conference on Knowledge Discovery and Information Retrieval

3.1 Reduction on Statements

The same holds for the extracted statements. The filter for statement indicators works: the sentences found reflect opinions and statements of different agents, and many of them are of interest when searching for extreme assertions.

Table 5: Result excerpt of sentences with highest and lowest distances after filtering for statements.

sentence s_i | distance
Das Online-Magazin Vice titelt "Warnschüsse gegen Flüchtlinge : | 0.9909
"Oft ging es um Flüchtlinge, die von Schleuserbanden ausgeraubt oder zusammengeschlagen wurden. | 0.9907
"Er hat mit dem Tötungsdelikt an dem Asylbewerber aber nichts zu tun. | 0.9902
Auf Ausländer schmeißt man Steine "Wir sollten gar nicht hier sein", sagt Phillipp. | 0.9901
Diese Migranten sind wie Kakerlaken", steht schon in der Unterzeile." | 0.9901
"Ich nehm die Asylbewerber mit", sagt ein Passant. | 0.9892
"Wir können Ausländer klatschen! | 0.9891
Premierminister David Cameron bezeichnete die Migranten als "Menschenschwärme". | 0.9889
... | ...
"Aber ich habe in den Nachrichten gehört, dass jeder dritte Afrikaner und jeder fünfte Syrer nach Deutschland will". | 0.7512
"Immer mehr Flüchtlinge leben deshalb illegal im Land", sagt Beuze. | 0.7258
"Deutschland, Deutschland", skandieren die Flüchtlinge. | 0.6533
"Jeder dritte Syrer und jeder fünfte Afrikaner will nach Deutschland", sagt er. | 0.4765

4 LIMITATIONS

A major problem that must be addressed in further work is that the use of new words does not necessarily reflect an opinion outside the Overton window. Co-occurrences are often given a high distance value simply because the combination was not used in the previous period. This non-use, however, cannot be attributed to the fact that it was politically incorrect to make such a statement during the reference period, but rather to random effects. This effect increases as the amount of text decreases.

So far, only the changes from a reference period to a test period have been considered. According to the theory of the Overton window, however, statements outside the window should also shift future public opinion in the direction that the statement indicates. The next step would therefore be to estimate this shift. A possible approach is to look at a time window after a statement that shows a large distance to the reference. If similar behavior is found in the subsequent documents, the statement may have had a high impact on public opinion and may have shifted the Overton window, which makes it an even more interesting statement to extract.

The two-sided orientation of the Overton window, which models political extremes (left vs. right, pro-refugee vs. contra-refugee, ...), has not yet been considered. So far, changes from the reference have not been classified into political extremes; only the deviation itself has been used for further analysis. Categorizing statements into political positions would require a better semantic understanding of the statements. Various approaches are conceivable here (e.g. SVM classification, semantic embeddings, ...).

Another promising direction is the diachronic view of the Overton window: determining from the data what appears to be socially accepted at what time and what does not, and identifying when the window changes strongly and when it remains stable.

In addition, it is imperative to develop a way of evaluating the described measurements of the Overton window, as there is no gold standard yet that places political statements in the dimensions of the Overton window. To check the validity of the calculations, a synthetic data set could be used that adequately models the underlying dynamics of the Overton window. However, the final evaluation will always require the cooperation and judgment of a domain expert (e.g. a political scientist).


In summary, working on text data under the basic assumption of the Overton window opens up many interesting analyses, which remain to be carried out in future work. Nevertheless, the first results are already promising.

5 FURTHER WORK

One of the next changes is the shift from co-occurrence statistics to embeddings. At the moment, we depend on the mere presence and identical use of a number of "learned" terms from the reference period. By using embeddings in general, and sentence embeddings (Wieting et al., 2016) in particular, we hope to overcome some of the limitations mentioned above. For example, embeddings might make it possible to assign political statements to different camps. Sentence embeddings would also help with the problem of different sentence lengths in the distance calculation, since a distance measure can be applied to vectors of the same dimension regardless of the length of the sentence.

6 CONCLUSION

The mechanism uses basic co-occurrence statistics and nonetheless enables the detection of unusual contexts around a set of target words. It thereby allows us to locate statements that may exceed the limits of the Overton window and, as a consequence, shift the political discourse in society. More advanced approaches, such as embeddings (Mikolov et al., 2013), might generate even more reasonable results. Assigning the resulting sentences to one of the two basic attitudes within a political discourse is an issue that needs to be addressed in future work.

REFERENCES

Bugaric, B. and Kuhelj, A. (2018). Varieties of populism in Europe: Is the rule of law in danger? Hague Journal on the Rule of Law, 10(1):21–33.

Kahmann, C., Niekler, A., and Heyer, G. (2017). Detecting and assessing contextual change in diachronic text documents using context volatility. In Fred, A. L. N. and Filipe, J., editors, Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - (Volume 1), Funchal, Madeira, Portugal, November 1-3, 2017, pages 135–143. SciTePress.

Lehman, J. (2018). A brief explanation of the Overton window.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G., and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. CoRR, abs/1310.4546.

Wieting, J., Bansal, M., Gimpel, K., and Livescu, K. (2016). Towards universal paraphrastic sentence embeddings. In ICLR.
