of those systems will improve the reliability classifier.
Various suggested improvements are listed in the cor-
responding module sections and discussion chapter in
the thesis.
7.2 Performance on Other Datasets
In this paper, we have shown that our approach achieves
good performance on the dataset and fact classes we
introduced. A very interesting, and perhaps the most
important, question is how these features and this
performance carry over to other datasets and other
types of facts. Our hypothesis is that our features
work well on datasets similar to ours: datasets in
which facts are repeated and originate from different,
independent sources. In such datasets, a false fact is
countered by many independent ‘correct’ sources, and
because the facts can be verified against those sources,
false facts can also be countered by replies. Criticizing
falsities may be universal, but we expect large
differences in when people react to a fact and when
they do not. For example, in this dataset we have seen
that reactions to false scores are far more common
than reactions to incorrect minutes. A possible reason
is that people consider these errors too insignificant to
react to, or that they are unaware of the falsity because
they do not know the exact truth. Another important
observation is that people are more likely to react to
authoritative and popular Twitter users. Many unknown
users could spread false facts without getting a reaction
from their small group of followers, in contrast to
popular and authoritative users, who, in the eyes of
their follower base, should be right. If such users are
incorrect, many followers could potentially react to
the error.
7.3 Performance in a Real-time Situation
A very interesting scenario is how this prototype, if
minimally altered, would perform in a live situation.
In the thesis (Janssen, 2016), we describe this scenario
and, most interestingly, alter the reliability classifier
in such a way that it reevaluates its verdict over time.
More details can be found in the thesis.
8 CONCLUSION
Research on veracity in social media is extremely
important. Building on this research, systems can be
designed that serve as tools to filter out misinformation
in times of crisis, or as filter applications for systems
that use social media messages as a source of
information. Research in this direction is still scarce,
but recent work such as ClaimFinder (Cano et al., 2016)
and the Pheme project shows the increasing interest in
this field. With the impact of fake news now widely
recognized, society has pressured social media
websites to address the problem, and several have
responded; for example, Facebook has reported that it
will use AI and user reports to counter the problem
(Tech Crunch, 2016). Although we did not actively look
into the detection of fake news, our recommendation
would be to keep our architecture (and some of our
features) and add features related to the work of
Vakulenko et al. (2016).
In this paper, we have presented a system consisting
of four parts, trained specifically on a dataset
containing tweets about the World Cup. The first
component is a filter that uses a rule-based classifier
to prevent irrelevant tweets from entering the rest of
the system; of the original 64 million tweets, 3 million
are filtered. The second component is the fact
classifier, a feature-based J48 classifier that
recognizes which types of facts a tweet contains. The
third component is the fact extractor, which extracts
the facts from the tweet; its main parts are the entity
locators and extractors and the fact-class-specific
extractors, which all use different strategies and tools
to extract their respective facts. The fourth and final
component is the reliability classifier, a feature-based
classifier that determines whether a tweet contains a
false fact; it is implemented with features that capture
the popularity and reach of the facts in a tweet as well
as the number of replies to a tweet. The fact classifier
achieves an F1-score of 0.96, the fact extractor an
F1-score of 0.85, and the reliability classifier an
F1-score of 0.988 on class A (tweets with zero false
facts) and 0.867 on class B (tweets with one or more
false facts).
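The four-stage architecture summarized above can be sketched as a simple pipeline. The sketch below is purely illustrative: all function names, rules, and thresholds are hypothetical stand-ins, not the actual implementation (which uses a rule-based filter, a trained J48 fact classifier, per-class extractors, and a feature-based reliability classifier).

```python
# Illustrative sketch of the four-stage pipeline; every rule and
# threshold below is a hypothetical placeholder for the trained
# components described in the paper.

def rule_based_filter(tweet):
    """Stage 1: keep only tweets that look like match reports (assumed rules)."""
    keywords = ("goal", "score", "min")
    return any(k in tweet["text"].lower() for k in keywords)

def classify_fact_types(tweet):
    """Stage 2: decide which fact classes the tweet contains.

    The real system uses a feature-based J48 classifier here; this
    sketch substitutes trivial surface cues.
    """
    types = []
    if "-" in tweet["text"]:
        types.append("score")
    if "'" in tweet["text"]:
        types.append("minute")
    return types

def extract_facts(tweet, fact_types):
    """Stage 3: run a class-specific extractor per fact type (stubbed)."""
    return [{"type": t, "value": None, "tweet": tweet["id"]} for t in fact_types]

def is_reliable(fact, replies, followers):
    """Stage 4: reliability verdict from popularity/reply features (placeholder rule)."""
    return replies == 0 or followers > 1000

def run_pipeline(tweets):
    facts = []
    for tweet in tweets:
        if not rule_based_filter(tweet):
            continue
        for fact in extract_facts(tweet, classify_fact_types(tweet)):
            fact["reliable"] = is_reliable(fact, tweet["replies"], tweet["followers"])
            facts.append(fact)
    return facts

tweets = [
    {"id": 1, "text": "GOAL! 2-1 in the 89'", "replies": 0, "followers": 50},
    {"id": 2, "text": "lovely weather today", "replies": 0, "followers": 50},
]
print(run_pipeline(tweets))
```

The point of the sketch is the data flow, not the rules: each stage narrows or annotates the stream, so any component can be retrained or replaced independently.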
As shown in various parts of the thesis, there is much
room for improvement; in particular, improved entity
extraction could substantially increase the recall of
several components.
REFERENCES
Bram Koster, M. (2014). Journalisten: social media niet
betrouwbaar, wel belangrijk #sming14.
Cano, A. E., Preotiuc-Pietro, D., Radovanović, D., Weller,
WEBIST 2017 - 13th International Conference on Web Information Systems and Technologies