Furthermore, our tool is able to compile data sets that
integrate a representative sample of tweets from the
recent past with present-day tweet messages that are
captured in real time, in order to grant insight into both
historical and current tweet posting behavior. Finally,
the accumulated data collections can be aggregated
and studied on either a per-day or per-hour basis to
provide some degree of analytical granularity. We
argue that, combined, these features offer the means
needed to conduct meaningful research into the
geographical sources of Twitter data. We will back
this claim by presenting the results of two prototypi-
cal analyses that illustrate the versatility, effectiveness
and comprehensiveness of the proposed instrument.
At the same time, the provided demonstrations serve
as proof of the broad applicability of TweetPos:
courtesy of its generic methodology, it can cater
to the demands of a variety of human investigators,
including social researchers, marketers, analysts
and journalists.
A key aspect of the TweetPos solution is
its emphasis on providing graphical representations of
the crawled Twitter data. Unlike computers, the
typical human mind does not excel at handling large
quantities of raw data. On the other hand, our
cognitive faculties make us more adept than computers at
interpreting visual data structures (Pinto et al., 2010)
like heatmaps and charts, which are exactly the output
modalities that are supported by our platform. The
TweetPos tool is hence intended to offer human op-
erators an adequate graphical workspace that allows
them to readily and conveniently assess geo-spatial
trends in social media contributions.
The remainder of this article is organized as fol-
lows. Section 2 presents an overview of the functional
features of the TweetPos web service. Next, Section 3
covers the architectural design and implementation
of the tool. We then evaluate our work in Section 4
by discussing some representative examples of inves-
tigations into the geographical evolution of recently
trending Twitter themes that have been produced with
the proposed tool. Section 5 briefly reviews related
work on the analysis and mining of information that
has been shared via social networks, and at the same
time highlights our scientific contributions. Finally,
we draw our conclusions and suggest potential future
research directions in Section 6.
2 TweetPos
The TweetPos instrument is implemented as a web
service that is accessible via a standard web browser.
Screenshots of the tool’s input widgets are bundled
in Figure 1. As these images illustrate, keywords
or so-called Twitter hashtags are the service’s essential
input parameters. Based on the specified topic
of interest, the tool will compile a corpus of tweets
that deal with this subject. This corpus will encom-
pass a representative sample of historical messages as
well as a completely accurate set of current and future
tweets on the topic at hand. The user is hereby granted
the option to apply geographical filtering by limit-
ing the tweet compilation to either Europe or North
America, if so desired (see Figure 1(b)). An identi-
cal filtering option is included in the input pane that
controls the visualization of the accumulated data (see
Figure 1(c)). Finally, a number of standard HTML input
elements allow for controlling the temporal constraints
and the animation of the result set. In particular,
via two HTML sliders and a checkbox, users can specify
the discrete time interval within which the timestamps
of gathered tweets must fall for them to be included
in the output. Two fixed levels of granularity are
supported for the specification
of the temporal constraints, which cause TweetPos
to aggregate filtered tweets per hour and per day, re-
spectively. An animation engine that utilizes either
hourly or daily increments allows for the animated,
video-like presentation of the tweet data set and as
such might yield valuable insights into the geo-spatial
trends that are exhibited by tweet topics over time.
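The temporal filtering and the per-hour or per-day aggregation described above can be illustrated with a minimal Python sketch. It is not the TweetPos implementation: the function names (bucket_start, aggregate), the half-open interval convention and the sample tweets are assumptions made purely for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: (timestamp, text) pairs, not taken from the paper.
TWEETS = [
    (datetime(2013, 6, 1, 9, 15), "first"),
    (datetime(2013, 6, 1, 9, 45), "second"),
    (datetime(2013, 6, 1, 17, 5), "third"),
    (datetime(2013, 6, 2, 8, 30), "fourth"),
]


def bucket_start(ts, granularity):
    """Truncate a timestamp to the start of its hour or its day."""
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError("granularity must be 'hour' or 'day'")


def aggregate(tweets, start, end, granularity):
    """Count tweets with start <= timestamp < end, grouped per hour or per day.

    The half-open interval is an assumption of this sketch.
    """
    counts = Counter(
        bucket_start(ts, granularity)
        for ts, _text in tweets
        if start <= ts < end
    )
    # Sort by bucket so the result can feed a chart or an animation loop.
    return dict(sorted(counts.items()))


per_day = aggregate(TWEETS, datetime(2013, 6, 1), datetime(2013, 6, 3), "day")
per_hour = aggregate(TWEETS, datetime(2013, 6, 1), datetime(2013, 6, 2), "hour")
```

Iterating over the sorted buckets in order would correspond to stepping through the data set in hourly or daily increments, which is the idea behind the animated presentation.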
On the output front, the principal GUI element
consists of a topographic map that supports heatmap-
based visualization of the geo-spatial provenance of
filtered Twitter messages. Stated differently, this
output component displays the geographic intensity of
tweets that contain the specified input keyword.
Besides a map, two additional output widgets are
included in the tool. The first
is a line chart that visualizes the quantitative volume
of the compiled tweet archive, aggregated either on
a per-hour or a per-day basis, while the second enu-
merates the textual contents of the collected tweets.
Figure 2 illustrates the TweetPos output interface.
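To make the notion of geographic tweet intensity concrete, the following Python sketch bins tweet coordinates into a regular latitude/longitude grid and normalises the counts. The text does not specify how TweetPos computes its heatmap, so the grid-binning approach, the cell size and the names heat_grid and normalise are assumptions, as are the sample coordinates.

```python
from collections import Counter
import math


def heat_grid(points, cell_deg=1.0):
    """Count (lat, lon) points per grid cell of cell_deg degrees."""
    grid = Counter()
    for lat, lon in points:
        # floor() keeps negative coordinates in the correct cell.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        grid[cell] += 1
    return grid


def normalise(grid):
    """Scale counts to [0, 1] so that the busiest cell has intensity 1.0."""
    peak = max(grid.values(), default=0)
    if peak == 0:
        return {}
    return {cell: count / peak for cell, count in grid.items()}


# Hypothetical tweet origins as (latitude, longitude) pairs.
POINTS = [(50.93, 5.34), (50.85, 5.69), (51.22, 4.40), (40.71, -74.01)]
intensity = normalise(heat_grid(POINTS))
```

A rendering layer could then map each cell's intensity to a color; a production heatmap would more likely apply smoothing around each point rather than hard grid cells.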
An important feature of TweetPos is its keyword
layering functionality. The tool allows multiple key-
word filters to be active simultaneously, by conceptu-
ally associating (the results of) each concurrent hash-
tag search with an individual layer. Figures 2(a)
and 2(b) for instance illustrate a setup in which two
queries are involved. Layers are rendered on top of
the topographic map as uniquely colored overlays,
whose visualization can be independently toggled on
and off. Analogously, distinct tweet volumes are plot-
ted in the line graph for each currently deployed key-
word filter. A layer can be eliminated from the visu-
alization process via the legend that is incorporated in
WEBIST 2014 - International Conference on Web Information Systems and Technologies