TweetPos: A Tool to Study the Geographic Evolution of Twitter Topics

Maarten Wijnants, Adam Blazejczak, Peter Quax and Wim Lamotte

Hasselt University - tUL - iMinds, Expertise Centre for Digital Media,

Wetenschapspark 2, 3590 Diepenbeek, Belgium

Keywords:

Twitter, Social Networking Sites (SNSs), Social Media, TweetPos, Geographic Trends, Investigative Tool.

Abstract:

Popular Social Networking Sites (SNSs) like Twitter and Facebook are evolving into crowd-sourced, inter-

disciplinary sensor systems that “monitor” a wide spectrum of (physical) properties and topics. This paper

introduces TweetPos, a web service that is intended to facilitate the analytical study of geographic tendencies

in Twitter data feeds. To accommodate human cognitive strengths, the TweetPos tool relies heavily on visual

data structures like heatmaps and charts to represent the geo-spatial sources of tweets. The tool compiles data

bodies that grant insight into both past and present tweet posting behavior, incorporates an animation engine

to highlight temporal trends, and leverages layered visualization techniques so that multiple topics can be

offset against each other, all from a geographic perspective. Via the presentation of two representative use

cases, we comprehensively demonstrate TweetPos’ data mining and analytical features and we illustrate the

(geo-spatial) intelligence they can amount to. Thanks to a generic implementation, the TweetPos service is

not geared towards a speciﬁc target audience but instead is sufﬁciently versatile to be valuable for a vast and

varied collection of consumer proﬁles like social scientists and market analysts.

1 INTRODUCTION

Social Networking Sites (SNSs) were conceived as a

means to virtually connect users and to offer them an

intuitive forum to ubiquitously contribute and dissem-

inate information in real time. As their number of sub-

scribers rose over time, so did the amount of content

that is managed by SNSs. As a result, they nowadays

host a wealth of user-generated data that is highly het-

erogeneous in nature.

Over the years, SNSs have also evolved

functionality-wise. While many such services were

purely text-based upon their inception, they nowadays

typically grant users the option to attach multimedia

items like pictures and video clips to their contributions.

Another feature that has become nearly commonplace

in the SNS landscape is geotagging (i.e.,

attaching geographic coordinates as metadata to mes-

sages). It is apparent that such novel facilities embel-

lish the core SNS content and further extend its value.

Given their popularity and broad adoption, it is

becoming ever more valid to regard SNSs as real-life,

real-time and crowd-sourced sensor systems that gen-

erate valuable, highly heterogeneous data feeds (see,

for example, (Sakaki et al., 2010)). Stated differently,

popular present-day SNSs are rapidly transforming

into representative data providers. By intelligently

exploiting the data feeds that can be accumulated

from them, innovative and value-added services can

be conceived. In addition, mining and analyzing the

information that is shared by end-users through social

media can lead to valuable insights and knowledge.

Possible application domains include consumer be-

havior modeling, consumer proﬁling, intelligent rec-

ommendation systems, and population sentiment as-

sessment. Extracting such kinds of intelligence from

SNSs however typically requires external tools, as

profound mining and analysis mechanisms by default

are lacking from their feature set.

In this paper, we turn to Twitter, the authoritative

microblogging platform in the western world, and we

focus on investigating the data that is hosted by this

SNS from a geo-spatial perspective. In particular, we

introduce the web-based TweetPos tool, a convenient

means to display and study the geographic origin of

tweets, and to uncover the geographical evolution of

the popularity of tweet topics. A hybrid visualiza-

tion method encompassing both heatmap- and chart-

based data representation allows for thorough analy-

sis and mining with regard to the geo-spatial distribu-

tion of tweeted material over time. The TweetPos web

service affords keyword-based topic selection and in-

cludes a layering system that allows for easy compar-

ison of the geographical trends of multiple subjects.


Wijnants M., Blazejczak A., Quax P. and Lamotte W..

TweetPos: A Tool to Study the Geographic Evolution of Twitter Topics.

DOI: 10.5220/0004943502570266

In Proceedings of the 10th International Conference on Web Information Systems and Technologies (WEBIST-2014), pages 257-266

ISBN: 978-989-758-023-9

 2014 SCITEPRESS (Science and Technology Publications, Lda.)

Furthermore, our tool is able to compile data sets that

integrate a representative sample of tweets from the

recent past with present-day tweet messages that are

captured in real time, in order to grant insight into both

historical and current tweet posting behavior. Finally,

the accumulated data collections can be aggregated

and studied on either a per-day or per-hour basis to

provide some degree of analytical granularity. We ar-

gue that, combined, these features offer all necessary

means to perform meaningful research into the

geographical sources of Twitter data. We will back

this claim by presenting the results of two prototypi-

cal analyses that illustrate the versatility, effectiveness

and comprehensiveness of the proposed instrument.

At the same time, the provided demonstrations serve

as proof of the extensive applicability of TweetPos:

courtesy of its generic methodology, it may one way

or another cater to the demands of a variety of human

investigators, including social researchers, marketers,

analysts and journalists.

A key aspect of the TweetPos solution is

its emphasis on providing graphical representations of

the crawled Twitter data. Contrary to computers, the

typical human mind does not excel at handling large

quantities of raw data. On the other hand, our cog-

nitive features make us more adept than computers at

interpreting visual data structures (Pinto et al., 2010)

like heatmaps and charts, which are exactly the output

modalities that are supported by our platform. The

TweetPos tool is hence intended to offer human op-

erators an adequate graphical workspace that allows

them to readily and conveniently assess geo-spatial

trends in social media contributions.

The remainder of this article is organized as fol-

lows. Section 2 presents an overview of the functional

features of the TweetPos web service. Next, Section 3

covers the architectural design and implementation

of the tool. We then evaluate our work in Section 4

by discussing some representative examples of inves-

tigations into the geographical evolution of recently

trending Twitter themes that have been produced with

the proposed tool. Section 5 brieﬂy reviews related

work on the analysis and mining of information that

has been shared via social networks, and at the same

time highlights our scientiﬁc contributions. Finally,

we draw our conclusions and suggest potential future

research directions in Section 6.

2 TweetPos

The TweetPos instrument is implemented as a web

service that is accessible via a standard web browser.

Screenshots of the tool’s input widgets are bundled

in Figure 1. As these images illustrate, keywords

or so-called Twitter hashtags are the service’s essential

input parameters. Based on the specified topic

of interest, the tool will compile a corpus of tweets

that deal with this subject. This corpus will encom-

pass a representative sample of historical messages as

well as a completely accurate set of current and future

tweets on the topic at hand. The user is hereby granted

the option to apply geographical ﬁltering by limit-

ing the tweet compilation to either Europe or North

America, if so desired (see Figure 1(b)). An identi-

cal ﬁltering option is included in the input pane that

controls the visualization of the accumulated data (see

Figure 1(c)). Finally, a number of standard HTML in-

put elements allow for controlling the temporal con-

straints and the animation of the result set. In par-

ticular, via two HTML sliders and a checkbox, users

can enforce the discrete time interval with which (the

timestamps of) gathered tweets need to comply for

them to be included in the output. Two ﬁxed lev-

els of granularity are supported for the speciﬁcation

of the temporal constraints, which cause TweetPos

to aggregate ﬁltered tweets per hour and per day, re-

spectively. An animation engine that utilizes either

hourly or daily increments allows for the animated,

video-like presentation of the tweet data set and as

such might yield valuable insights into the geo-spatial

trends that are exhibited by tweet topics over time.
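
The temporal filtering and the two aggregation levels described above can be illustrated with a minimal Python sketch. It is not the tool's actual PHP/JavaScript code; the function names (`bucket_start`, `aggregate`) and the sample timestamps are assumptions made purely for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def bucket_start(ts: datetime, granularity: str) -> datetime:
    """Truncate a timestamp to the start of its hour or day."""
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {granularity!r}")


def aggregate(timestamps, start, end, granularity="hour"):
    """Count tweets per bucket, keeping only those with start <= ts < end."""
    counts = Counter(
        bucket_start(ts, granularity) for ts in timestamps if start <= ts < end
    )
    return dict(sorted(counts.items()))


# Hypothetical tweet timestamps (UTC), invented for this example.
tweets = [
    datetime(2013, 10, 15, 20, 5, tzinfo=timezone.utc),
    datetime(2013, 10, 15, 20, 40, tzinfo=timezone.utc),
    datetime(2013, 10, 15, 21, 10, tzinfo=timezone.utc),
    datetime(2013, 10, 16, 9, 0, tzinfo=timezone.utc),
]
window_start = datetime(2013, 10, 15, tzinfo=timezone.utc)
window_end = datetime(2013, 10, 17, tzinfo=timezone.utc)

per_hour = aggregate(tweets, window_start, window_end, granularity="hour")
per_day = aggregate(tweets, window_start, window_end, granularity="day")
```

Stepping through the sorted bucket keys one at a time would correspond to one frame of an hourly or daily animation.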

On the output front, the principal GUI element

consists of a topographic map that supports heatmap-based

visualization of the geographic origins of

ﬁltered Twitter messages. Stated differently, this out-

put component displays the intensity, from a geo-

graphic point of view, of tweets that encompass the

speciﬁed input keyword. Besides a map, two addi-

tional output widgets are included in the tool. The ﬁrst

is a line chart that visualizes the quantitative volume

of the compiled tweet archive, aggregated either on

a per-hour or a per-day basis, while the second enu-

merates the textual contents of the collected tweets.

Figure 2 illustrates the TweetPos output interface.

An important feature of TweetPos is its keyword

layering functionality. The tool allows multiple key-

word ﬁlters to be active simultaneously, by conceptu-

ally associating (the results of) each concurrent hash-

tag search with an individual layer. Figures 2(a)

and 2(b) for instance illustrate a setup in which two

queries are involved. Layers are rendered on top of

the topographic map as uniquely colored overlays,

whose visualization can be independently toggled on

and off. Analogously, distinct tweet volumes are plot-

ted in the line graph for each currently deployed key-

word ﬁlter. A layer can be eliminated from the visu-

alization process via the legend that is incorporated in

WEBIST2014-InternationalConferenceonWebInformationSystemsandTechnologies

258

(a) Topic selection.

(b) Geographical ﬁltering.

(d) Temporal constraints speciﬁcation and animation control.

Figure 1: TweetPos input GUI.

the geographic map. The layering system provides a

powerful means to investigate (the geo-spatial evolu-

tion of) multiple subjects concurrently, to offset them

against each other, to reveal potential correlations be-

tween them, and so on.

Apart from temporal ﬁltering parameters, the

TweetPos service also supports the speciﬁcation of

spatial constraints. This type of constraint is deployed

by clicking on the topographic map, which causes a

circular area to be drawn around the selected loca-

tion (see Figure 2(a)). The map’s zoom level and

the stretch of the marked geographical region have

been designed to be inversely proportional properties,

which implies that the spatial extent of the highlighted

area is controllable by zooming the map in and out. In

effect, installing a spatial constraint under a relatively

high zoom level will result in the selection of a rel-

atively tight geographical region, while the opposite

holds true when the map is heavily zoomed out.
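
The relation between zoom level and selected area can be sketched as follows. The text above does not specify the exact mapping, so this example assumes that the radius halves with every zoom level; the `base_radius_km` value, the function names and the sample coordinates are likewise hypothetical. Membership in the circular area is tested with the haversine great-circle distance.

```python
import math

EARTH_RADIUS_KM = 6371.0


def radius_km(zoom: int, base_radius_km: float = 2000.0) -> float:
    """Assumed mapping: the selection radius halves with every zoom level."""
    return base_radius_km / (2 ** zoom)


def haversine_km(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_constraint(points, center, zoom):
    """Keep the points that fall inside the circle around the clicked location."""
    limit = radius_km(zoom)
    return [
        p for p in points
        if haversine_km(center[0], center[1], p[0], p[1]) <= limit
    ]


# Approximate coordinates: click on Brussels; tweets near Hasselt, Paris, Madrid.
brussels = (50.85, 4.35)
points = [(50.93, 5.34), (48.86, 2.35), (40.42, -3.70)]

zoomed_out = within_constraint(points, brussels, zoom=0)   # radius 2000 km
zoomed_in = within_constraint(points, brussels, zoom=4)    # radius 125 km
```

With these assumed values, zooming in shrinks the selection from all three sample locations to the one nearest the clicked point.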

All output components are dynamic, in the sense

that their content is updated on-the-ﬂy when the user

modiﬁes one or more input parameters. Obviously

this applies to the keywords or hashtags that are

searched for. In particular, initiating a new search op-

eration causes an additional layer to be introduced in

(a) Heatmap-based output of tweet locations on a topographic map

(including a spatial constraint speciﬁcation).

(b) Tweet volume presentation as a line diagram.

Figure 2: TweetPos output GUI.

both the 2D map and the line chart. The interface also

responds in real time to less far-reaching input changes.

For example, using the HTML sliders

to modify the time constraints causes the map, the

line chart as well as the list of tweet messages to be

updated instantaneously. The map will be adjusted

to draw the geographic intensity that applied at the

speciﬁed timestamp, the volume plot will be updated

so that it correctly marks the currently selected time,

and the textual list will only display tweet messages

that satisfy the installed temporal restrictions. Analo-

gous actions are dynamically undertaken in reaction

to the deﬁnition of a spatial constraint. More pre-

cisely, the volume plot and textual message list only

TweetPos:ATooltoStudytheGeographicEvolutionofTwitterTopics

259

Figure 3: High-level system architecture.

take into account tweets that originated from the designated

spatial area, if any. This feature allows human

operators to zoom in on certain geographic regions

and to perform ﬁne-grained, localized analyses. As

a ﬁnal example of the dynamism of the output GUI,

switching between layers via the legend in the topo-

graphic map causes the contents of the textual tweet

enumeration widget to be updated so that it only dis-

plays those messages that apply to the keyword that

corresponds with the currently selected layer.

3 IMPLEMENTATION

The TweetPos implementation is completely web-

compliant. HTML and CSS are used for rendering the

GUI and for handling page layout and style, while all

programmatic logic is scripted in PHP and JavaScript

(at server and client side, respectively).

Our motivations for realizing the TweetPos appli-

cation as a web service are manifold. First of all,

selecting the web as deployment platform acknowl-

edges the pervasiveness of the Internet in modern so-

ciety. At the same time, it renders the TweetPos func-

tionality available on all environments and devices

that support widespread and standardized web tech-

nologies, which maximizes the portability of our im-

plementation. Finally, numerous utility libraries and

supportive tools exist for the web, which we have

gladly leveraged to expedite the development process.

3.1 Architectural Design

A schematic overview of TweetPos’ architectural

setup is given in Figure 3. TweetPos adopts a

client/server network topology. The back-end HTTP

server forms the heart of the system; it interfaces

with Twitter, implements the data ﬁltering and com-

pilation, hosts a relational database (RDBMS) for

data persistence purposes, and responds to incom-

ing HTTP requests. The client on the other hand is

very lightweight, as its responsibilities are limited to

user interfacing and data visualization. As such, the

server (and the RDBMS which it encapsulates) forms

a level of abstraction in the TweetPos system archi-

tecture between respectively the external information

source (i.e., Twitter) and the client-side presentation

of the disclosed data.

3.2 Twitter Data Collection

Twitter provides multiple HTTP-based APIs to en-

able third-party software developers to interface with

the platform and to build socially-inspired applica-

tions. The TweetPos tool exploits two of these APIs

in order to harvest both historical and up-to-date (pub-

lic) Twitter data. First of all, the Twitter Search API

(which is embedded in the Twitter REST API as of

version 1.1) is leveraged to compose a non-exhaustive

yet representative sample of tweets from the past 7

days that dealt with a particular subject. The quanti-

tative incompleteness is intrinsic to Twitter and rep-

resents a deliberate strategy in the platform’s design

(Twitter Developers, 2013). In effect, the Search API

has been designed for relevance and not complete-

ness, which implies that it is not intended to deliver

a rigorous index of past tweets. The second Twitter

interface that fuels TweetPos’ data collection proce-

dure is a low-latency gateway to the global stream of

tweets, called the Streaming API. This particular API

allows developers to set up a long-lived HTTP con-

nection to the Twitter back ofﬁce, over which tweets

from that moment on will then be streamed incre-

mentally. In combination with extensive ﬁltering and

querying mechanisms, applications in this way ob-

tain near-real-time and exhaustive access to exactly

the type of tweets they are interested in. To facilitate

the interaction with the Twitter Streaming API, the

TweetPos tool integrates the 140dev Streaming API

framework (140dev, 2013).

For the sake of comprehensiveness, we will now

describe the complete set of actions and operations

that constitute TweetPos’ data ingestion pipeline.

When a user initiates a new data collection process

by transmitting a keyword-based query to the Tweet-

Pos server, the latter will spawn a total of seven PHP

daemons. Together, these background processes utilize

the Twitter Search API to compile a pool of

relevant historical tweets that were contributed dur-

ing the past week (i.e., one process per day). At

the same time, the back-end server manages a (PHP-

based) daemon that permanently monitors the Twitter

Streaming API. As an end-point is only allowed to

set up a single connection to the Streaming API, this

background process runs a cumulative ﬁlter to guaran-

tee that all present and future tweets that satisfy one

of the currently active queries are captured. In con-

trast to the Search API daemons, which have a ﬁnite

execution time and are query-speciﬁc, the Streaming

API process runs indeﬁnitely and is shared by queries.

WEBIST2014-InternationalConferenceonWebInformationSystemsandTechnologies

260

A dedicated widget in the client-side GUI empowers

users to stop the real-time monitoring of a particular

topic (which is enforced by updating the cumulative

ﬁlter of the Streaming API daemon).
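
The two collection paths described above can be sketched in Python under some simplifying assumptions. The sketch does not contact Twitter and does not reproduce the PHP daemons; `search_windows` and `CumulativeFilter` are hypothetical names. The first function derives one date window per historical day (one per Search API worker), and the class maintains the union of active keywords that a single shared streaming connection would have to track.

```python
from datetime import date, timedelta


def search_windows(today: date, days: int = 7):
    """Return one (start, end) date pair per historical day, oldest first.

    Each pair covers the half-open interval [start, end), so seven workers
    could each harvest exactly one day of the past week.
    """
    return [
        (today - timedelta(days=offset), today - timedelta(days=offset - 1))
        for offset in range(days, 0, -1)
    ]


class CumulativeFilter:
    """Union of all active keywords, served by one shared streaming connection."""

    def __init__(self):
        self._active = set()

    def start(self, keyword: str) -> None:
        """Begin real-time monitoring of a keyword."""
        self._active.add(keyword.lower())

    def stop(self, keyword: str) -> None:
        """Stop monitoring a keyword; other queries stay untouched."""
        self._active.discard(keyword.lower())

    def keywords(self):
        return sorted(self._active)

    def matches(self, text: str) -> bool:
        """True if the text mentions at least one active keyword."""
        lowered = text.lower()
        return any(keyword in lowered for keyword in self._active)


windows = search_windows(date(2013, 10, 13))
stream_filter = CumulativeFilter()
stream_filter.start("ps4")
stream_filter.start("XboxOne")
stream_filter.stop("ps4")  # the user stops monitoring this topic
```

Keeping the keyword set in one place reflects the design constraint mentioned above: because only one streaming connection is available, starting or stopping a query amounts to updating the shared filter rather than opening or closing a connection.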

3.3 Data Storage and Processing

Fetched tweets are persisted at server side in a

MySQL database. To streamline the integration of

the 140dev framework in the TweetPos tool, we

have opted to integrally adopt its cache architecture

and accompanying database schema. The caching

mechanism of the 140dev framework applies a two-

step approach. An aggregation step continuously ﬁl-

ters JSON-encoded tweet data (including the actual

message and all sorts of metadata) from the Twit-

ter Streaming API and inserts the resulting data di-

rectly into a designated caching table in the back-end

database. In effect, this task is fulﬁlled by the Stream-

ing API daemon that was mentioned in Section 3.2.

Simultaneously, an independent background process

successively pulls single raw JSON items from this

table, parses and conveniently formats the composing

entities of the corresponding tweets (i.e., the textual

message itself, the encapsulated hashtags and men-

tions, etcetera), and distributes the outcome across

dedicated database tables. By isolating the aggre-

gation from the parsing of relevant tweets, real-time

and lossless data ingestion is guaranteed (the Twitter

Streaming API might yield tremendous quantities of

data, whose sheer volume might prohibit on-the-ﬂy

parsing and processing).

Besides leveraging the 140dev caching methodol-

ogy and database schema for the Streaming API con-

text of the TweetPos tool, we have decided to extend

their application to the Twitter Search API component

of our implementation. This entails that historical

tweets that are harvested by the Search API daemons

are just as well cached in raw JSON format and then

parsed by the same process that also handles Stream-

ing API contributions. The beneﬁcial implications of

this design are that it yields a clean software architec-

ture, ensures uniform treatment of tweets originating

from heterogeneous sources, and enables the elimina-

tion of data duplication in an integrated manner (i.e.,

without requiring an exogenous control loop).
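
A minimal sketch of this two-step approach is given below. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL, and its table and column names (`raw_cache`, `tweets`, `parsed`) are illustrative assumptions rather than the 140dev schema. Raw JSON is stored untouched in the first step; a separate parsing step later extracts the fields and relies on the primary key to discard tweets that were delivered by both APIs.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE raw_cache (
        cache_id INTEGER PRIMARY KEY,
        raw_json TEXT NOT NULL,
        parsed   INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE tweets (
        tweet_id INTEGER PRIMARY KEY,
        text     TEXT NOT NULL
    );
""")


def cache_raw(raw_json: str) -> None:
    """Aggregation step: store the payload as received, without parsing it."""
    db.execute("INSERT INTO raw_cache (raw_json) VALUES (?)", (raw_json,))


def parse_pending() -> int:
    """Parsing step: process all unparsed rows and return how many were handled."""
    rows = db.execute(
        "SELECT cache_id, raw_json FROM raw_cache WHERE parsed = 0"
    ).fetchall()
    for cache_id, raw_json in rows:
        tweet = json.loads(raw_json)
        # The primary key on tweet_id makes duplicate tweets a no-op.
        db.execute(
            "INSERT OR IGNORE INTO tweets (tweet_id, text) VALUES (?, ?)",
            (tweet["id"], tweet["text"]),
        )
        db.execute("UPDATE raw_cache SET parsed = 1 WHERE cache_id = ?", (cache_id,))
    db.commit()
    return len(rows)


# The same tweet (id 1) arrives once per source; it is stored only once.
cache_raw(json.dumps({"id": 1, "text": "#RodeDuivels kick-off"}))
cache_raw(json.dumps({"id": 2, "text": "#RodeDuivels goal"}))
cache_raw(json.dumps({"id": 1, "text": "#RodeDuivels kick-off"}))

processed = parse_pending()
stored = db.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
```

Because both sources write to the same cache table, deduplication happens inside the shared parsing step and needs no separate reconciliation pass.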

Once the data collection procedure for a particu-

lar keyword-based query has been initiated, all client

requests that are related to this query are handled at

server side by means of pure RDBMS interactions.

As an example, the execution of adequate SQL state-

ments sufﬁces for the server to be able to forward an

up-to-date overview of Twitter data pertaining to the

queried topic to the client.

3.4 Geocoding

As the TweetPos tool is chieﬂy concerned with the

geo-spatial provenance of tweets, it is clear that geographic

metadata plays a pivotal role in its operation.

To be more precise, geographic coordinates are

needed in order to pinpoint a tweet on a topographic

map. Some Twitter users include these coordinates di-

rectly in their posts (e.g., users with smartphones with

built-in GPS receivers), yet the majority only inserts

a descriptive representation of the involved location

(e.g., in the form of a textual address), or even leaves

out all geographic references altogether.

TweetPos’ data accumulation procedure is agnos-

tic of the presence of geo-spatial metadata in tweets.

Stated differently, tweets that lack any trace of ge-

ographical metadata are not ﬁltered out by either

the Streaming API or Search API data compiler.

Tweets holding exact geographic footprints are di-

rectly cached, as they can be readily localized on a

map. In case the tweet only incorporates a descrip-

tive geo-spatial reference, the data processing dae-

mon described in Section 3.3 will invoke the Google

Geocoding API (Google Developers, 2013b) to trans-

late the description into geographic coordinates prior

to database insertion. Finally, although non-localized

contributions are not exploitable in the current imple-

mentation, they are still recorded in the database “as

is” for the sake of completeness (i.e., they may hold

some value in future extensions of the tool).
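
The three cases above (exact coordinates, descriptive reference, no location at all) can be sketched as a small decision function. The tweet representation is a simplified, hypothetical dictionary, and `fake_geocode` is a local stub that stands in for a call to an external geocoding service; no real API is contacted and the coordinates in the lookup table are approximate.

```python
from typing import Callable, Optional, Tuple

Coordinates = Tuple[float, float]


def resolve_location(
    tweet: dict,
    geocode: Callable[[str], Optional[Coordinates]],
) -> Optional[Coordinates]:
    """Return (lat, lon) for a tweet, or None if it cannot be localized."""
    coords = tweet.get("coordinates")
    if coords is not None:
        return coords  # exact footprint: usable as is
    description = (tweet.get("location") or "").strip()
    if description:
        return geocode(description)  # descriptive reference: translate it
    return None  # non-localized tweet: still stored, but not mapped


def fake_geocode(description: str) -> Optional[Coordinates]:
    """Stub geocoder with a single hard-coded (approximate) entry."""
    lookup = {"diepenbeek, belgium": (50.91, 5.42)}
    return lookup.get(description.lower())


exact = resolve_location({"coordinates": (50.85, 4.35)}, fake_geocode)
described = resolve_location({"location": "Diepenbeek, Belgium"}, fake_geocode)
missing = resolve_location({"text": "no location here"}, fake_geocode)
```

Passing the geocoder in as a parameter keeps the decision logic testable without network access.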

3.5 Visualization

All visualization and GUI interaction operations are

performed at client side by means of HTML and

JavaScript.

3.5.1 Heatmap-based Geolocation Clustering

The topographic output map has been implemented

by means of the JavaScript variant of the Google

Maps API (Google Developers, 2013a). Tweets are

positioned on this map on the basis of the geo-

graphic location from which they were posted. In-

stead of marking (the location of) individual tweets on

the map, a heatmap-based design has been adopted.

Heatmaps are a general-purpose data visualization

technique in which the intensity of data points is plot-

ted in relative comparison to the absolute maximum

value of the data set. Typically, data point inten-

sity is indicated by means of a color coding scheme.

Compared to mashups of discrete markers (which

might easily clutter the map in the case of voluminous

data sets), heatmaps hold the perceptual advantage

TweetPos:ATooltoStudytheGeographicEvolutionofTwitterTopics

261

that, without sacriﬁcing much detail, they are natu-

rally surveyable and interpretable. The Google Maps

JavaScript API has built-in support for heatmap ren-

dering.
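
The principle of plotting intensity relative to the maximum of the data set can be illustrated with a short Python sketch. Grid binning with one-degree cells is a simplification assumed here for clarity; in the tool itself the rendering is delegated to the mapping API, whose internal algorithm is not described in this article.

```python
from collections import Counter


def heatmap_intensities(points, cell_deg: float = 1.0):
    """Bin (lat, lon) points into grid cells and scale counts by the maximum.

    Returns a dict mapping (row, col) cell indices to a value in (0, 1],
    where 1.0 marks the cell with the most tweets.
    """
    counts = Counter(
        (int(lat // cell_deg), int(lon // cell_deg)) for lat, lon in points
    )
    if not counts:
        return {}
    peak = max(counts.values())
    return {cell: n / peak for cell, n in counts.items()}


# Three hypothetical tweets in one cell, one in another.
points = [(50.85, 4.35), (50.93, 4.90), (50.10, 4.20), (48.86, 2.35)]
intensities = heatmap_intensities(points)
```

A color scale would then be applied to these relative values, so the densest cell always receives the "hottest" color regardless of the absolute tweet count.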

3.5.2 Line Graph

While the heatmap at a glance provides users with

an impression of the spatial characteristics of a par-

ticular Twitter topic, it fails to communicate exact

quantitative ﬁgures concerning the tweet volume. To

counter this deﬁciency, the TweetPos tool includes a

line graph visualization that discretely plots, either

per hour or per day, the number of tweets that address

the queried subject(s). As such, it visualizes a pre-

cise overview of the temporal evolution of the pop-

ularity of themes (expressed in tweet quantity). The

line diagram is implemented via jqPlot, a plotting and

charting plug-in for the jQuery JavaScript framework

(http://www.jqplot.com/). The data values that com-

pose the graph are interactive in the sense that they

can be clicked to move the date selection sliders (see

Figure 1(d)) to the corresponding timestamp.

3.5.3 Tweet Message Enumeration

The TweetPos tool is also able to output the textual

contents of ﬁltered tweets. This output method has

been realized by means of the MegaList jQuery plug-

in (http://triceam.github.io/MegaList/). Like the other

output widgets, it is adaptive in the sense that it dy-

namically adjusts its contents to imposed spatiotem-

poral constraints. This widget is intended to provide

users insight into the context in which the queried

topic is referenced. As such, it allows for accurate,

context-aware classiﬁcation of tweets based on the

messages they carry. For instance, a tweet about a

certain incident might argue for or, conversely, against

it; by inspecting the textual context, the stance of the

tweet publisher becomes apparent.

4 EVALUATION

This section serves to showcase the capabilities of the

TweetPos instrument by presenting two representative

examples of (geo-spatial) analyses of Twitter content

that have been produced with it. The ﬁrst test case is

intended to rigorously demonstrate TweetPos’ overall

practicalities and to generally exemplify the data min-

ing options which the tool scaffolds, while the second

example focuses on TweetPos’ layering functionality

and the analytical features it entails. Space limitations

force us to be brief in our discussion, and prevent us

from including additional demonstrations.

4.1 2014 FIFA World Cup Qualiﬁers

The ﬁnal two qualiﬁer matches for next year’s soccer

World Cup were played on October 11th and 15th,

2013, respectively. We have exploited the Tweet-

Pos service to investigate the (geographic) resonance

of these matches on Twitter, speciﬁcally for Bel-

gium’s national soccer team (which are nicknamed

the “Red Devils” or “Rode Duivels” in Dutch). We

issued a TweetPos data collection request for the

RodeDuivels hashtag on October 13th and kept this

query active until October 19th. Figure 4 shows the

geographic distribution of the tweets that were gathered

worldwide in the one-hour interval immediately

following the end of the two matches, as well as a

chart-based representation of the tweet quantity that

was harvested during the entire course of the experi-

ment (aggregated per hour). As the query was initi-

ated on October 13th, all tweet data in the result set

that precedes this date was acquired via the Search

API, while tweets with a more recent timestamp were filtered

from the Streaming API.

Analysis of the experimental results yields four

notable observations. First and foremost, the out-

put graph reveals two obvious peaks in tweet vol-

ume. These local maxima coincide nicely with the

Red Devils’ schedule of play. As such, this test

case corroborates Twitter’s capacity to act as a user-

driven distributed sensor system that is able to iden-

tify real-world events (see also Section 5). As the data

collection procedure was started in between the two

matches, this capacity applies to both the Search API

(for events from the recent past) and Streaming API

(for current and future events). Secondly, tweets deal-

ing with the match on October 11th appear to have

originated practically exclusively from Belgium and

its surrounding countries. In contrast, tweets about

the second game exhibit a quasi worldwide distribu-

tion, yet again with a strong concentration in West-

ern Europe. As the ﬁrst set of tweets was ingested

via the Twitter Search API, this outcome can likely

be attributed to the operational principles of this in-

terface (recall from Section 3.2 that the Search API

aims for relevance, not comprehensiveness). Thirdly,

although their volume is rather marginal, tweets em-

bodying the RodeDuivels keyword were found to

also emerge from non-Dutch speaking countries like

the USA, Spain and Turkey (see the rightmost topo-

graphic map in Figure 4). After inspecting the tex-

tual contents of these contributions (by means of the

tweet message enumeration widget described in Sec-

tion 3.5.3), it became clear that these types of tweets

can roughly be classiﬁed into two categories:

• tweets written in Dutch by Belgian citizens (temporarily)

living abroad; e.g., “Come on #RodeDuivels,

I am rooting for you from my hotel room

in Barcelona!” (English translation)

• retweets by the local population of English messages

that include the (Dutch) RodeDuivels

hashtag; often, the original messages were posted

by native Dutch speakers who wanted to reach an international

audience; e.g., “Belgium versus Wales qualifier

starting in 15 minutes #RodeDuivels #RedDevils

#belwal #wc2014”

Figure 4: Results of the 2014 FIFA World Cup qualifiers experiment.

The fourth and ﬁnal observation pertains to location-

driven personalization of the tweeted contents. For

example, a tweet by Toby Alderweireld (a Bel-

gian soccer player who plays for Atletico Madrid in

Spain), written in English and communicating Bel-

gium’s qualiﬁcation for next year’s World Cup, was

actively retweeted by his followers in Spain and

accounted for the majority of RodeDuivels tweets

that originated from that country. A single Spanish

Atletico Madrid fan mentioned not only Toby Alder-

weireld but also his Belgian teammate Thibaut Cour-

tois in his tweet: “Well done to #Atleti’s @thibaut-

courtois & @AlderweireldTob and their #RodeDuiv-

els teammates. We’ll see you in Brazil at #wc2014”.

4.2 Game Console Comparison

The market of (next-gen) gaming consoles is (for the

time being) dominated by Sony, Microsoft and Nin-

tendo with their PlayStation 4, Xbox One and Wii U

hardware, respectively. In this second test case, the

TweetPos tool was put to use to compare the atten-

tion these three consoles receive on the Twitter net-

work, and to uncover geographic dissimilarities be-

tween their respective popularity, if any. Therefore,

between November 1st and November 16th, 2013, the

ps4, xboxone and WiiU keywords were tracked with

TweetPos. An impression of the resulting data set

is given in Figure 5. This ﬁgure visualizes the geo-

spatial intensities of the three hashtags on the launch

day of the PlayStation 4 in the USA (i.e., on Novem-

ber 15th between 07:00h and 08:00h UTC-5), as well

as per-hour aggregated overviews of the volumetric

magnitudes of the collected data sets.

These experimental results validate that TweetPos

succeeds in layering multiple heatmaps, each associ-

ated with an independent query, on top of a single to-

pographic map. The same holds true for the tweet

volume plotting functionality of the line chart. Notice

however from the topmost row of images in Figure

5 that keyword visualizations might quickly conceal

one another in multi-layer scenarios, which in turn

is likely to impair analytical efﬁciency. Courtesy of

TweetPos’ ability to switch the rendering

of individual layers on and off on the fly, it nonetheless remains

feasible to interactively compare and interpret (the ge-

ographic provenance of) tweets in multi-query stud-

ies. In effect, the images in the bottom three rows in

Figure 5 communicate exactly the same information

as the ones in the upper row, yet in an itemized fash-

ion.

Figure 5: Heatmap-based as well as quantitative comparison of game console popularity.

In-depth analysis of the composed data body falls

beyond the scope of this article. Instead, we will point

out two illustrative insights that we were able to extract

from the collected tweets. Firstly, Figure 5 at a

glance reveals the existence of large quantitative dif-

ferences between the three tracked keywords. In the

monitored time interval, the Wii U console garnered

only a fraction of the attention that the Xbox One was

able to accumulate, whose Twitter coverage in turn

was outclassed by that of the PlayStation 4 by an or-

der of magnitude. The fact that the experiment en-

capsulated the PlayStation 4’s USA release date deﬁ-

nitely contributed to this outcome. In particular, in-

spection of the captured tweet messages conﬁrmed

considerable hype build-up as the PlayStation 4 re-

lease approached. For the same reason, the PlaySta-

tion 4 tweets geo-spatially tended towards the USA.

WEBIST2014-InternationalConferenceonWebInformationSystemsandTechnologies

264

Secondly, the volume diagrams show that Microsoft

was able to pierce the PlayStation 4’s Twitter hege-

mony exactly once in the course of the experiment.

This achievement can be attributed to a clever mar-

keting strategy: by retweeting a message from the of-

ﬁcial Twitter account of Xbox France, users could re-

veal the identity of the French Xbox One ambassador,

an opportunity that was massively seized by fans. The

resulting retweets primarily originated from Western

Europe, and France in particular (not shown in Figure

5).

5 RELATED WORK

The principle of creating map mashups of the geo-

graphic sources of tweets has been considered by a

number of commercialized web services. Examples

include TweepsMap (http://tweepsmap.com/),

Trendsmap (http://trendsmap.com/), Twee-

real (http://tweereal.com/), Tweetping

(http://tweetping.net/) and GlobalTweets

(http://globaltweets.com/). The ﬁrst maps (the

home location of) the followers of a particular user’s

Twitter account, the second provides a real-time,

localized mashup of currently trending Twitter

themes, and the ﬁnal three offer real-time geographic

visualization of Twitter posts.

The academic literature also holds a number of ar-

ticles that deal with deriving geo-spatial insights from

Twitter data. Stefanidis et al. have proposed a frame-

work to harvest and analyze ambient geographic in-

formation (i.e., not speciﬁed in terms of explicit co-

ordinates) from tweets (Stefanidis et al., 2013). The

iScience Maps tool targets behavioral researchers in-

terested in exploiting Twitter for localized social me-

dia analysis purposes (Reips and Garaizar, 2011). The

global concept of applying Twitter as a distributed

sensor network to identify and locate events in the

physical world has been successfully explored by

a number of analogous research initiatives (Sakaki

et al., 2010; Boettcher and Lee, 2012; Crooks et al.,

2013; Takahashi et al., 2011); of particular relevance

is the social pixel/images/video approach by Singh

et al. that allows for Twitter-powered situation de-

tection and spatio-temporal assessments (Singh et al.,

2010). Field and O’Brien have investigated the appli-

cation of cartographic principles to Twitter-powered

map mashups (Field and O’Brien, 2010). Finally, the

software architecture proposed by Oussalah et al. af-

fords the deployment of geolocated services that are

fueled by Twitter data (Oussalah et al., 2013).

All systems that have been cited in this section,

both commercialized and academic ones, have their

speciﬁc merits and feature sets. The TweetPos in-

strument exhibits functional overlaps with all of them.

For example, the social pixel approach largely corresponds

with our animated heatmap-based visualization solution. Some

related tools even provide functionality that is missing in

TweetPos. The social pixel framework, for instance,

incorporates an automated situation detection scheme and

exploits domain semantics to autonomously recommend relevant

control actions in response to detected events. However, the

TweetPos tool exceeds every cited initiative in terms of the

variety of analytical means it integrates and the synergistic

benefits that stem from this holistic design. As an example,

only a minority of the related systems grants insight into both

historical and current tweet posting behavior. Also,

the combination of a heatmap-based representation of

the geographic intensity of topics, a tweet volume di-

agram, and dynamic means to inspect the textual con-

tents of tweets fosters unprecedented deep mining of

(the geo-spatial evolution of) Twitter contributions. A

ﬁnal example of a differentiating TweetPos feature is

its layering mechanism and the opportunities in terms

of comparative analysis it unlocks. Only the iScience

Maps tool provides similar functionality, yet its com-

parison options are limited to exactly two conﬁgura-

tions; in contrast, unlimited numbers of layers can be

constructed in TweetPos.

6 CONCLUSIONS AND FUTURE WORK

SNSs have become prominent information channels

in present-day society, as is manifested by the massive

amounts of information that are shared and communicated

through them. Given this quantitative overload, human

operators benefit from tools that assist in transforming the

constituent raw data into practical knowledge. This article

has proposed TweetPos, a web service that provides exactly

such assistive functions for the Twitter network, thereby

paying particular attention to the geo-spatial characteristics

of tweets. As the human mind is very adept at

visual pattern recognition and at interpreting graphi-

cal data formats, TweetPos maximally invests in vi-

sual output modalities. The tool integrates and blends

multiple complementary functions in order to yield

a holistic solution for Twitter data analysis. Experimental

results collected from two isolated test cases support this

claim and demonstrate the feasibility, effectiveness

and added value of our work. In particular, it

has been established that the TweetPos service suc-

ceeds in streamlining the ingestion, ﬁltering, process-

TweetPos:ATooltoStudytheGeographicEvolutionofTwitterTopics

265

ing, analysis and mining of tweeted information, and

as such represents a valuable, highly versatile tool

with cross-disciplinary application options.

Decision-making logic, provisions for automated

conclusion drawing and autonomous recommenda-

tion systems have deliberately been omitted from the

current instantiation of the proposed tool, as we be-

lieve these tasks are more suited to human operators

than to machines. As part of future research, we

nonetheless plan to investigate whether the incorpora-

tion of computer-mediated aids might assist users in

executing these actions more efﬁciently and swiftly.

Potential supportive technologies include visual pat-

tern recognition and edge detection algorithms to fa-

cilitate heatmap analysis, and linguistic processing

frameworks to aid human operators in categorizing

aggregated tweets on the basis of the textual message

they convey. Another trajectory of future work is dy-

namic data delivery. In the current implementation,

all tweet data pertaining to a particular query is trans-

ferred from the back-end server to the web browser in

bulk. Although this design renders the TweetPos ser-

vice highly responsive once all data has been down-

loaded, it also causes start-up delays to be high (i.e.,

they are directly proportional to the data set size).

At the same time, network bandwidth utilization is

suboptimal, as the client is likely to end up down-

loading data which the user will never inspect (or at

least not in detail). We will therefore implement a

demand-oriented transmission scheme in which rele-

vant data is transmitted just-in-time (i.e., when it be-

comes needed). By doing so, we will be able to in-

vestigate the trade-off between service responsiveness

and start-up delay, as well as the impact this balance

has on the usage experience.
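As a rough illustration of what such a demand-oriented scheme might look like, the Python sketch below fetches tweet data one time window at a time and caches each window after its first retrieval. Everything in it is an assumption made for the sake of the example: the class name OnDemandTweetStore, the fetch_window callable that stands in for a request to the back-end server, the one-hour window size and the sample records are all hypothetical.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


class OnDemandTweetStore:
    """Client-side cache that requests tweets one time window at a time."""

    def __init__(self, fetch_window, window=timedelta(hours=1)):
        # fetch_window(start, end) -> list of tweets; stands in for a
        # hypothetical request to the back-end server.
        self._fetch_window = fetch_window
        self._window = window
        self._cache = {}
        self.requests = 0  # number of back-end round trips so far

    def _window_start(self, moment):
        # Align the moment to the start of its enclosing window.
        offset = (moment - EPOCH) // self._window
        return EPOCH + offset * self._window

    def tweets_at(self, moment):
        """Return the tweets of the window containing `moment`."""
        start = self._window_start(moment)
        if start not in self._cache:
            end = start + self._window
            self._cache[start] = self._fetch_window(start, end)
            self.requests += 1
        return self._cache[start]


# Made-up back-end contents: (timestamp, keyword) records.
BACKEND = [
    (datetime(2013, 11, 15, 7, 10), "ps4"),
    (datetime(2013, 11, 15, 7, 50), "xboxone"),
    (datetime(2013, 11, 15, 9, 30), "ps4"),
]


def fake_fetch(start, end):
    return [t for t in BACKEND if start <= t[0] < end]


store = OnDemandTweetStore(fake_fetch)
first = store.tweets_at(datetime(2013, 11, 15, 7, 30))
again = store.tweets_at(datetime(2013, 11, 15, 7, 45))  # served from cache
```

In this sketch only the windows that the user actually visits are transferred, which shortens start-up at the price of a possible delay whenever a not yet cached window is opened. This is the trade-off between start-up delay and responsiveness referred to above.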

REFERENCES

140dev (2013). 140dev Streaming API Framework. Online,

http://140dev.com/free-twitter-api-source-code-library/.

Boettcher, A. and Lee, D. (2012). EventRadar: A Real-

Time Local Event Detection Scheme Using Twitter

Stream. In Proc. GreenCom 2012, pages 358–367,

Besançon, France.

Crooks, A., Croitoru, A., Stefanidis, A., and Radzikowski,

J. (2013). #Earthquake: Twitter as a Distributed Sen-

sor System. Transactions in GIS, 17(1):124–147.

Field, K. and O’Brien, J. (2010). Cartoblography: Experi-

ments in Using and Organising the Spatial Context of

Micro-blogging. Transactions in GIS, 14(s1):5–23.

Google Developers (2013a). Google Maps JavaScript API v3.

Online, https://developers.google.com/maps/documentation/javascript/.

Google Developers (2013b). The Google Geocoding API.

Online, https://developers.google.com/maps/documentation/geocoding/.

Oussalah, M., Bhat, F., Challis, K., and Schnier, T.

(2013). A Software Architecture for Twitter Collec-

tion, Search and Geolocation Services. Knowledge-

Based Systems, 37:105–120.

Pinto, N., Majaj, N. J., Barhomi, Y., Solomon, E. A., Cox,

D. D., and DiCarlo, J. J. (2010). Human versus Ma-

chine: Comparing Visual Object Recognition Systems

on a Level Playing Field. In Proc. Cosyne 2010, Salt

Lake City, UT, USA.

Reips, U.-D. and Garaizar, P. (2011). Mining Twitter: A

Source for Psychological Wisdom of the Crowds. Be-

havior Research Methods, 43(3):635–642.

Sakaki, T., Okazaki, M., and Matsuo, Y. (2010). Earthquake

Shakes Twitter Users: Real-time Event Detection by

Social Sensors. In Proc. WWW 2010, pages 851–860,

Raleigh, NC, USA.

Singh, V. K., Gao, M., and Jain, R. (2010). Situation Detec-

tion and Control Using Spatio-temporal Analysis of

Microblogs. In Proc. WWW 2010, pages 1181–1182,

Raleigh, NC, USA.

Stefanidis, A., Crooks, A., and Radzikowski, J. (2013). Har-

vesting Ambient Geospatial Information from Social

Media Feeds. GeoJournal, 78(2):319–338.

Takahashi, T., Abe, S., and Igata, N. (2011). Can Twitter

Be an Alternative of Real-world Sensors? In Proc.

HCI International 2011, pages 240–249, Orlando, FL,

USA.

Twitter Developers (2013). Using the Twitter Search API.

Online, https://dev.twitter.com/docs/using-search.

WEBIST2014-InternationalConferenceonWebInformationSystemsandTechnologies

266