mainly to accommodate the Twitter Search API
restrictions. The Streaming API prototype using
Docker/ELB has proven successful, so the five
American cities can be ported to this architecture
with minimal difficulty.
7 CONCLUSIONS
This work has built on previous investigations by
further exploring the temporal implications of
estimating population from social media data. A new
architecture was deployed, new data from Lisbon,
Portugal was collected, and a modern bot detection
algorithm was explored. Using removal techniques from
previous work, experiments were run over different
time periods and in multiple cities to establish a
baseline minimum duration (6-8 weeks) for which the
collection code must run before a population estimate
can be obtained with reasonable confidence. This
baseline is pertinent when a new geographic area is
being investigated or a new social media feed is being
added for an existing area. Knowing the minimum viable
collection period gives the end user greater
confidence when leveraging this method for population
estimation.
GISTAM 2018 - 4th International Conference on Geographical Information Systems Theory, Applications and Management