bag-of-words logistic regression, yet it failed to out-
perform deep learning models.
Further, in the LOGO-CV setup, we observed
that removing groups with large numbers of docu-
ments such as Al-Boraq or Alarabiya significantly
boosted the predictive performance of the opposite
class. However, assuming that we will not know the
class label of the testing group, we cannot determine
which groups to exclude from the training. We plan
to extend this work to explore different ways to au-
tomatically select training data, such as selecting the
top k most similar documents for every testing document
or the top k groups with the highest in-group similarity
variance. We would also like to implement different
data-driven ensemble models, such as learning a new
logistic regression that takes the predicted probabili-
ties of the individual models as predictors.
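One way such a stacked ensemble could look is sketched below. This is a minimal illustration under stated assumptions, not an implementation used in our experiments: the names fit_stacker and predict_proba, the toy probabilities, and the learning rate and epoch count are all hypothetical, and the meta-model is a plain logistic regression fitted by batch gradient descent. In practice, the base-model probabilities used to fit the meta-model would presumably need to come from held-out predictions (for example, the left-out group in each LOGO-CV fold) to avoid leaking training labels.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(base_probs, weights, bias):
    """Combine one document's base-model probabilities into a single probability."""
    score = bias + sum(w * p for w, p in zip(weights, base_probs))
    return sigmoid(score)


def fit_stacker(prob_rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-model by batch gradient descent.

    prob_rows: one row per document; each row holds the predicted
               probability of the positive class from every base model.
    labels:    0/1 class labels, one per document.
    lr, epochs: illustrative hyperparameters (assumptions).
    """
    n_docs = len(labels)
    n_models = len(prob_rows[0])
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(prob_rows, labels):
            err = predict_proba(row, weights, bias) - y  # gradient of log loss
            for j in range(n_models):
                grad_w[j] += err * row[j]
            grad_b += err
        weights = [w - lr * g / n_docs for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_docs
    return weights, bias


if __name__ == "__main__":
    # Toy held-out predictions from two hypothetical base models.
    rows = [[0.9, 0.5], [0.8, 0.4], [0.2, 0.6], [0.1, 0.5]]
    y = [1, 1, 0, 0]
    w, b = fit_stacker(rows, y)
    print("meta-model weights:", w, "bias:", b)
    print("ensemble probability:", predict_proba([0.7, 0.5], w, b))
```

The learned weights indicate how much the ensemble relies on each individual model, which is the data-driven aspect: a base model whose probabilities carry little signal receives a weight close to zero.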
Predicting Violent Behavior using Language Agnostic Models