ACKNOWLEDGEMENTS
The authors would like to thank Dr Albert Gatt for allowing
the use of a GPU server. We would also like to thank
Dr Lonneke van der Plas, Dr Stavros Assimakopoulos and
Ms Rebekah Vella Muskat for providing the dataset used
in this study.
Common Topic Identification in Online Maltese News Portal Comments