ings of the 2019 Conference of the North American
Chapter of the Association for Computational Lin-
guistics (Demonstrations), pages 54–59.
Akbik, A., Blythe, D., and Vollgraf, R. (2018). Contex-
tual String Embeddings for Sequence Labeling. In
Proceedings of the 27th International Conference on
Computational Linguistics, pages 1638–1649.
Araci, D. (2019). FinBERT: Financial Sentiment Analysis
with Pre-Trained Language Models. arXiv preprint
arXiv:1908.10063.
Azeraf, E., Monfrini, E., and Pieczynski, W. (2021a). Us-
ing the Naive Bayes as a Discriminative Model. In
Proceedings of the 13th International Conference on
Machine Learning and Computing, pages 106–110.
Azeraf, E., Monfrini, E., Vignon, E., and Pieczynski, W.
(2021b). Highly Fast Text Segmentation With Pair-
wise Markov Chains. In 6th IEEE Congress on Infor-
mation Science and Technology, pages 361–366.
Azeraf, E., Monfrini, E., Vignon, E., and Pieczynski, W.
(2021c). Introducing the Hidden Neural Markov
Chain Framework. In Proceedings of the 13th Inter-
national Conference on Agents and Artificial Intelli-
gence - Volume 2, pages 1013–1020.
Barbieri, F., Camacho-Collados, J., Espinosa-Anke, L., and
Neves, L. (2020). TweetEval: Unified Benchmark and
Comparative Evaluation for Tweet Classification. In
Proceedings of Findings of EMNLP.
Bojanowski, P., Grave, E., Joulin, A., and Mikolov, T.
(2017). Enriching Word Vectors with Subword Infor-
mation. Transactions of the Association for Computa-
tional Linguistics, 5:135–146.
Brants, T. (2000). TnT: a Statistical Part-of-Speech Tag-
ger. In Proceedings of the 6th Conference on Applied
Natural Language Processing, pages 224–231.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K.
(2019). BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding. In Pro-
ceedings of the 2019 Conference of the North Amer-
ican Chapter of the Association for Computational
Linguistics: Human Language Technologies, Volume
1 (Long and Short Papers), pages 4171–4186.
Devroye, L., Györfi, L., and Lugosi, G. (2013). A Probabilistic
Theory of Pattern Recognition, volume 31.
Springer Science & Business Media.
Duda, R. O., Hart, P. E., et al. (2006). Pattern Classification.
John Wiley & Sons.
Fukunaga, K. (2013). Introduction to Statistical Pattern
Recognition. Elsevier.
Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep
Learning. MIT Press.
Jebara, T. (2012). Machine Learning: Discriminative and
Generative. Springer Science & Business Media.
Joulin, A., Grave, E., Bojanowski, P., and Mikolov, T.
(2017). Bag of Tricks for Efficient Text Classifica-
tion. In Proceedings of the 15th Conference of the Eu-
ropean Chapter of the Association for Computational
Linguistics: Volume 2, Short Papers, pages 427–431,
Valencia, Spain.
Jurafsky, D. and Martin, J. H. (2009). Speech and Lan-
guage Processing: An Introduction to Natural Lan-
guage Processing, Speech Recognition, and Compu-
tational Linguistics, 2nd Edition. Prentice-Hall.
Kim, S.-B., Han, K.-S., Rim, H.-C., and Myaeng, S. H.
(2006). Some Effective Techniques for Naive Bayes
Text Classification. IEEE Transactions on Knowledge
and Data Engineering, 18(11):1457–1466.
Kingma, D. P. and Ba, J. (2014). Adam: A
Method for Stochastic Optimization. arXiv preprint
arXiv:1412.6980.
Koller, D. and Friedman, N. (2009). Probabilistic Graphical
Models: Principles and Techniques. MIT Press.
Komninos, A. and Manandhar, S. (2016). Dependency
Based Embeddings for Sentence Classification Tasks.
In Proceedings of the 2016 Conference of the North
American Chapter of the Association for Computa-
tional Linguistics: Human Language Technologies,
pages 1490–1500.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep Learn-
ing. Nature, 521(7553):436–444.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard,
R. E., Hubbard, W., and Jackel, L. D. (1989). Backpropagation
Applied to Handwritten Zip Code Recognition.
Neural Computation, 1(4):541–551.
Lhoest, Q., Villanova del Moral, A., Jernite, Y., Thakur,
A., von Platen, P., Patil, S., Chaumond, J., Drame,
M., Plu, J., Tunstall, L., Davison, J., Šaško, M., Chhablani,
G., Malik, B., Brandeis, S., Le Scao, T., Sanh,
V., Xu, C., Patry, N., McMillan-Major, A., Schmid,
P., Gugger, S., Delangue, C., Matussière, T., Debut,
L., Bekman, S., Cistac, P., Goehringer, T., Mustar, V.,
Lagunas, F., Rush, A., and Wolf, T. (2021). Datasets:
A Community Library for Natural Language Process-
ing. In Proceedings of the 2021 Conference on Empir-
ical Methods in Natural Language Processing: Sys-
tem Demonstrations.
Liu, B., Blasch, E., Chen, Y., Shen, D., and Chen, G. (2013).
Scalable Sentiment Classification for Big Data Anal-
ysis using Naive Bayes Classifier. In IEEE Interna-
tional Conference on Big Data, pages 99–104.
Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D.,
Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov,
V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining
Approach. arXiv preprint arXiv:1907.11692.
Loshchilov, I. and Hutter, F. (2016). SGDR: Stochastic
Gradient Descent with Warm Restarts. arXiv preprint
arXiv:1608.03983.
Maas, A. L., Daly, R. E., Pham, P. T., Huang, D., Ng, A. Y.,
and Potts, C. (2011). Learning Word Vectors for Sen-
timent Analysis. In Proceedings of the 49th Annual
Meeting of the Association for Computational Lin-
guistics: Human Language Technologies, pages 142–
150, Portland, Oregon, USA. Association for Compu-
tational Linguistics.
Malo, P., Sinha, A., Korhonen, P., Wallenius, J., and Takala,
P. (2014). Good Debt or Bad Debt: Detecting Se-
mantic Orientations in Economic Texts. Journal of the
Association for Information Science and Technology,
65(4):782–796.
McCallum, A., Nigam, K., et al. (1998). A Comparison of
Event Models for Naive Bayes Text Classification. In
Improving Usual Naive Bayes Classifier Performances with Neural Naive Bayes based Models