6 CONCLUSIONS AND FUTURE WORK
Named Entity Recognition (NER) is a well-known task in Natural
Language Processing (NLP) that aims to detect the different
entities mentioned in a given text. NLP analysis is needed in the
satellite domain because of the importance of this domain and the
rapid growth of its data. In this paper, we present SatelliteNER,
an effective NER model engineered specifically for the satellite
domain. To build this model, we generate the training, validation,
and testing datasets automatically, so annotation proceeds quickly
and does not require a human to label the data manually.
Experiments using three different testing strategies show the
benefit of SatelliteNER over existing NER tools. In the future, we
plan to improve accuracy by including a human in the loop in the
labeling process and by fine-tuning the parameters of the
underlying neural network. Furthermore, we intend to build custom
transformer-based models that can achieve higher accuracy.
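As an illustration of the automated annotation summarized above, the following minimal sketch labels text by exact matching against a dictionary of known entity names. It is not the pipeline used to build SatelliteNER: the function name auto_annotate, the gazetteer entries, and the labels SATELLITE and ORG are hypothetical, and the output layout (character offsets plus a label) is only assumed to resemble what common NER toolkits accept as training data.

```python
import re


def auto_annotate(text, gazetteer):
    """Label exact dictionary matches in `text`.

    Returns (text, {"entities": [(start, end, label), ...]}), where
    start and end are character offsets into `text`.
    """
    spans = []
    # Try longer names first so that a full name wins over a shorter
    # name contained in it.
    for name, label in sorted(gazetteer.items(), key=lambda kv: -len(kv[0])):
        # Lookarounds keep the match from starting or ending inside a word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        for match in re.finditer(pattern, text):
            start, end = match.span()
            # Skip matches that overlap a span that was already accepted.
            if all(end <= s or start >= e for s, e, _ in spans):
                spans.append((start, end, label))
    return text, {"entities": sorted(spans)}


# Hypothetical gazetteer; in practice the names would come from a
# domain resource such as a satellite database.
GAZETTEER = {
    "Hubble Space Telescope": "SATELLITE",
    "Hubble": "SATELLITE",
    "NASA": "ORG",
}

sentence = "NASA launched the Hubble Space Telescope in 1990."
annotated_text, annotations = auto_annotate(sentence, GAZETTEER)
for start, end, label in annotations["entities"]:
    print(sentence[start:end], label)
```

Exact matching of this kind is fast but can miss name variants and can mislabel ambiguous names, which is one reason a human-in-the-loop step is a natural extension.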