6 CONCLUSION
This paper presents a grammar- and dictionary-based
named-entity linking method for knowledge extraction
from evidence-based dietary recommendations. The method
works with dietary recommendations expressed in a single
sentence and consists of two phases. The first phase
combines entity detection with the determination of a set
of candidates for each entity; the second phase is
candidate selection. The focus is on food-related
entities, nutrient-related entities, and quantity/unit
entities. The method is evaluated using dietary
recommendations provided by the World Health Organization
and the U.S. National Library of Medicine.
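The two-phase structure summarized above can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the dictionaries are toy examples, the grammar component is replaced by a simple adjacency rule (a unit is accepted only directly after a quantity), and the names `DICTIONARIES`, `PRIORITY`, `detect_candidates`, `select_candidates`, and `link_entities` are hypothetical.

```python
import re

# Hypothetical toy dictionaries (assumption); real dictionaries would be far larger.
DICTIONARIES = {
    "food": {"fruit", "vegetables", "fish", "milk", "sugar"},
    "nutrient": {"salt", "sugar", "protein", "fat", "sodium"},
    "unit": {"g", "mg", "kg", "grams", "cups", "servings"},
}

# Assumed tie-breaking order for ambiguous mentions (not taken from the paper).
PRIORITY = ["nutrient", "food"]

NUMBER = r"\d+(?:\.\d+)?"


def detect_candidates(sentence):
    """Phase 1: detect mentions and collect a candidate label set for each."""
    tokens = re.findall(r"[a-z]+|" + NUMBER, sentence.lower())
    mentions = []
    for position, token in enumerate(tokens):
        if re.fullmatch(NUMBER, token):
            candidates = {"quantity"}
        else:
            candidates = {
                label for label, words in DICTIONARIES.items() if token in words
            }
        if candidates:
            mentions.append((position, token, candidates))
    return mentions


def select_candidates(mentions):
    """Phase 2: pick one label per mention using simple context rules."""
    quantity_positions = {p for p, _, c in mentions if "quantity" in c}
    linked = []
    for position, token, candidates in mentions:
        if "quantity" in candidates:
            label = "quantity"
        elif "unit" in candidates and (position - 1) in quantity_positions:
            # Adjacency rule standing in for the grammar: unit follows a quantity.
            label = "unit"
        else:
            options = [l for l in PRIORITY if l in candidates]
            if not options:
                continue  # e.g. a unit word with no preceding quantity
            label = options[0]
        linked.append((token, label))
    return linked


def link_entities(sentence):
    """Run both phases on a single-sentence recommendation."""
    return select_candidates(detect_candidates(sentence))


print(link_entities("Eat less than 5 g of salt per day"))
# [('5', 'quantity'), ('g', 'unit'), ('salt', 'nutrient')]
```

In this sketch, "sugar" appears in two dictionaries, so phase 1 returns two candidates for it and phase 2 resolves the ambiguity with the assumed priority order. Replacing the contents of `DICTIONARIES` is the only change needed to target a different set of entity types.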
To the best of our knowledge, this is the first
named-entity linking method focused on entities related
to dietary recommendations. In the absence of the labeled
data required by current state-of-the-art machine
learning approaches, the benefit of this method is that
it can easily extract the entities from unstructured
data. In the future, we plan to extend the method to work
with text paragraphs and to extract all possible entities
of interest.
In addition, the named-entity linking method can also be
used in other domains by using appropriate dictionaries
for the entities in those domains.
Grammar and Dictionary based Named-entity Linking for Knowledge Extraction of Evidence-based Dietary Recommendations