applications. In Journal of the American Medical
Association.
Friedman, C., Shagina, L., Socratous, S. A., Zeng, X., 1996.
A WEB-based version of MedLEE: A medical
language extraction and encoding system. In:
Proceedings of the AMIA Annual Fall Symposium.
Friedman, C., 2000. A broad-coverage natural language
processing system. In Proceedings of the AMIA
Symposium. American Medical Informatics
Association.
Garla, V. N., Brandt, C., 2013. Knowledge-based
biomedical word sense disambiguation: an evaluation
and application to clinical document classification. In
Journal of the American Medical Informatics
Association.
Collier, N., Oellrich, A., Groza, T., 2015. Concept selection
for phenotypes and diseases using learn to rank. In
Journal of Biomedical Semantics.
Fu, X., Batista-Navarro, R, Rak, R, Ananiadou, S., 2014. A
strategy for annotating clinical records with phenotypic
information relating to the chronic obstructive
pulmonary disease. In Proceedings of Phenotype Day
at ISMB.
Fan, J., Sood, N., Huang, Y., 2013. Disorder concept
identification from clinical notes an experience with the
ShARe/CLEF 2013 challenge. In Proceedings of the
ShARe/CLEF Evaluation Lab.
Ramanan, S., Broido, S., Nathan, P. S., 2013. Performance
of a Multi-class Biomedical Tagger on Clinical
Records. In Proceedings of the ShARe/CLEF
Evaluation Lab.
Wang, C., Akella, R., 2013. UCSC's System for CLEF
eHealth. In: Proceedings of the ShARe/CLEF
Evaluation Lab.
Goudey, B., Stokes, N, Martinez, D., 2007. Exploring
Extensions to Machine Learning-based Gene
Normalisation. In Proceedings of the Australasian
Language Technology Workshop.
Leaman, R., Doğan, R. I., Lu, Z., 2013. DNorm: disease
name normalization with pairwise learning to rank. In
Bioinformatics.
Doğan, R. I., Lu, Z., 2012. An inference method for disease
name normalization. In Proceedings of the 2012 AAAI
Fall Symposium Series.
Kate, R. J., 2015. Normalizing clinical terms using learned
edit distance patterns. In Journal of the American
Medical Informatics Association.
Jaro, M. A., 1995. Probabilistic linkage of large public
health data files. In Statistics in medicine.
Winkler, W. E., 1999. The state of record linkage and
current research problems. In Statistical Research
Division, US Census Bureau.
Kondrak, G., 2005. N-gram similarity and distance. In
String Processing and Information Retrieval. Springer,
Berlin, Heidelberg.
Jaccard, P., 1912. The distribution of the flora in the alpine
zone. In New Phytologist.
Moreau, E., Yvon, F., Cappe, O., 2008. Robust similarity
measures for named entities matching. In Proceedings
of the 22nd International Conference on Computational
Linguistics.
Cohen, W., Ravikumar, P, Fienberg S., 2003. A comparison
of string metrics for matching names and records. In
Proceedings of the KDD workshop on data cleaning
and object consolidation.
Alnazzawi, N., Thompson, P., Ananiadou, S., 2016.
Mapping Phenotypic Information in Heterogeneous
Textual Sources to a Domain-Specific Terminological
Resource. In PLOS ONE 11.
USDA, Food Composition Database, accessed April 2017.
https://ndb.nal.usda.gov/ndb/.
NDB API, accessed April 2017. https://ndb.nal.usda.gov/
ndb/doc/index.
Development Core Team, 2008. R: A language and
environment for statistical computing. http://www.R-
project.org.
RStudio Team, 2015. RStudio: Integrated Development for
R. https://www.rstudio.com/. RStudio Inc., Boston. R
Foundation for Statistical Computing, Vienna.
Eftimov, T., Koroušić-Seljak, B., Korošec, P., 2017. A rule-
based Named-entity Recognition Method for
Knowledge Extraction of Evidence-based Dietary
Recommendations. In PLOS ONE.
Eftimov, T., Korošec, P., Koroušić-Seljak, B., 2017.
StandFood: Standardization of Foods Using a Semi-
Automatic System for Classifying and Describing
Foods According to FoodEx2. In Nutrients.
Eftimov, T., Koroušić-Seljak, B., 2017. POS Tagging-
probability Weighted Method for Matching the Internet
Recipe Ingredients with Food Composition Data. In
Proceedings of the 7th International Joint Conference
on Knowledge Discovery, Knowledge Engineering and
Knowledge Management (IC3K 2015).
Van der Loo, M., Van der Laan, J., Logan, N., 2016.
Approximate String Matching and String Distance
Functions.
Leida, M., Ceravolo, P., Damiani, E., Cui, Z., Gusmini, A.,
2010. Semantics-aware matching strategy (SAMS) for
the Ontology meDiated Data Integration (ODDI). In
Int. J. Knowledge Engineering and Soft Data
Paradigms, Vol. 2, No. 1.
Kerzazi, A., Navas-Delgado, I., F.Aldana-Montes, J., 2009.
Towards an Ontology-based Mediation Framework for
Integrating Biological Data. In SWAT4LS.