covers the engines for the RegExp rebuilding, the
eNER retraining (for the ML-based eNER), the
re-application of the eNER (both ML-based and
RegExp-based) and finally the statistical analysis of the
newly detected eNE-candidates. The interaction
layer holds the interactive visualization of the
statistical results and of the identified eNEs, and the
integrated GUI for collecting feedback and context
on the eNE-candidates. The Effectuation layer of the
BDMCube is intended to create added value for the
user by providing the intelligence for supporting the
underlying use cases. In our project this layer
contains the interfaces to RecomRatio. It provides
the functionalities for the individual and the
comprehensive use cases that are to be integrated into
both projects’ IR GUIs.
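The core-layer steps named above (re-application of the RegExp-based eNER, then statistical analysis of the newly detected eNE-candidates) could look roughly as follows. This is a minimal Python sketch and not the project's implementation: the seed pattern, the document-frequency statistic, the threshold `min_docs`, the sample sentences and all function names are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical seed pattern (an assumption, not taken from the paper):
# a capitalized token that contains a digit or a hyphen, e.g. "PD-1".
SEED_PATTERN = re.compile(r"\b[A-Z][A-Za-z]*[-\d][A-Za-z\d-]*\b")


def find_candidates(text, known_entities):
    """Apply the RegExp-based recognizer and keep matches not yet known."""
    return [m for m in SEED_PATTERN.findall(text) if m not in known_entities]


def candidate_statistics(documents, known_entities):
    """Count in how many documents each candidate occurs."""
    doc_freq = Counter()
    for text in documents:
        # set() so that a candidate counts at most once per document
        doc_freq.update(set(find_candidates(text, known_entities)))
    return doc_freq


def select_for_feedback(doc_freq, min_docs=2):
    """Pick candidates frequent enough to be shown to expert users."""
    return sorted(c for c, n in doc_freq.items() if n >= min_docs)


# Illustrative input only; these sentences are invented for the sketch.
documents = [
    "Patients received PD-1 inhibitors after BRCA1 testing.",
    "Response to PD-1 blockade was assessed.",
    "BRCA1 status and CAR-T therapy were discussed.",
]
known = {"BRCA1"}  # entities already covered by the baseline recognizer

stats = candidate_statistics(documents, known)
shortlist = select_for_feedback(stats)
print(stats)
print(shortlist)
```

In this sketch the shortlist stands in for the eNE-candidates that the interaction layer would present to expert users for feedback. A real statistical analysis would likely also consider how candidate frequencies change over time, which is omitted here.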
5 CONCLUSION AND OUTLOOK
In this paper we introduced our concept of emerging
Named Entities, eNEs. With the experiments we
were able to show how eNEs can represent emerging
knowledge in clinical VREs and hence may be used
to support IR and Argumentation Support in clinical
VREs for both individual and comprehensive use
cases. Following these two main contributions –
definition of eNEs and the results of the experiments
– we discussed our proposal for a framework which
can recognize eNEs by combining NLP, statistical
methods, ML and expert user feedback and make
eNEs usable for individual and comprehensive use
cases in clinical VREs. The next steps in our work
are the prototypical implementation and evaluation
of the proposed framework, including the
visualization component, the design of the core
component, the development of statistical patterns to
identify eNE-candidates and, foremost, a user survey
about the search practice of medical staff in clinical
VREs. The objective of the user survey is to find
typical search patterns used by clinicians when
searching for arguments and to determine the
outcome they expect from the search (ranking and
visualization). In addition, with the survey we want
to investigate whether clinicians use recent
(emergent) vocabulary for search and argumentation
or whether they rely on traditional wording. The
results of the survey are intended to optimize the
baseline eNER (“seed”) and the statistical patterns, and
to align the visualization and ranking principles
in the IR GUI with the expert users’ actual
needs.