and criteria is crucial for successful communication
in the project.
The first definition of usage goals was based on incomplete information about the characteristics of documents and features. Discussing intermediate results with the participants led to suitable feature definitions and usage goals. Based on this, we were able to refine the evaluation criteria and the rating functions for extracted values. The customers gave us the feedback that the approach is suitable and helpful in the described setting.
In the end, we achieved a common understanding of the factors that influence feature extraction scenarios for text, of how these factors affect the evaluation results, and of which solutions (or which kinds of solutions) are suitable for extracting which kinds of features.
In the following, we give an outlook on improvements that can be made, especially during the evaluation phase of text mining projects.
First of all, introducing a form of weighted F1 score could prove useful. Depending on the application scenario, it might be more important to find fewer, but reliable results. This may justify accepting a higher rate of FNs (as for the feature date of incident in our scenario 1). In contrast, for use cases similar to scenario 2, it might be wiser to allow a higher rate of FPs in the document than to lose important information.
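A natural starting point for such a weighting is the F_β measure (see, e.g., Manning et al., 2008), where the parameter β controls how strongly recall R is weighted relative to precision P:

    F_β = (1 + β²) · (P · R) / (β² · P + R)

Choosing β < 1 emphasizes precision and thus fits scenario 1 (accepting more FNs in exchange for reliable results), whereas β > 1 emphasizes recall and fits scenario 2 (accepting more FPs to avoid losing information).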
Although this does not concern the three features selected from our industrial use case in this work, dealing with highly imbalanced occurrences of features within the document sets causes well-known challenges. For example, for features with a small set of target classes (like yes/no), always predicting 'yes' may achieve a high evaluation score when the target values are distributed very unevenly. Here, using Cohen's kappa (Cohen, 1960) instead of the F1 score seems to be a good option and will be investigated in the future.
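As a minimal sketch of this effect (the 90/10 class split and the always-'yes' predictor below are hypothetical and not taken from our use case), consider the following comparison using scikit-learn:

    from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

    # Hypothetical, highly imbalanced target: 90% 'yes', 10% 'no'.
    y_true = ['yes'] * 90 + ['no'] * 10
    # A degenerate extractor that always predicts 'yes'.
    y_pred = ['yes'] * 100

    print(accuracy_score(y_true, y_pred))             # 0.90 -- looks good
    print(f1_score(y_true, y_pred, pos_label='yes'))  # ~0.95 -- looks even better
    print(cohen_kappa_score(y_true, y_pred))          # 0.0  -- no better than chance

Accuracy and F1 look deceptively strong here, while kappa correctly reports that the degenerate predictor agrees with the ground truth no better than chance.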
REFERENCES
Apache Software Foundation (2019). UIMA Ruta. https://uima.apache.org/ruta.html.
Capterra Inc. (2019). Text mining software. https://www.capterra.com.de/directory/30933/text-mining/software.
Carnerud, D. (2014). Exploration of text mining methodology through investigation of QMOD-ICQSS proceedings. In 17th QMOD-ICQSS Conference, Prague, Czech Republic.
Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., and Wirth, R. (2000). CRISP-DM 1.0: Step-by-step data mining guide. Technical report, The CRISP-DM consortium.
Chinchor, N. (1992). MUC-4 evaluation metrics. In Proceedings of the 4th Conference on Message Understanding, MUC4 '92, pages 22–29, USA. Association for Computational Linguistics.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1):37–46.
Davis, M., Emmott, S., Brethenoux, E., and Vashisth, S. (2019). Market guide for text analytics. https://www.gartner.com/en/documents/3892564.
Doddington, G., Mitchell, A., Przybocki, M., Ramshaw, L., Strassel, S., and Weischedel, R. (2004). The Automatic Content Extraction (ACE) program: Tasks, data, and evaluation. In Proceedings of LREC, volume 2.
Esuli, A. and Sebastiani, F. (2010). Evaluating information extraction. In Agosti, M., Ferro, N., Peters, C., de Rijke, M., and Smeaton, A., editors, Multilingual and Multimodal Information Access Evaluation, pages 100–111, Berlin, Heidelberg. Springer Berlin Heidelberg.
Evelson, B., Sridharan, S., and Perdoni, R. (2019). The Forrester Wave™: AI-Based Text Analytics Platforms, Q2 2018. https://www.forrester.com/report/The+Forrester+Wave+AIBased+Text+Analytics+Platforms+Q2+2018/-/E-RES141340.
Ferro, L., Gerber, L., Mani, I., Sundheim, B., and Wilson, G. (2005). TIDES 2005 standard for the annotation of temporal expressions. Technical report, MITRE.
Google (2019). Tesseract OCR. https://opensource.google/projects/tesseract.
Grishman, R. and Sundheim, B. (1996). Message Understanding Conference-6: A brief history. In Proceedings of the 16th Conference on Computational Linguistics - Volume 1, COLING '96, pages 466–471, Stroudsburg, PA, USA. Association for Computational Linguistics.
Hamon, T. and Grabar, N. (2013). Extraction of ingredient names from recipes by combining linguistic annotations and CRF selection. In Proceedings of the 5th International Workshop on Multimedia for Cooking & Eating Activities, CEA '13, pages 63–68, New York, NY, USA. ACM.
Jiang, R., Banchs, R. E., and Li, H. (2016). Evaluating and combining name entity recognition systems. In Proceedings of the Sixth Named Entity Workshop, pages 21–27, Berlin, Germany. Association for Computational Linguistics.
Manning, C. D., Raghavan, P., and Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press, New York, NY, USA.
Nadeau, D. and Sekine, S. (2007). A survey of named entity recognition and classification. Lingvisticae Investigationes, 30.
Onan, A., Korukoğlu, S., and Bulut, H. (2016). Ensemble of keyword extraction methods and classifiers in text classification. Expert Systems with Applications, 57:232–247.
PAT Research (2019). Top software for text analysis, text mining, text analytics.