commented: “The only thing that matters in the long
run is the leveraging of computation” (Sutton, 2019).
Certainly, as this project has progressed, we have
found better hardware and improved language models
very helpful.
5 CONCLUSIONS
This study demonstrates that NLP transformers are
capable of semantic analysis of Danish job ad texts.
After optimization, labeling precision reached the
95% range when compared with human labeling of the
competences in demand in the same ads. For two people
manually categorizing the competences in the same job
ads, the inter-coder reliability yielded a kappa
statistic of κ = .75. The findings of this paper
therefore support the claim that NLP transformers can
perform semantic analysis at a precision level
comparable to humans. Demonstrating semantic text
analysis by NLP transformers on Danish job ad texts
makes it possible to automate the monitoring of
competences in demand on the Danish labor market.
Such monitoring will benefit the adaptation of
educational programs and the guidance of the employed
towards vacancies.
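To make the inter-coder reliability figure concrete, the following minimal sketch computes a kappa statistic for two coders. It assumes the reported statistic is Cohen's kappa, which the text does not state explicitly. The function name `cohen_kappa` and the label lists are hypothetical illustrations, not data from this study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where both coders chose the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, derived from each coder's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical label for every item
    return (p_o - p_e) / (1 - p_e)


# Hypothetical competence labels assigned by two coders to eight job ads.
coder_1 = ["it", "it", "sales", "sales", "it", "mgmt", "mgmt", "sales"]
coder_2 = ["it", "it", "sales", "it", "it", "mgmt", "sales", "sales"]

kappa = cohen_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.61"
```

In this toy data the coders agree on 6 of 8 ads (raw agreement 0.75), but kappa is lower because part of that agreement would be expected by chance alone.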
We are now able to fully analyze smaller batches of
preselected job ads. However, further improvements to
our current prototype are needed before we can
realistically approach full-scale monitoring of the
Danish labor market. A future system would need to
analyze approximately 500,000 Danish job ads per year,
compare each with the skill sets described in
educational course materials, and categorize it
according to the approximately 14,000 competences
provided by the project “European Skills, Competences,
Qualifications and Occupations” (ESCO, 2023).
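The scale implied by these figures can be sketched with simple arithmetic. The snippet below assumes, purely for illustration, that every ad is compared once against every ESCO competence. The throughput value `PAIRS_PER_SECOND` is a hypothetical placeholder, not a measurement from our prototype.

```python
ADS_PER_YEAR = 500_000       # approximate figure from the text
ESCO_COMPETENCES = 14_000    # approximate figure from the text
PAIRS_PER_SECOND = 100_000   # hypothetical throughput, not measured


def yearly_pairs(n_ads: int, n_competences: int) -> int:
    """Ad-competence comparisons if every ad is compared with every competence."""
    return n_ads * n_competences


def processing_hours(n_pairs: int, pairs_per_second: float) -> float:
    """Hours needed at a given (assumed) comparison throughput."""
    return n_pairs / pairs_per_second / 3600


total_pairs = yearly_pairs(ADS_PER_YEAR, ESCO_COMPETENCES)
hours = processing_hours(total_pairs, PAIRS_PER_SECOND)

print(f"{total_pairs:,} comparisons per year")
print(f"{hours:.1f} hours at {PAIRS_PER_SECOND:,} comparisons per second")
```

This estimate covers only the ad-to-competence comparisons. It leaves out the comparison with educational course materials and the cost of encoding the texts, so it should be read as a lower bound on the work involved.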
REFERENCES
Adamopoulou, E. and Moussiades, L. (2020). An overview
of chatbot technology. In Artificial Intelligence
Applications and Innovations: 16th IFIP WG 12.5
International Conference, AIAI 2020, Neos Marmaras,
Greece, June 5–7, 2020, Proceedings, Part II 16,
pages 373–383. Springer.
Albrecht, J., Ramachandran, S., and Winkler, C. (2020).
Blueprints for Text Analytics Using Python. O’Reilly
Media, Inc.
Alfeo, A. L., Cimino, M. G. C. A., and Vaglini, G. (2021).
Technological troubleshooting based on sentence
embedding with deep transformers.
https://doi.org/10.1007/s10845-021-01797-w.
Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., and
Krathwohl, D. R. (1956). Taxonomy of Educational
Objectives: The Classification of Educational Goals.
Longmans.
Boehler, J. A., Larson, B., and Shehane, R. F. (2020).
Evaluation of Information Systems Curricula. Journal of
Information Systems Education, 31(3):232–243.
Chowdhary, K. R. (2020). Natural language processing. In
Chowdhary, K., editor, Fundamentals of Artificial
Intelligence, pages 603–649. Springer India.
https://doi.org/10.1007/978-81-322-3972-7_19.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K.
(2018). BERT: Pre-training of deep bidirectional
transformers for language understanding.
https://arxiv.org/abs/1810.04805.
ESCO (2023). European Skills, Competences,
Qualifications and Occupations.
https://esco.ec.europa.eu/da/classification/skill_main/.
eurostat (2023). Labour market.
https://ec.europa.eu/eurostat/web/labour-market/.
Gartner (2023). Gartner. https://www.gartner.com/.
Geron, A. (2022). Hands-On Machine Learning with Scikit-
Learn, Keras, and TensorFlow 3e - Concepts, Tools,
and Techniques to Build Intelligent Systems, volume
2022. O’Reilly Media, Inc., 3rd edition.
Git (2023). Project experiments and prototype code
on GitHub. https://github.com/SimonLaub/NLP_JobTrend.
Gromov, A., Maslennikov, A., Dawson, N., Musial, K., and
Kitto, K. (2020). Curriculum profile: modelling the
gaps between curriculum and the job market.
Educational Data Mining 2020.
HuggingFace repository (2022). Sentence transformer.
https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1/.
Khyani, D., Siddhartha, B., Niveditha, N., and Divya,
B. (2021). An interpretation of lemmatization and
stemming in natural language processing. Journal of
University of Shanghai for Science and Technology,
22(10):350–357.
Krathwohl, D. R. (2002). A Revision of Bloom’s
Taxonomy: An Overview. Theory Into Practice,
41(4):212–218.
NLTK (2023a). Nltk. https://www.nltk.org/.
NLTK (2023b). scikit-learn - machine learning in python.
https://scikit-learn.org/.
OpenAI (2023). Openai. https://openai.com/.
Ormerod, M., Martínez Del Rincón, J., and Devereux, B.
(2021). Predicting semantic similarity between
clinical sentence pairs using transformer models:
Evaluation and representational analysis.
https://medinform.jmir.org/2021/5/e23099/.
Pejic-Bach, M., Bertoncel, T., Mesko, M., and Krstic, Z.
(2020). Text mining of industry 4.0 job
advertisements. International Journal of Information
Management, 50:416–431. Publisher: Elsevier.
Reimers, N. and Gurevych, I. (2019). Sentence-BERT:
Sentence Embeddings using Siamese BERT-Networks.
Association for Computational Linguistics.
Proceedings of the 2019 Conference on Empirical
Methods in Natural Language Processing.
A Transformer Based Semantic Analysis of (non-English) Danish Jobads