Moving Other Way: Exploring Word Mover Distance Extensions
Ilya Smirnov, Ivan Yamshchikov
2022
Abstract
The word mover’s distance (WMD) is a popular semantic similarity metric for pairs of documents. The metric is interpretable and reflects similarity well, but some aspects can be improved. This position paper studies several possible extensions of WMD. We introduce regularizations of WMD based on word matches and on the corpus frequency of words as a weighting factor. In addition, we calculate WMD in word vector spaces with non-Euclidean geometry and compare it with the metric in Euclidean space. We validate the extensions on six document classification datasets. Some of the proposed extensions achieve a lower k-nearest neighbor classification error than WMD.
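As a rough illustration of the kind of quantity the abstract refers to, the sketch below computes the relaxed word mover's distance, a well-known lower bound on WMD in which every word sends all of its mass to the nearest word of the other document. It is not the exact WMD (which requires solving an optimal transport problem) and not any of the paper's proposed extensions. The names `nbow`, `one_sided_cost`, `relaxed_wmd`, and the 2-D vectors in `toy_vectors` are hypothetical and chosen only for this example. The `dist` parameter is an assumption that hints at how a non-Euclidean distance could be swapped in.

```python
import math
from collections import Counter


def nbow(tokens):
    """Normalized bag-of-words: word -> relative frequency in the document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_cost(src, dst, vectors, dist):
    """Cost when each source word moves all its mass to its nearest destination word."""
    return sum(
        mass * min(dist(vectors[word], vectors[other]) for other in dst)
        for word, mass in src.items()
    )


def relaxed_wmd(doc_a, doc_b, vectors, dist=euclidean):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    a, b = nbow(doc_a), nbow(doc_b)
    return max(
        one_sided_cost(a, b, vectors, dist),
        one_sided_cost(b, a, vectors, dist),
    )


# Hypothetical 2-D embeddings, for illustration only (not real word vectors).
toy_vectors = {
    "obama": (1.0, 0.0),
    "president": (1.0, 1.0),
    "speaks": (0.0, 1.0),
    "greets": (0.0, 2.0),
}

d = relaxed_wmd(["obama", "speaks"], ["president", "greets"], toy_vectors)
print(d)
```

Passing a different function as `dist` changes the geometry of the word vector space while leaving the rest of the computation untouched. A k-nearest neighbor classifier would then rank training documents by this distance to the query document.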
Paper Citation
in Harvard Style
Smirnov I. and Yamshchikov I. (2022). Moving Other Way: Exploring Word Mover Distance Extensions. In Proceedings of the 7th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS, ISBN 978-989-758-565-4, pages 92-97. DOI: 10.5220/0011096900003197
in Bibtex Style
@conference{complexis22,
author={Ilya Smirnov and Ivan Yamshchikov},
title={Moving Other Way: Exploring Word Mover Distance Extensions},
booktitle={Proceedings of the 7th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS},
year={2022},
pages={92-97},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011096900003197},
isbn={978-989-758-565-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS
TI - Moving Other Way: Exploring Word Mover Distance Extensions
SN - 978-989-758-565-4
AU - Smirnov I.
AU - Yamshchikov I.
PY - 2022
SP - 92
EP - 97
DO - 10.5220/0011096900003197