5 CONCLUSIONS
When building and deploying an ML model, the training stage is the most time-consuming. The training time of models that use time series datasets can be reduced by exploiting syntactic similarity.
To test this assumption, this paper designed a case study based on a Mobile Network Operator (MNO) scenario used to predict client mobility. In this case study, a set of ML models, namely RNN, GRU, LSTM, and CNN, was tested with the integration of Word2Vec (W2V) pre-trained weights to reduce the training time.
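For illustration, the following minimal sketch shows one way such an integration can be implemented. It is not the exact code of the case study; the use of the gensim library, the trajectory tokens, the layer sizes, and the hyperparameters are assumptions made for the example. W2V is first trained on tokenized location sequences, and its vectors are then used to initialize the embedding layer of a next-location LSTM classifier:

import numpy as np
from gensim.models import Word2Vec
from tensorflow.keras import initializers, layers, models

# Hypothetical data: each trajectory is a sequence of visited cell IDs.
trajectories = [["cell_12", "cell_7", "cell_7", "cell_31"],
                ["cell_31", "cell_12", "cell_5", "cell_7"]]

# Train W2V (skip-gram) on the location "sentences" so that cells
# occurring in similar contexts obtain similar vectors.
w2v = Word2Vec(sentences=trajectories, vector_size=64, window=3,
               min_count=1, sg=1)

# Build the embedding matrix in vocabulary order.
vocab = w2v.wv.index_to_key
embedding_matrix = np.array([w2v.wv[token] for token in vocab])

# Initialize the embedding layer with the pre-trained W2V weights, so
# the classifier starts training from informed representations.
model = models.Sequential([
    layers.Embedding(
        input_dim=len(vocab), output_dim=64,
        embeddings_initializer=initializers.Constant(embedding_matrix),
        trainable=True),
    layers.LSTM(128),
    layers.Dense(len(vocab), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

The same initialization would apply to the RNN, GRU, and CNN variants by swapping the recurrent layer.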
The experimental results show that the training time grows as the number of labels that can be predicted increases. When comparing the results obtained for the different architectures, the use of pre-trained weights from W2V yielded a training time reduction ratio between 22% and 43% and improved the validation accuracy by 1 pp to 7 pp, with an overall gain of 3 pp.
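To make the reported figures concrete, the reduction ratio is the relative difference between the baseline training time and the training time obtained with W2V pre-trained weights, and accuracy gains are expressed in percentage points. The sketch below uses illustrative placeholder numbers, not the measured values:

# Illustrative computation of the reported metrics; the values are
# placeholders, not the measurements obtained in the case study.
baseline_seconds, w2v_seconds = 1000.0, 620.0
reduction_ratio = (baseline_seconds - w2v_seconds) / baseline_seconds
print(f"training time reduction: {reduction_ratio:.0%}")  # 38%

baseline_acc, w2v_acc = 0.81, 0.84
print(f"validation accuracy gain: {(w2v_acc - baseline_acc) * 100:.0f} pp")  # 3 pp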
These results suggest that the integration of W2V pre-trained weights at the initial training stage may also reduce ML training time in other scenarios, such as forecasting mobility in networks, epidemic control, transportation management, urban planning, location-based services, or the management of cellular network resources.
REFERENCES
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. Web site. Software available from https://tensorflow.org (accessed on 1 February 2021).
Anaconda Software Distribution (2020). Anaconda soft-
ware distribution. Web site. Software available from
https://www.anaconda.com (accessed on 1 February
2021).
Chollet, F. et al. (2015). Keras. Web site. Software available from https://github.com/fchollet/keras (accessed on 1 February 2021).
Collobert, R., Farabet, C., and Kavukcuoğlu, K. (2008). Torch — Scientific computing for LuaJIT. NIPS Workshop on Machine Learning Open Source Software.
Fan, X., Guo, L., Han, N., Wang, Y., Shi, J., and Yuan, Y. (2018). A Deep Learning Approach for Next Location Prediction. Proceedings of the 2018 IEEE 22nd International Conference on Computer Supported Cooperative Work in Design, CSCWD 2018, pages 630–635.
González, M. C., Hidalgo, C. A., and Barabási, A. L. (2008). Understanding individual human mobility patterns. Nature, 453(7196):779–782.
Hashimoto, K., Stenetorp, P., Miwa, M., and Tsuruoka, Y.
(2015). Task-oriented learning of word embeddings
for semantic relation classification. CoNLL 2015 -
19th Conference on Computational Natural Language
Learning, Proceedings, pages 268–278.
Jeung, H., Lu, H., Sathe, S., and Yiu, M. L. (2014). Manag-
ing evolving uncertainty in trajectory databases. IEEE
Transactions on Knowledge and Data Engineering,
26(7):1692–1705.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J.,
Girshick, R., Guadarrama, S., and Darrell, T. (2014).
Caffe: Convolutional architecture for fast feature em-
bedding. arXiv preprint arXiv:1408.5093.
Kim, Y. (2014). Convolutional neural networks for sentence
classification. EMNLP 2014 - 2014 Conference on
Empirical Methods in Natural Language Processing,
Proceedings of the Conference, pages 1746–1751.
Kiukkonen, N., Blom, J., Dousse, O., Gatica-Perez, D., and Laurila, J. (2010). Towards rich mobile phone datasets: Lausanne data collection campaign.
Kotz, D., Henderson, T., and McDonald, C. (2005). CRAWDAD archive: A community resource for archiving wireless data at Dartmouth. Web site. Archive available from https://www.crawdad.org (accessed on 1 February 2021).
Kretowicz, W. and Biecek, P. (2020). MementoML: Perfor-
mance of selected machine learning algorithm config-
urations on OpenML100 datasets. pages 1–7.
Laurila, J. K., Gatica-Perez, D., Aad, I., Blom, J., Bornet,
O., Do, T. M. T., Dousse, O., Eberle, J., and Miettinen,
M. (2013). From big smartphone data to worldwide
research: The Mobile Data Challenge. Pervasive and
Mobile Computing, 9(6):752–771.
Microsoft Azure (2017). The Microsoft Cognitive Toolkit (CNTK). Software available from https://github.com/Microsoft/CNTK (accessed on 1 February 2021).
Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013).
Efficient estimation of word representations in vector
space. 1st International Conference on Learning Rep-
resentations, ICLR 2013 - Workshop Track Proceed-
ings, pages 1–12.
Miranda, C. S. and Von Zuben, F. J. (2015). Reducing the
Training Time of Neural Networks by Partitioning.
Molino, P., Wang, Y., and Zhang, J. (2019). Parallax: Vi-
sualizing and understanding the semantics of embed-
ding spaces via algebraic formulae. ACL 2019 - 57th
Annual Meeting of the Association for Computational
Linguistics, Proceedings of System Demonstrations,
pages 165–180.
Reif, M., Shafait, F., and Dengel, A. (2011). Prediction of classifier training time including parameter optimization. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).