such, precision reduction seems to be a very promising result, especially when combined with an FPGA implementation, which should lead to a significant computation speed-up and memory footprint reduction.
Precision reduction is also a good alternative to dimensionality reduction by SVD, and it can even lead to better accuracy. This property is especially important for scenarios with very large vocabularies and document data sets. If SVD is still considered, precision reduction should be applied before SVD, not in the opposite order.
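As a rough illustration of this ordering, the sketch below quantizes a tf-idf matrix to a few bits per weight and only then applies SVD. The uniform quantizer, the 4-bit width and the use of scikit-learn's TruncatedSVD are illustrative assumptions, not the exact pipeline used in our experiments.

```python
# A minimal sketch of the "reduce precision first, then SVD" ordering.
# The uniform quantizer, the 4-bit width and TruncatedSVD are assumptions
# made for illustration only.
import numpy as np
from sklearn.decomposition import TruncatedSVD

def quantize_uniform(X, bits=4):
    """Round non-negative weights to 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    x_max = float(X.max()) or 1.0            # avoid division by zero
    return np.round(X / x_max * levels) / levels * x_max

# X stands for a (documents x terms) tf-idf matrix; random data used here.
X = np.random.rand(500, 2000).astype(np.float32)

X_q = quantize_uniform(X, bits=4)                            # 1) precision reduction
X_lsa = TruncatedSVD(n_components=100).fit_transform(X_q)   # 2) SVD afterwards
```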
It should also be observed that focusing on a micro-averaged objective allows for stronger reduction than focusing on macro-averaged measures. It should also be noted that reduced precision in more complex algorithms leads to a higher probability of an accuracy drop, because the error of the data representation is propagated along a longer computational path. Therefore, KNN gives the highest gain in accuracy after precision reduction.
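For reference, the two averaging modes aggregate the per-class counts $TP_c$, $FP_c$ and $FN_c$ at different levels (standard definitions, restated here for convenience):

$$
P_{\mathrm{micro}} = \frac{\sum_c TP_c}{\sum_c (TP_c + FP_c)}, \qquad
R_{\mathrm{micro}} = \frac{\sum_c TP_c}{\sum_c (TP_c + FN_c)}, \qquad
F1_{\mathrm{macro}} = \frac{1}{|C|} \sum_c F1_c,
$$

so the micro-averaged score is dominated by the most frequent classes, whereas the macro-averaged one weights every class equally.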
The approach developed and described in this paper enables porting NLP and VSM-based solutions to FPGAs or embedded devices with reduced memory capacity or reduced-precision arithmetic. This is achieved through a reduction of the model memory footprint, which results from the low-bit vector representation. It is worth noting that the reduced memory occupation also affects the performance of the system, especially the response latency, which is critical in embedded systems. Smaller vectors mean fewer computations, which in turn leads to lower energy consumption; the sketch below illustrates the scale of this footprint reduction. Further analysis will concentrate on dataset structures and their impact on the achievable reduction, and on simulations with other quantized vector space models (e.g. log tf, boolean).
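As a back-of-the-envelope illustration of the footprint argument, the sketch below compares a float32 tf-idf matrix with the same weights quantized to 4-bit codes and packed two per byte. The matrix shape, bit width and packing layout are illustrative assumptions, not the storage format of a particular implementation.

```python
# A rough estimate of the memory saved by a low-bit vector representation.
# The 4-bit width, the matrix shape and the packing layout are assumptions
# used only to illustrate the scale of the reduction.
import numpy as np

def pack_4bit(codes):
    """Pack 4-bit integer codes (values 0..15) two per byte."""
    codes = codes.astype(np.uint8).reshape(-1, 2)
    return (codes[:, 0] << np.uint8(4)) | codes[:, 1]

n_docs, n_terms = 1_000, 20_000
X = np.random.rand(n_docs, n_terms).astype(np.float32)    # full-precision weights
codes = np.round(X * 15).astype(np.uint8)                  # 4-bit code per weight

packed = pack_4bit(codes)
print(f"float32 model : {X.nbytes / 2**20:6.1f} MiB")      # ~76 MiB
print(f"packed 4-bit  : {packed.nbytes / 2**20:6.1f} MiB") # ~9.5 MiB, 8x smaller
```

On an FPGA, the narrower word width additionally shrinks the arithmetic units, which is where the latency and energy benefits mentioned above come from.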
Nowadays, neural networks are one of the most popular machine learning tools used to solve NLP problems. Our further research will focus on testing precision reduction on distributional representations, which are typically used as inputs to neural networks. It is not uncommon for neural networks to have millions of parameters (e.g. AlexNet, ResNet-152, Inception-ResNet); reducing the precision of the vector weights is an interesting research direction, which will be pursued in our future work. Comparative studies of compressed deep learning models against the reduced VSM representations and machine learning models presented in this article can show which method needs less storage and can be run in fewer cycles without a significant drop in performance.