APPENDIX
7.3 Hardware Details
We compute the number of logic gates required for
each integer operation.
7.4 Addition
A half-adder (HA) circuit is made up of 1 XOR gate
and 1 AND gate, while the full-adder (FA) circuit re-
quires 2 XOR gates, 2 AND gates and 1 OR gate.
Therefore, the cost of an n-bit addition is
HA + (n − 1) × FA
= (1 XOR + 1 AND) + (n − 1) × (2 XOR + 2 AND + 1 OR)
= (2n − 1) AND + (2n − 1) XOR + (n − 1) OR
≈ 5n − 3 gates.
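To make the count concrete, the following short Python sketch (not part of the original paper; it assumes every gate has unit cost, as in the analysis above) tabulates the gates of a ripple-carry adder built from one HA and n − 1 FAs and checks the total against the closed form 5n − 3.

# Sketch only: unit cost per gate; ripple-carry adder = 1 HA + (n - 1) FAs.
HA = {"XOR": 1, "AND": 1}            # half-adder gate counts
FA = {"XOR": 2, "AND": 2, "OR": 1}   # full-adder gate counts

def addition_gates(n):
    """Gate counts for an n-bit ripple-carry addition."""
    return {g: HA.get(g, 0) + (n - 1) * FA.get(g, 0) for g in ("XOR", "AND", "OR")}

for n in (8, 16, 32):
    counts = addition_gates(n)
    # The total agrees with the closed form 5n - 3 derived above.
    print(n, counts, sum(counts.values()), 5 * n - 3)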
7.5 Multiplication
A common multiplier architecture usually includes (n − 1) n-bit adders in addition to the n² AND gates; see Figure 5, top panels. Each n-bit adder is composed of one half-adder (HA) and n − 1 full-adders (FA). We consider an n-bit adder as the building block in our theoretical analysis, although it could be optimized further.
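Combining the counts above under the same unit-cost assumption (a sketch, not taken from the paper), an n-bit multiplication then needs n² AND gates for the partial products plus (n − 1) n-bit additions, i.e. roughly n² + (n − 1)(5n − 3) = 6n² − 8n + 3 gates. The short Python check below compares this with the 5n − 3 gates of a single addition.

# Sketch only: array-style multiplier = n^2 AND gates + (n - 1) n-bit adders,
# with each n-bit addition costing about 5n - 3 gates (Section 7.4).
def multiplication_gates(n):
    """Approximate gate count for an n-bit multiplication."""
    addition = 5 * n - 3
    return n * n + (n - 1) * addition   # = 6n^2 - 8n + 3

for n in (8, 16, 32):
    mult, add = multiplication_gates(n), 5 * n - 3
    print(f"n={n}: multiplication ~{mult} gates, addition ~{add} gates, ratio ~{mult / add:.1f}x")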