Code Size and Accuracy-aware Synthesis of Fixed-point Programs for Matrix Multiplication

Matthieu Martel, Amine Najahi, Guillaume Revy

Abstract

In digital signal processing, many primitives boil down to a matrix multiplication. To save time, energy, and on-chip area, these primitives are often implemented in fixed-point arithmetic. Writing fixed-point code is complicated by conflicting goals such as numerical accuracy, runtime latency, and code size. In this article, we introduce a new methodology to automate the synthesis of small and accurate code for matrix multiplication in fixed-point arithmetic. Our approach relies on a heuristic that merges matrix rows or columns to reduce the size of the synthesized code while guaranteeing a target accuracy. We propose a merging strategy based on finding closest pairs of vectors, which makes it possible to synthesize, in a few seconds, small and accurate code for matrix multiplications of size 64 and larger. Finally, we illustrate its efficiency on a set of benchmarks and show that it reduces the synthesized code size by more than 50% while maintaining good numerical properties.
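
The merging idea can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each matrix row is described by a vector of value ranges (intervals), that two rows are "close" when their ranges are similar, and that merging two rows means taking the component-wise hull of their ranges so that one dot-product code can serve both. The names `hull`, `distance`, and `merge_closest`, the choice of distance, and the stopping rule (a target number of codes) are all hypothetical. The accuracy check that the method guarantees is omitted here.

```python
from itertools import combinations

def hull(u, v):
    """Component-wise interval hull of two interval vectors (assumed merge rule)."""
    return [(min(a_lo, b_lo), max(a_hi, b_hi))
            for (a_lo, a_hi), (b_lo, b_hi) in zip(u, v)]

def distance(u, v):
    """Largest endpoint gap over all components (assumed closeness measure)."""
    return max(max(abs(a_lo - b_lo), abs(a_hi - b_hi))
               for (a_lo, a_hi), (b_lo, b_hi) in zip(u, v))

def merge_closest(vectors, target_count):
    """Greedily merge the closest pair of groups until target_count groups remain.

    Returns a list of (member_indices, merged_interval_vector) pairs.
    A real synthesis tool would also reject a merge that breaks the accuracy
    target; this sketch does not model that step.
    """
    if target_count < 1:
        raise ValueError("target_count must be at least 1")
    groups = [([i], list(v)) for i, v in enumerate(vectors)]
    while len(groups) > target_count:
        # Brute-force closest pair; quadratic in the number of groups.
        i, j = min(combinations(range(len(groups)), 2),
                   key=lambda p: distance(groups[p[0]][1], groups[p[1]][1]))
        merged = (groups[i][0] + groups[j][0], hull(groups[i][1], groups[j][1]))
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return groups

# Three rows with two entries each; rows 0 and 1 have nearly identical ranges.
rows = [
    [(-1.0, 1.0), (0.0, 2.0)],
    [(-1.0, 1.0), (0.0, 2.5)],
    [(-8.0, 8.0), (-4.0, 4.0)],
]
groups = merge_closest(rows, 2)
```

With these inputs, rows 0 and 1 end up sharing one merged range vector and row 2 keeps its own, so two dot-product codes would be generated instead of three. The brute-force pair search is used only for brevity.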



Paper Citation


in Harvard Style

Martel M., Najahi A. and Revy G. (2014). Code Size and Accuracy-aware Synthesis of Fixed-point Programs for Matrix Multiplication. In Proceedings of the 4th International Conference on Pervasive and Embedded Computing and Communication Systems - Volume 1: PECCS, ISBN 978-989-758-000-0, pages 204-214. DOI: 10.5220/0004884802040214


in Bibtex Style

@conference{peccs14,
author={Matthieu Martel and Amine Najahi and Guillaume Revy},
title={Code Size and Accuracy-aware Synthesis of Fixed-point Programs for Matrix Multiplication},
booktitle={Proceedings of the 4th International Conference on Pervasive and Embedded Computing and Communication Systems - Volume 1: PECCS},
year={2014},
pages={204-214},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004884802040214},
isbn={978-989-758-000-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Conference on Pervasive and Embedded Computing and Communication Systems - Volume 1: PECCS
TI - Code Size and Accuracy-aware Synthesis of Fixed-point Programs for Matrix Multiplication
SN - 978-989-758-000-0
AU - Martel M.
AU - Najahi A.
AU - Revy G.
PY - 2014
SP - 204
EP - 214
DO - 10.5220/0004884802040214