INCLUDING IMPROVEMENT OF THE EXECUTION TIME IN A SOFTWARE ARCHITECTURE OF LIBRARIES WITH SELF-OPTIMISATION

Luis-Pedro García, Javier Cuenca, Domingo Giménez

2007

Abstract

The design of hierarchies of libraries helps to obtain modular and efficient sets of routines to solve problems of specific fields. An example is ScaLAPACK’s hierarchy in the field of parallel linear algebra. To facilitate the efficient execution of these routines, the inclusion of self-optimization techniques in the hierarchy has been analysed. The routines at a level of the hierarchy use information generated by routines from lower levels. But sometimes, the information generated at one level is not accurate enough to be used satisfactorily at higher levels, and a remodelling of the routines is necessary. A remodelling phase is proposed and analysed with a Strassen matrix multiplication.
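
The abstract does not describe the authors' implementation or their execution-time model. As a minimal sketch of the kind of decision a self-optimising routine has to make for a Strassen multiplication, the Python example below exposes the size at which recursion stops as a tunable parameter and picks it by timing a few candidates on the current machine. The function names (`naive_mul`, `strassen`, `pick_cutoff`), the `cutoff` parameter, and the empirical timing approach are illustrative assumptions, not taken from the paper.

```python
import random
import time


def naive_mul(A, B):
    """Classical O(n^3) product of two square matrices (lists of rows)."""
    Bt = list(zip(*B))  # columns of B
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def split(A):
    """Split an even-order matrix into its four quadrants."""
    h = len(A) // 2
    return ([r[:h] for r in A[:h]], [r[h:] for r in A[:h]],
            [r[:h] for r in A[h:]], [r[h:] for r in A[h:]])


def strassen(A, B, cutoff):
    """Strassen product; uses naive_mul when n <= cutoff or n is odd.

    `cutoff` is a hypothetical tuning parameter for this sketch.
    """
    n = len(A)
    if n <= cutoff or n % 2:
        return naive_mul(A, B)
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)
    # The seven recursive products of Strassen's scheme.
    m1 = strassen(add(a11, a22), add(b11, b22), cutoff)
    m2 = strassen(add(a21, a22), b11, cutoff)
    m3 = strassen(a11, sub(b12, b22), cutoff)
    m4 = strassen(a22, sub(b21, b11), cutoff)
    m5 = strassen(add(a11, a12), b22, cutoff)
    m6 = strassen(sub(a21, a11), add(b11, b12), cutoff)
    m7 = strassen(sub(a12, a22), add(b21, b22), cutoff)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    # Reassemble the quadrants row by row.
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom


def pick_cutoff(n, candidates, seed=0):
    """Time each candidate cutoff on random n x n inputs; return the fastest.

    This empirical selection is an assumption of the sketch, not the
    paper's modelling or remodelling procedure.
    """
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n)] for _ in range(n)]
    B = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for c in candidates:
        start = time.perf_counter()
        strassen(A, B, c)
        timings[c] = time.perf_counter() - start
    return min(timings, key=timings.get), timings
```

A call such as `pick_cutoff(64, [8, 16, 32, 64])` returns the fastest candidate together with the measured times. The chosen value depends on the machine and on the underlying multiplication routine, which is why a library with self-optimisation would determine it at installation or run time rather than fix it in the source.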


Paper Citation


in Harvard Style

García L., Cuenca J. and Giménez D. (2007). INCLUDING IMPROVEMENT OF THE EXECUTION TIME IN A SOFTWARE ARCHITECTURE OF LIBRARIES WITH SELF-OPTIMISATION. In Proceedings of the Second International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-989-8111-06-7, pages 156-161. DOI: 10.5220/0001337501560161


in Bibtex Style

@conference{icsoft07,
author={Luis-Pedro García and Javier Cuenca and Domingo Giménez},
title={INCLUDING IMPROVEMENT OF THE EXECUTION TIME IN A SOFTWARE ARCHITECTURE OF LIBRARIES WITH SELF-OPTIMISATION},
booktitle={Proceedings of the Second International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2007},
pages={156-161},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001337501560161},
isbn={978-989-8111-06-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Second International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - INCLUDING IMPROVEMENT OF THE EXECUTION TIME IN A SOFTWARE ARCHITECTURE OF LIBRARIES WITH SELF-OPTIMISATION
SN - 978-989-8111-06-7
AU - García L.
AU - Cuenca J.
AU - Giménez D.
PY - 2007
SP - 156
EP - 161
DO - 10.5220/0001337501560161