OPTIMIZING SKELETAL STREAM PROCESSING FOR DIVIDE AND CONQUER

Michael Poldner, Herbert Kuchen

2008

Abstract

Algorithmic skeletons aim to simplify parallel programming by providing recurring forms of program structure as predefined components. We present a new distributed task-parallel skeleton for a very general class of divide and conquer algorithms on MIMD machines with distributed memory. Our approach combines skeleton-internal task parallelism with stream parallelism. We compare it to alternative topologies for a task-parallel divide and conquer skeleton with respect to their suitability for solving streams of divide and conquer problems. Based on experimental results for matrix chain multiplication problems, we show that the new approach achieves better processor load and memory utilization of the engaged solvers and reduces communication costs.
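
To illustrate what a divide and conquer skeleton looks like as a predefined component, the following is a minimal sequential sketch in Python. It is not the distributed skeleton presented in the paper: it has no task or stream parallelism, and all names (`divide_and_conquer`, `solve_stream`, `is_simple`, `solve`, `divide`, `combine`) are hypothetical choices made for this illustration. Merge sort stands in for the problem class only because it is short; the paper's experiments use matrix chain multiplication.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

P = TypeVar("P")  # problem type
S = TypeVar("S")  # solution type


def divide_and_conquer(
    problem: P,
    is_simple: Callable[[P], bool],
    solve: Callable[[P], S],
    divide: Callable[[P], List[P]],
    combine: Callable[[List[S]], S],
) -> S:
    """Generic skeleton: the user supplies only the four problem-specific functions."""
    if is_simple(problem):
        return solve(problem)
    # Each subproblem is independent, which is what a parallel skeleton would exploit.
    # Here they are simply solved one after another.
    partial = [
        divide_and_conquer(sub, is_simple, solve, divide, combine)
        for sub in divide(problem)
    ]
    return combine(partial)


def solve_stream(
    problems: Iterable[P],
    is_simple: Callable[[P], bool],
    solve: Callable[[P], S],
    divide: Callable[[P], List[P]],
    combine: Callable[[List[S]], S],
) -> Iterator[S]:
    """Apply the skeleton to every problem of an input stream, in order."""
    for problem in problems:
        yield divide_and_conquer(problem, is_simple, solve, divide, combine)


# Illustrative instantiation (an assumption of this sketch): merge sort.
def halve(xs: List[int]) -> List[List[int]]:
    mid = len(xs) // 2
    return [xs[:mid], xs[mid:]]


def merge(parts: List[List[int]]) -> List[int]:
    left, right = parts
    merged: List[int] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def mergesort_stream(stream: Iterable[List[int]]) -> List[List[int]]:
    return list(
        solve_stream(stream, lambda xs: len(xs) <= 1, lambda xs: list(xs), halve, merge)
    )
```

In this sketch the recursion scheme lives entirely in `divide_and_conquer`, so a different algorithm is obtained by swapping the four argument functions. A parallel implementation could, under the same interface, distribute the independent subproblems across several solvers.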



Paper Citation


in Harvard Style

Poldner M. and Kuchen H. (2008). OPTIMIZING SKELETAL STREAM PROCESSING FOR DIVIDE AND CONQUER. In Proceedings of the Third International Conference on Software and Data Technologies - Volume 1: ICSOFT, ISBN 978-989-8111-51-7, pages 181-189. DOI: 10.5220/0001889301810189


in Bibtex Style

@conference{icsoft08,
author={Michael Poldner and Herbert Kuchen},
title={OPTIMIZING SKELETAL STREAM PROCESSING FOR DIVIDE AND CONQUER},
booktitle={Proceedings of the Third International Conference on Software and Data Technologies - Volume 1: ICSOFT},
year={2008},
pages={181-189},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001889301810189},
isbn={978-989-8111-51-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Software and Data Technologies - Volume 1: ICSOFT
TI - OPTIMIZING SKELETAL STREAM PROCESSING FOR DIVIDE AND CONQUER
SN - 978-989-8111-51-7
AU - Poldner M.
AU - Kuchen H.
PY - 2008
SP - 181
EP - 189
DO - 10.5220/0001889301810189