Authors:
Sammy Rogmans 1; Philippe Bekaert 2 and Gauthier Lafruit 3
Affiliations:
1 Hasselt University – tUL – IBBT, Expertise centre for Digital Media; Multimedia Group, IMEC, Belgium
2 Hasselt University – tUL – IBBT, Expertise centre for Digital Media, Belgium
3 Multimedia Group, IMEC, Belgium
Keyword(s):
High-level, Transformation, Rule set, GPU, Efficient.
Related Ontology Subjects/Areas/Topics:
Applications; Cardiovascular Imaging and Cardiography; Cardiovascular Technologies; Health Engineering and Technology Applications; Image and Video Processing, Compression and Segmentation; Multimedia; Multimedia and Communications; Multimedia Signal Processing; Pattern Recognition; Performance Measurement and Evaluation, QoS; Signal Processing; Software Engineering; Telecommunications
Abstract:
This paper proposes a high-level rule set that allows algorithm designers to optimize their implementations on graphics hardware with minimal design effort. The rules suggest possible kernel splits and merges that transform the kernels of the original design, resulting in an inter-kernel rather than a low-level intra-kernel optimization. The rules consider both traditional texture caches and next-generation shared memory, as used in abstract stream-centric paradigms such as CUDA and Brook+, and can therefore be applied implicitly in most generic streaming applications on graphics hardware.