GPGPU kernel implementation and refinement using Obsidian
Obsidian is a domain specific language for data-parallel programming on graphics processors (GPUs). It is embedded in the functional programming language Haskell. The user writes code using constructs familiar from Haskell (like map and reduce), recursion and some specially designed combinators for combining GPU programs. NVIDIA CUDA code is generated from these high level descriptions, and passed to the nvcc compiler . Currently, we consider only the generation of single kernels, and not their coordination. This paper is focussed on how the user should work with Obsidian, starting with an obviously correct (or welltested) description of the required function, and refining it by the introduction of constructs to give finer control of the computation on the GPU. For some combinators, this approach results in CUDA code with satisfactory performance, promising increased productivity, as the high level descriptions are short and uncluttered. But for other combinators, the performance of generated code is not yet satisfactory. Ways to tackle this problem and plans to integrate Obsidian with another higher-level embedded language for GPU programming in Haskell are briefly discussed.