Compile OpenCL Kernel into LLVM-IR or Nvidia PTX

At the time of writing, the latest LLVM release is 3.7, and libclc supports the OpenCL standard up to version 1.1.

Environment:

If you are new to LLVM, please refer to Getting Started with the LLVM System to build and install LLVM/Clang first.

With LLVM/Clang installed, let's install libclc.

git clone http://llvm.org/git/libclc.git
cd libclc
./configure.py
make -j4
make install

Grab a test OpenCL kernel, like this one:

// matvec.cl
__kernel void matvec_mult(__global float4* matrix,
                          __global float4* vector,
                          __global float* result) {
   
   int i = get_global_id(0);
   result[i] = dot(matrix[i], vector[0]);
}

Use the following command to compile it into LLVM IR:

clang -Dcl_clang_storage_class_specifiers -isystem libclc/generic/include -include clc/clc.h -target nvptx--nvidiacl -xcl matvec.cl -emit-llvm -S -o matvec.ll

Some notes on the command:

  • -target specifies the target GPU hardware and driver, e.g. -target nvptx--nvidiacl, -target nvptx64--nvidiacl, or -target r600 (for the AMD R600 GPU driver).
  • -xcl tells clang to treat matvec.cl as an OpenCL kernel file.
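To try the same command against a different target, the target triple can be factored out. The sketch below is only an illustration: emit_ir_cmd is a hypothetical helper name, not part of clang or libclc, and it assumes libclc was cloned into ./libclc as in the steps above. It prints the command instead of running it, so it can be inspected first.

```shell
# emit_ir_cmd is a hypothetical helper (not part of clang or libclc).
# It prints the clang command from this post for a given target triple.
# Assumes libclc was cloned into ./libclc, as in the steps above.
emit_ir_cmd() {
  target="$1"   # e.g. nvptx--nvidiacl or nvptx64--nvidiacl
  src="$2"      # OpenCL source file, e.g. matvec.cl
  out="$3"      # output LLVM IR file, e.g. matvec.ll
  printf '%s\n' "clang -Dcl_clang_storage_class_specifiers -isystem libclc/generic/include -include clc/clc.h -target ${target} -xcl ${src} -emit-llvm -S -o ${out}"
}

# Print the command for the 64-bit Nvidia target.
emit_ir_cmd nvptx64--nvidiacl matvec.cl matvec.ll
```

Once the printed command looks right, it can be executed by piping it to a shell, e.g. emit_ir_cmd nvptx64--nvidiacl matvec.cl matvec.ll | sh.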

This gives us the LLVM IR for the OpenCL kernel in matvec.ll.

Next, we'll compile it into Nvidia PTX.

The following command links the OpenCL kernel with the built-in implementations from libclc:

llvm-link /usr/local/lib/clc/nvptx--nvidiacl.bc matvec.ll -o matvec.linked.bc

(Remember to link the .bc file that corresponds to your target; it is not always nvptx--nvidiacl.bc.)
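One way to avoid linking the wrong library is to look the file up from the target triple before calling llvm-link. This is a rough sketch under two assumptions: libclc was installed under /usr/local/lib/clc (as in the command above), and the bitcode file is named after the target, as nvptx--nvidiacl.bc is. Other targets may use different file names, so treat a miss as a prompt to list the directory. libclc_bc and CLC_DIR are hypothetical names introduced here for illustration.

```shell
# libclc_bc is a hypothetical helper. It assumes the libclc bitcode lives in
# /usr/local/lib/clc (override with the hypothetical CLC_DIR variable) and is
# named <target>.bc, like nvptx--nvidiacl.bc. Other targets may differ.
libclc_bc() {
  target="$1"
  bc="${CLC_DIR:-/usr/local/lib/clc}/${target}.bc"
  if [ -f "$bc" ]; then
    printf '%s\n' "$bc"          # found: print the full path
  else
    echo "no libclc bitcode for ${target} at ${bc}" >&2
    return 1                     # not found: let the caller decide
  fi
}

# Resolve the library for the 64-bit Nvidia target, if it is installed.
libclc_bc nvptx64--nvidiacl || echo "check the directory: ls ${CLC_DIR:-/usr/local/lib/clc}"
```

The printed path can then be passed to llvm-link in place of the hard-coded one.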

Then use clang to generate the final Nvidia PTX code with the following command:

clang -target nvptx--nvidiacl matvec.linked.bc -S -o matvec.nvptx.s

matvec.nvptx.s is the final Nvidia PTX code.
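The three steps can be chained in a small script. The following is a minimal sketch, not a definitive build script: build_ptx, run_step, and the DRY_RUN variable are hypothetical names, and it assumes libclc is cloned in ./libclc, installed under /usr/local/lib/clc, and that the bitcode file is named after the target. By default it only prints the commands.

```shell
# run_step and build_ptx are hypothetical helpers wrapping the commands above.
# With DRY_RUN=1 (the default here) each command is printed, not executed.
run_step() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

build_ptx() {
  src="$1"                         # OpenCL source, e.g. matvec.cl
  target="${2:-nvptx--nvidiacl}"   # target triple
  base="${src%.cl}"                # file name without the .cl suffix

  # 1. OpenCL C -> LLVM IR
  run_step clang -Dcl_clang_storage_class_specifiers \
    -isystem libclc/generic/include -include clc/clc.h \
    -target "$target" -xcl "$src" -emit-llvm -S -o "${base}.ll" || return 1

  # 2. Link with the libclc built-ins (assumes the file is <target>.bc)
  run_step llvm-link "/usr/local/lib/clc/${target}.bc" "${base}.ll" \
    -o "${base}.linked.bc" || return 1

  # 3. Linked bitcode -> PTX
  run_step clang -target "$target" "${base}.linked.bc" -S -o "${base}.nvptx.s"
}

build_ptx matvec.cl
```

Setting DRY_RUN=0 before calling build_ptx runs the commands for real; a failing step stops the chain so a stale .ll or .bc file is not silently reused.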
