Compile OpenCL Kernel into LLVM-IR or Nvidia PTX
At the time of writing, the latest LLVM release is 3.7, and libclc supports up to the OpenCL 1.1 standard.
Environment:
If you are a complete beginner, please refer to Getting Started with the LLVM System to build and install LLVM/Clang.
Now that you have LLVM/Clang installed, let's install libclc.
git clone http://llvm.org/git/libclc.git
cd libclc
./configure.py
make -j4
make install
Grab a test OpenCL kernel, like this one:
// matvec.cl
__kernel void matvec_mult(__global float4* matrix,
                          __global float4* vector,
                          __global float* result) {
    int i = get_global_id(0);
    result[i] = dot(matrix[i], vector[0]);
}
Use the following command to compile it.
clang -Dcl_clang_storage_class_specifiers -isystem libclc/generic/include -include clc/clc.h -target nvptx--nvidiacl -xcl matvec.cl -emit-llvm -S -o matvec.ll
Some notes about the command:
- -target specifies the target GPU hardware and driver, e.g. -target nvptx--nvidiacl, -target nvptx64--nvidiacl, or -target r600 (for the AMD R600 GPU driver).
- -xcl tells clang to treat matvec.cl as an OpenCL kernel file.
This gives us the LLVM IR for the OpenCL kernel.
Next, we'll compile it into Nvidia PTX.
The following command links the OpenCL Kernel with built-in implementations from libclc:
llvm-link /usr/local/lib/clc/nvptx--nvidiacl.bc matvec.ll -o matvec.linked.bc
(Remember to link the .bc file that corresponds to your target; it is not always nvptx--nvidiacl.bc.)
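To make picking the matching .bc file less error-prone, here is a minimal sketch of a shell helper. The function name clc_bitcode_for is made up for illustration and is not part of libclc. It assumes the install directory /usr/local/lib/clc from the llvm-link command above and a <triple>.bc naming scheme, which fits the nvptx targets used here; other targets may name their files differently, so check the directory listing first.

```shell
# Hypothetical helper (not part of libclc): print the bitcode library for a target.
# Usage: clc_bitcode_for <target-triple> [libdir]
clc_bitcode_for() (
    target="$1"
    libdir="${2:-/usr/local/lib/clc}"   # install directory used in the llvm-link command
    path="$libdir/$target.bc"           # assumed naming scheme: <triple>.bc

    # Fail loudly instead of handing llvm-link a path that does not exist.
    if [ ! -f "$path" ]; then
        echo "no libclc bitcode for target '$target' in $libdir" >&2
        exit 1
    fi
    printf '%s\n' "$path"
)
```

It could then be used as llvm-link "$(clc_bitcode_for nvptx64--nvidiacl)" matvec.ll -o matvec.linked.bc, so the linked library always follows the target name.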
Then we use clang to generate the final Nvidia PTX code with the following command:
clang -target nvptx--nvidiacl matvec.linked.bc -S -o matvec.nvptx.s
matvec.nvptx.s is the final Nvidia PTX code.
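The three steps can be tied together in a small script. This is only a sketch: build_commands is a hypothetical name, and the function prints the commands instead of running them so you can inspect them first. The include path and library directory are copied from the commands in this post and are assumptions about where your libclc checkout and install live.

```shell
# Hypothetical helper: print the compile, link, and PTX commands for one kernel.
# Usage: build_commands <kernel.cl> [target-triple]
build_commands() (
    src="$1"
    target="${2:-nvptx--nvidiacl}"
    base="${src%.cl}"                   # matvec.cl -> matvec
    clc_include="libclc/generic/include"
    clc_libdir="/usr/local/lib/clc"

    # Step 1: OpenCL C -> LLVM IR
    echo "clang -Dcl_clang_storage_class_specifiers -isystem $clc_include -include clc/clc.h -target $target -xcl $src -emit-llvm -S -o $base.ll"
    # Step 2: link against the libclc built-ins for the same target
    echo "llvm-link $clc_libdir/$target.bc $base.ll -o $base.linked.bc"
    # Step 3: linked bitcode -> PTX assembly
    echo "clang -target $target $base.linked.bc -S -o $base.nvptx.s"
)
```

Running build_commands matvec.cl shows the commands; piping the output to sh -e executes them in order and stops at the first failure.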