Bruno Villasenor
2015-05-31 01:58:32 UTC
Hello pycuda-users:
I'm trying to compile the simplest code that uses dynamic parallelism with
the regular SourceModule. My code:
------------------------------------------------------------------------
import numpy as np
import pycuda.driver as cuda
from pycuda.compiler import SourceModule
import pycuda.autoinit

cudaCodeString = """
__global__ void ChildKernel(void* data){
    //Operate on data
}
__global__ void ParentKernel(void *data){
    if (threadIdx.x == 0) {
        ChildKernel<<<1, 32>>>(data);
        cudaThreadSynchronize();
    }
    __syncthreads();
    //Operate on data
}
"""
cudaCode = SourceModule(cudaCodeString, options=['-rdc=true', '-lcudart'],
                        arch='compute_35')
-------------------------------------------------------------------------------
I get the following error:
---------------------------------------------------------------------------------
pycuda.driver.CompileError: nvcc compilation of /tmp/tmpJJo9kU/kernel.cu failed
[command: nvcc --cubin -rdc=true -lcudart -arch compute_35 -I/usr/local/lib/python2.7/dist-packages/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/cuda kernel.cu]
[stderr:
nvcc fatal : Option '--cubin (-cubin)' is not allowed when compiling for a virtual compute architecture
-----------------------------------------------------------------------------------
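The nvcc message points at the arch value: 'compute_35' names a virtual architecture, while --cubin (which SourceModule passes, as the command above shows) needs a real one such as 'sm_35'. Below is a minimal sketch of that mapping. The helper name to_real_arch is hypothetical and not part of PyCUDA, and whether this alone is enough for dynamic parallelism is an assumption, not something verified here.

```python
import re

def to_real_arch(arch):
    """Map a virtual arch name such as 'compute_35' to the real 'sm_35'.

    Hypothetical helper for illustration; not part of PyCUDA or nvcc.
    Names that are already real ('sm_35') come back unchanged.
    """
    match = re.match(r"^(compute|sm)_(\d+[a-z]?)$", arch)
    if match is None:
        raise ValueError("unrecognized arch string: %r" % (arch,))
    # Keep the numeric capability, swap the prefix for the real-arch one.
    return "sm_" + match.group(2)

print(to_real_arch("compute_35"))  # prints sm_35
```

The result could then be passed as arch=to_real_arch('compute_35') in the SourceModule call. This would only address the --cubin complaint; code with device-side kernel launches may still need to be linked against the device runtime library (cudadevrt), so further errors are possible.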
CUDA version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2013 NVIDIA Corporation
Built on Wed_Jul_17_18:36:13_PDT_2013
Cuda compilation tools, release 5.5, V5.5.0
Driver version: 331.38
--------------------------------------------------------------------------------------
Any ideas?
Is anyone successfully using dynamic parallelism with pycuda?
Thanks in advance.
Bruno