Discussion:
[PyCUDA] How to debug "cuMemcpyDtoH failed"
Lu, Xinghua
2015-05-08 13:53:15 UTC
Dear all,

I am new to PyCUDA, and I would appreciate your help.

I was able to write a few short PyCUDA programs, but I have run into a roadblock with the one at hand.

The code snippet is as follows:

tumorLnFScore = np.zeros((nTumorMutGenes, nTumorDEGs)).astype(np.float32)
gpu_tumorLnFScore = cuda.mem_alloc(tumorLnFScore.nbytes)
## bunch of other initialization of GPU variables for func call

func = mod.get_function("PanCanTDIMarginalGPU")
func(gpu_mutcnaMatrix, gpu_degMatrix, gpu_nTumors, gpu_tumormutGeneIndx, gpu_nTumorGTs,\
gpu_degGeneIndx, gpu_nTumorDEGs, gpu_tumorLnFScore, gpu_cancerTypeColIndx,\
gpu_ge1stDriverIndices, gpu_ge2ndDriverIndices, block=(blocksize, 1, 1), grid=(nBlockInGrid, 1))

cuda.memcpy_dtoh(tumorLnFScore, gpu_tumorLnFScore)

However, pyCuda returned the following error:

File "/home/kevin/GroupDropbox/TDI/PanCanTDIGPU.py", line 421, in calcPanCanTDIGPU

cuda.memcpy_dtoh(tumorLnFScore, gpu_tumorLnFScore)

pycuda._driver.LogicError: cuMemcpyDtoH failed: invalid/unknown error code

PyCUDA WARNING: a clean-up operation failed (dead context maybe?)

cuMemFree failed: invalid/unknown error code


My question is:

What would be the most common cause of the above error, on the PyCUDA side or on the CUDA C/C++ side? Thanks in advance for your help.

Best,
Xinghua

--

Xinghua Lu,
Andreas Kloeckner
2015-05-09 02:51:06 UTC
Post by Lu, Xinghua
I am new to PyCUDA, and I would appreciate your help.
I was able to write a few short PyCUDA programs, but I have run into a roadblock with the one at hand.
tumorLnFScore = np.zeros((nTumorMutGenes, nTumorDEGs)).astype(np.float32)
gpu_tumorLnFScore = cuda.mem_alloc(tumorLnFScore.nbytes)
## bunch of other initialization of GPU variables for func call
func = mod.get_function("PanCanTDIMarginalGPU")
func(gpu_mutcnaMatrix, gpu_degMatrix, gpu_nTumors, gpu_tumormutGeneIndx, gpu_nTumorGTs,\
gpu_degGeneIndx, gpu_nTumorDEGs, gpu_tumorLnFScore, gpu_cancerTypeColIndx,\
gpu_ge1stDriverIndices, gpu_ge2ndDriverIndices, block=(blocksize, 1, 1), grid=(nBlockInGrid, 1))
cuda.memcpy_dtoh(tumorLnFScore, gpu_tumorLnFScore)
File "/home/kevin/GroupDropbox/TDI/PanCanTDIGPU.py", line 421, in calcPanCanTDIGPU
cuda.memcpy_dtoh(tumorLnFScore, gpu_tumorLnFScore)
pycuda._driver.LogicError: cuMemcpyDtoH failed: invalid/unknown error code
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: invalid/unknown error code
What would be the most common cause of the above error, on the PyCUDA
side or on the CUDA C/C++ side? Thanks in advance for your help.
You probably have a bug in one of your kernels, which caused the GPU
equivalent of a segfault, as a result of which your GPU context got killed.
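
One way to confirm this is to synchronize right after the launch, so the
failure gets reported against the kernel itself rather than against the
next memcpy. A minimal sketch against your snippet (same launch arguments
as in your post):

import pycuda.driver as cuda  # already bound as 'cuda' in your script

func(gpu_mutcnaMatrix, gpu_degMatrix, gpu_nTumors, gpu_tumormutGeneIndx,
     gpu_nTumorGTs, gpu_degGeneIndx, gpu_nTumorDEGs, gpu_tumorLnFScore,
     gpu_cancerTypeColIndx, gpu_ge1stDriverIndices, gpu_ge2ndDriverIndices,
     block=(blocksize, 1, 1), grid=(nBlockInGrid, 1))

try:
    # Kernel launches are asynchronous: a crash inside the kernel is only
    # reported by the next synchronous driver call, which in your code
    # happens to be memcpy_dtoh. Synchronizing here pins it on the kernel.
    cuda.Context.synchronize()
except cuda.Error as e:
    print("kernel itself failed:", e)
    raise

Running the whole script under cuda-memcheck (it ships with the CUDA
toolkit) will additionally report the faulting thread and the
out-of-bounds address.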

HTH,
Andreas
x***@pitt.edu
2015-05-09 11:12:16 UTC
Thanks, Andreas.

I found the region of the code causing the problem, which puzzles me even
more because it is a simple matrix lookup. The question is: under what
circumstances can a memory read/write cause a segfault? Does CUDA keep
track of array sizes? Thanks in advance for your help.

Best,
Xinghua



The code is as follows:

// vectors to hold tumors with GT == 1 and GT == 0
int* vecTumorWithGT = new int[nTumors];
int* vecTumorWithoutGT = new int[nTumors];

if (myThreadID < nTumorDEGs)
{
    // loop through GT rows
    for (int g = 0; g < nTumorGTs; g++)
    {
        int curGTIndx = tumormutGeneIndx[g];  // look up row

        // scan rows with a particular GT == 1 and == 0
        int nTumorGTOnes = 0;
        int nTumorGTZeros = 0;
        int myRowStart = nTumors * curGTIndx;

        // scan through columns
        for (int n = 0; n < nTumors; n++)
        {
            if (mutcnaMatrix[myRowStart + n] == 1) {
                vecTumorWithGT[nTumorGTOnes++] = n;
            }
            else {
                vecTumorWithoutGT[nTumorGTZeros++] = n;
            }
        }
    }
}  // end if (myThreadID < nTumorDEGs)




Andreas Kloeckner
2015-05-09 16:48:48 UTC
Post by x***@pitt.edu
I found the region of the code causing the problem, which puzzles me even
more because it is a simple matrix lookup. The question is: under what
circumstances can a memory read/write cause a segfault? Does CUDA keep
track of array sizes?
I'm a little fuzzy on the details, but I imagine the GPU has some sort
of page-based memory protection scheme just like the CPU.
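
As far as I know, the driver does not track the logical size of each array
you allocate: a read or write that stays inside mapped memory goes
undetected, while one that strays onto an unmapped page takes down the
whole context, which matches the symptom you saw.

In the loop you posted, two spots are worth guarding. Device-side new
draws from a small per-context heap (8 MB by default) and returns NULL
when that heap is exhausted, and your two vectors are never freed; and if
tumormutGeneIndx ever holds a row index outside mutcnaMatrix, the inner
scan walks straight off the allocation. Below is a sketch of a guarded
variant, restricted to the parameters the snippet actually touches; the
kernel name with the _guarded suffix and the extra bounds parameter
nMutcnaRows are my inventions, not part of your code:

from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void PanCanTDIMarginalGPU_guarded(
    int *mutcnaMatrix, int *tumormutGeneIndx,
    int nTumors, int nTumorGTs, int nTumorDEGs, int nMutcnaRows)
{
    int myThreadID = blockDim.x * blockIdx.x + threadIdx.x;
    if (myThreadID >= nTumorDEGs)
        return;                    // inactive threads skip the allocation

    // Per-thread heap allocation: new returns NULL once the device heap
    // (8 MB unless raised via cuCtxSetLimit) runs out.
    int *vecTumorWithGT    = new int[nTumors];
    int *vecTumorWithoutGT = new int[nTumors];
    if (!vecTumorWithGT || !vecTumorWithoutGT) {
        delete[] vecTumorWithGT;   // delete[] of a null pointer is a no-op
        delete[] vecTumorWithoutGT;
        return;
    }

    for (int g = 0; g < nTumorGTs; g++)
    {
        int curGTIndx = tumormutGeneIndx[g];
        if (curGTIndx < 0 || curGTIndx >= nMutcnaRows)
            continue;              // a stray index would leave mutcnaMatrix

        int nTumorGTOnes  = 0;
        int nTumorGTZeros = 0;
        int myRowStart = nTumors * curGTIndx;
        for (int n = 0; n < nTumors; n++)
        {
            if (mutcnaMatrix[myRowStart + n] == 1)
                vecTumorWithGT[nTumorGTOnes++] = n;
            else
                vecTumorWithoutGT[nTumorGTZeros++] = n;
        }
    }

    delete[] vecTumorWithGT;       // the posted snippet never frees these
    delete[] vecTumorWithoutGT;
}
""")

If the NULL check is what trips, either raise the heap with
cuCtxSetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, ...) before the first launch or,
better, replace the per-thread new with one preallocated global buffer
indexed by thread ID; dynamic allocation inside kernels is slow anyway.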

Andreas
