Discussion:
[PyCUDA] Newbie Question concerning data transfer and kernel compilation
Yuan Chen
2016-02-22 16:40:33 UTC
Hi,

I have just started using PyCUDA for GPU computing.

However, I found that transferring NumPy arrays to the GPU takes a lot of
time, and so does compiling the source.

I am currently using SourceModule. As an example, say I have a file called
try.py and a kernel function called searching(float *arr). My questions
are:

1) Every time I run try.py, the searching function is compiled once and
then cached until the script ends. Can I permanently save the compiled
function and load the saved version, so that I don't have to compile it
each time I run the script?

2) Is there a way to make data transfers faster? I read the documentation;
would managed memory help with this?


Thanks a lot for your help.

Best Regards,

Yuan Chen
Andreas Kloeckner
2016-02-22 16:56:59 UTC
Post by Yuan Chen
Hi,
I have just started using PyCUDA for GPU computing.
However, I found that transferring NumPy arrays to the GPU takes a lot of
time, and so does compiling the source.
I am currently using SourceModule. As an example, say I have a file called
try.py and a kernel function called searching(float *arr). My questions
are:
1) Every time I run try.py, the searching function is compiled once and
then cached until the script ends. Can I permanently save the compiled
function and load the saved version, so that I don't have to compile it
each time I run the script?
PyCUDA caches the binaries for your source code as much as possible. So
once you compile the same code a second time, SourceModule construction
should be quite fast. Are you finding otherwise?
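
If explicit control over the saved binary is wanted on top of the caching described above, one possible approach is to compile the kernel to a cubin once, write it to disk, and load that file on later runs. The following is only a rough sketch: the kernel body is a placeholder (the real searching kernel is not shown in this thread), and cubin_path_for and load_searching are hypothetical helper names. It assumes nvcc and a CUDA-capable GPU are available, and that the cubin is only reused on the GPU architecture it was built for.

```python
import hashlib
import os

# Placeholder kernel: only the name and signature come from the thread.
KERNEL_SOURCE = """
__global__ void searching(float *arr)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    arr[i] *= 2.0f;  /* stand-in body, no bounds check */
}
"""


def cubin_path_for(source, directory="."):
    """Hypothetical helper: derive a file name from a hash of the source,
    so that editing the kernel produces a new file instead of a stale hit."""
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()[:16]
    return os.path.join(directory, "searching_%s.cubin" % digest)


def load_searching(source=KERNEL_SOURCE, directory="."):
    """Compile to a cubin only if no saved copy exists, then load it."""
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as drv
    from pycuda.compiler import compile as nvcc_compile

    path = cubin_path_for(source, directory)
    if not os.path.exists(path):
        cubin = nvcc_compile(source)  # returns the cubin as bytes
        with open(path, "wb") as f:
            f.write(cubin)

    module = drv.module_from_file(path)
    return module.get_function("searching")


if __name__ == "__main__":
    import numpy as np
    import pycuda.driver as drv

    searching = load_searching()
    a = np.ones(1024, dtype=np.float32)
    # 4 blocks of 256 threads cover the 1024 elements exactly.
    searching(drv.InOut(a), block=(256, 1, 1), grid=(4, 1))
```

In most cases this extra step should not be needed, since SourceModule keeps its own on-disk cache; its cache_dir argument can be used to choose where that cache lives.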
Post by Yuan Chen
2) Is there a way that make transfering data faster? I read the documents,
is the managed memory gonna help with this?
Read about page-locked host memory. Those transfers are a fair bit
faster than non-page-locked ones, since the hardware can do them on its
own.
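
To see whether page-locked memory helps on a given machine, the two kinds of host buffers can be timed against each other. The sketch below is an illustration under assumptions, not a benchmark: the helper names (bandwidth_gb_per_s, time_upload, compare_transfers), the array size, and the repeat count are all made up for this example, and the actual speedup depends on the hardware.

```python
def bandwidth_gb_per_s(nbytes, seconds):
    """Convert a transfer of nbytes taking `seconds` into GB/s."""
    return nbytes / seconds / 1e9


def time_upload(drv, dev_buf, host_array, repeats=10):
    """Average host-to-device copy time in seconds, measured with CUDA events."""
    start, end = drv.Event(), drv.Event()
    start.record()
    for _ in range(repeats):
        drv.memcpy_htod(dev_buf, host_array)
    end.record()
    end.synchronize()
    # time_till reports milliseconds.
    return start.time_till(end) * 1e-3 / repeats


def compare_transfers(n=1 << 24):
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as drv

    # Ordinary (pageable) NumPy array.
    pageable = np.random.rand(n).astype(np.float32)

    # Page-locked host array with the same contents.
    pinned = drv.pagelocked_empty(pageable.shape, pageable.dtype)
    pinned[:] = pageable

    dev_buf = drv.mem_alloc(pageable.nbytes)

    for name, host in (("pageable", pageable), ("page-locked", pinned)):
        seconds = time_upload(drv, dev_buf, host)
        print("%-12s %.2f GB/s"
              % (name, bandwidth_gb_per_s(host.nbytes, seconds)))


if __name__ == "__main__":
    compare_transfers()
```

Note that copying an existing NumPy array into the page-locked buffer is itself a host-side copy, so the benefit is largest when the data is produced directly in the page-locked array.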

Andreas
