[PyCUDA] Distributing GPU applications

Andreas Kloeckner

2014-09-08 18:13:19 UTC

Hi Marmaduke,

Post by Marmaduke Woodman
Does anyone have any experience or tips on distributing an application
using PyCUDA to users/computers that have a suitable GPU & driver but
are otherwise unprepared for PyCUDA, i.e. lack the full C++ compiler +
CUDA SDK toolchain?
Otherwise, I suspect the context cache is a place to start: I would
compile all possible kernels, persist the cache, and load the cache at
runtime so that no compilation is necessary. Any information on this
would be welcome.

I personally don't have such packaging experience, but I'd be very
interested in hearing about yours.

Here are some thoughts on this: While making the code cache take care of
this seems reasonable at first, I'd probably suggest adding a secondary
cache-like layer to SourceModule that gives more explicit control.
Specifically, it seems reasonable to add an extra parameter for a kernel
identifier, along with (likely) a global variable that sets where
kernels are stored. Then, SourceModule could run in one of two modes:
First, "Generation", where that directory would be populated with CUBINs
for all known/supported values of the compute capability/shader model.
Second, "Retrieval", where it is forced to use CUBINs from that
directory. I imagine that this would be more robust than relying on the
code cache, as lots of things enter into cache key generation (Python
versions, compiler versions, headers), and a change in any of them could
easily lead to unintended cache misses.
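
To make the two modes a bit more concrete, here is a rough sketch of what
such a layer might look like if built outside of SourceModule, on top of
pycuda.compiler.compile and pycuda.driver.module_from_buffer. None of
this exists in PyCUDA as such: the names KERNEL_DIR, SUPPORTED_ARCHS,
cubin_path, generate and retrieve are made up for illustration, and the
architecture list is only a placeholder that would need to match
whatever your nvcc version can target.

```python
import os

# Hypothetical global setting: where precompiled kernels are stored.
KERNEL_DIR = "kernels"

# Placeholder list; adjust to the architectures your nvcc supports.
SUPPORTED_ARCHS = ("sm_30", "sm_35", "sm_50")


def cubin_path(kernel_id, arch, kernel_dir=None):
    """Return the file name used for one kernel/architecture pair."""
    file_name = "%s.%s.cubin" % (kernel_id, arch)
    return os.path.join(kernel_dir or KERNEL_DIR, file_name)


def generate(kernel_id, source, archs=SUPPORTED_ARCHS, kernel_dir=None):
    """'Generation' mode: run on the developer machine, requires nvcc.

    Compiles `source` once per architecture and stores each CUBIN.
    """
    # Imported lazily so that end-user machines never need the compiler.
    from pycuda.compiler import compile

    target_dir = kernel_dir or KERNEL_DIR
    os.makedirs(target_dir, exist_ok=True)
    for arch in archs:
        cubin = compile(source, arch=arch)  # returns the CUBIN as bytes
        with open(cubin_path(kernel_id, arch, target_dir), "wb") as outf:
            outf.write(cubin)


def retrieve(kernel_id, kernel_dir=None):
    """'Retrieval' mode: run on the user's machine, never calls nvcc.

    Needs an active CUDA context. Raises an error if no CUBIN was
    shipped for this device's compute capability.
    """
    import pycuda.driver as drv

    major, minor = drv.Context.get_device().compute_capability()
    arch = "sm_%d%d" % (major, minor)
    with open(cubin_path(kernel_id, arch, kernel_dir), "rb") as inf:
        return drv.module_from_buffer(inf.read())
```

On the user's side, something like retrieve("my_kernel") followed by
get_function(...) on the returned module would then stand in for the
SourceModule call. Since the lookup depends only on the kernel
identifier and the compute capability, a missing CUBIN shows up as a
clear error rather than as a silent attempt to recompile.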

HTH,
Andreas