Discussion:
[PyCUDA] PyCUDA on Tesla k40c - out of resources
Alistair McDougall
2014-04-02 04:41:59 UTC
Hi,
I have previously been using PyCUDA on a Tesla C2075 as part of my
astrophysics research. We recently installed a Tesla K40c and I was hoping
to just run the same code on the new card; however, I am receiving
"pycuda._driver.LaunchError: cuLaunchKernel failed: launch out of resources"
errors.

A quick Google search for "PyCUDA Tesla K40c" returned only a handful of
results, which led me to wonder whether anyone has tried running PyCUDA on
this card.

As the PyCUDA code works fine on an older card, I am unsure why a newer
card would return an "out of resources" error, when it should have more
available resources. Is there an obvious solution I have overlooked as to
why the same code will not run on the new Tesla card?

Cheers,
Alistair
Andreas Kloeckner
2014-04-02 06:09:16 UTC
Things like number of registers and available shared memory per SM do
vary between cards, and it's quite possible that your old code uses too
much of a given resource for a given card. Shrinking the block size
tends to let the code run, at the expense of some performance.

HTH,
Andreas
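
The point about per-block resource limits can be checked before launching. What follows is a minimal sketch, not a definitive diagnosis: the helper names (launch_fits, largest_fitting_block, query_limits) are hypothetical, the check ignores register allocation granularity, and it assumes the shared memory per block does not grow with the block size. Only query_limits needs PyCUDA and a GPU; it uses Device.get_attribute with the device_attribute enum.

```python
# Rough launch-resource check. All helper names here are hypothetical.

def launch_fits(block_threads, regs_per_thread, shared_bytes, limits):
    """Return True if one block stays within the per-block limits.

    `limits` is a dict with keys 'max_threads', 'max_regs', 'max_shared'.
    Register allocation granularity is ignored, so this is an estimate.
    """
    return (
        block_threads <= limits["max_threads"]
        and block_threads * regs_per_thread <= limits["max_regs"]
        and shared_bytes <= limits["max_shared"]
    )


def largest_fitting_block(regs_per_thread, shared_bytes, limits, warp_size=32):
    """Largest multiple of warp_size that passes launch_fits, or 0 if none."""
    threads = (limits["max_threads"] // warp_size) * warp_size
    while threads > 0:
        if launch_fits(threads, regs_per_thread, shared_bytes, limits):
            return threads
        threads -= warp_size
    return 0


def query_limits(device_index=0):
    """Read the per-block limits from the driver (needs PyCUDA and a GPU)."""
    import pycuda.driver as cuda

    cuda.init()
    dev = cuda.Device(device_index)
    attr = cuda.device_attribute
    return {
        "max_threads": dev.get_attribute(attr.MAX_THREADS_PER_BLOCK),
        "max_regs": dev.get_attribute(attr.MAX_REGISTERS_PER_BLOCK),
        "max_shared": dev.get_attribute(attr.MAX_SHARED_MEMORY_PER_BLOCK),
    }
```

For a kernel obtained from SourceModule.get_function, the inputs would come from func.num_regs and func.shared_size_bytes. With purely illustrative limits of 1024 threads, 65536 registers and 49152 bytes of shared memory per block, a kernel using 80 registers per thread would not fit at 1024 threads per block, and the estimate would suggest 800 threads instead.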
Jerome Kieffer
2014-04-02 06:46:28 UTC
Hi,
I ran into similar bugs with our K20 and was scratching my head for a
while, until people from Nvidia told me that the 319 driver had problems
with the GK110-based Tesla cards. Driver 331 has been running without
glitches for a while now.

Hope this helps.
--
Jérôme Kieffer
tel +33 476 882 445
Stanley Seibert
2014-04-03 23:10:01 UTC
As a general note: Once you sort out the resources issue, it is *very* important to retune your block and grid sizes after switching from compute capability 2.0 (Tesla C2075) to compute capability 3.x (Tesla K40c). When I first switched my code to the new architecture, I saw almost no improvement, and in some cases actual regressions, in performance. It wasn't until I re-benchmarked different grid configurations that I discovered the problem.

In fact, I now sometimes include an auto-tuning stage in my CUDA programs to dynamically select from a range of reasonable block sizes based on runtime benchmarks of my important kernels.
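
An auto-tuning stage of this kind could look roughly like the sketch below. It is an illustration under stated assumptions, not the code described above: pick_block_size and make_pycuda_timer are hypothetical names, the kernel is assumed to be one-dimensional, and failed launches (for example a LaunchError) are simply skipped. The timer uses pycuda.driver.Event, whose time_till method returns milliseconds.

```python
def pick_block_size(candidates, time_launch):
    """Return (best_block_size, timings) for the fastest working candidate.

    `time_launch(block_size)` is a caller-supplied callable that runs the
    kernel and returns the elapsed time; it may raise if the launch fails.
    """
    timings = {}
    for block_size in candidates:
        try:
            timings[block_size] = time_launch(block_size)
        except Exception:
            # e.g. a LaunchError for a block that needs too many resources
            continue
    if not timings:
        raise RuntimeError("no candidate block size could be launched")
    best = min(timings, key=timings.get)
    return best, timings


def make_pycuda_timer(func, args, n_items, repeats=5):
    """Build a time_launch callable for a 1-D PyCUDA kernel (needs a GPU)."""
    import pycuda.driver as cuda

    def time_launch(block_size):
        # Enough blocks to cover n_items work items.
        grid = ((n_items + block_size - 1) // block_size, 1)
        start, end = cuda.Event(), cuda.Event()
        start.record()
        for _ in range(repeats):
            func(*args, block=(block_size, 1, 1), grid=grid)
        end.record()
        end.synchronize()
        return start.time_till(end) / repeats  # milliseconds per launch

    return time_launch
```

A typical call would be pick_block_size([64, 128, 256, 512, 1024], make_pycuda_timer(func, args, n)), run once at start-up for each performance-critical kernel, with the chosen block size cached for later launches.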