Andreas Kloeckner
2016-08-29 15:17:04 UTC
Hi Andreas,
I am a former student of your CS 450 and now I am an incoming PhD student in
operations research at Northwestern.
Since I am interested in applying parallel computing, preferably using
Python, to my future research, I have been looking for software which
combines Python with CUDA. I found PyCUDA on your website, and I also found
NumbaPro. These two seem to be the most popular choices for people
with needs like mine.
So my question is: which one should I learn and use first? Could you
comment on the pros and cons of the two?
Cc'ing the PyCUDA list for archival/searchability.
- PyCUDA lets you/forces you to write CUDA C for your kernels.
- Numba lets you write (a narrow subset of) Python for your kernels,
including arrays I believe.
- The code you write for both will be roughly equivalent modulo
spelling, since you'll have to
- PyCUDA exposes (nearly) the entire CUDA runtime, including streams,
profiling, textures, ... Numba is more restricted.
- PyCUDA comes with an on-device array type. I'm not sure if Numba's
arrays stay on-device after the computation finishes--i.e. you may
have some implicit copying.
- PyCUDA comes with some pre-made parallel algorithms such as scans
and reductions.
- You may also want to take a look at:
  - https://documen.tician.de/pyopencl/
  - https://documen.tician.de/loopy/
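
To make the points above a bit more concrete, here is a rough sketch of the same small computation (y = a*x + y) done both ways. The kernel name `axpy` and the helper functions `axpy_reference`, `axpy_pycuda` and `axpy_numba` are made up for illustration, and the block size of 256 is an arbitrary choice. The Numba part reflects my understanding of the `numba.cuda` interface rather than a recommendation, so treat it as an assumption to check against the Numba docs.

```python
import numpy as np

# With PyCUDA, the kernel itself is written in CUDA C and kept as a string.
KERNEL_SOURCE = """
__global__ void axpy(float a, const float *x, float *y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}
"""


def axpy_reference(a, x, y):
    """NumPy reference result, used to check the GPU versions."""
    return a * x + y


def axpy_pycuda(a, x, y):
    """Run the CUDA C kernel through PyCUDA (needs a CUDA-capable GPU)."""
    import pycuda.autoinit  # noqa: F401  (sets up a context on import)
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    axpy = SourceModule(KERNEL_SOURCE).get_function("axpy")

    # GPUArray is PyCUDA's on-device array type: the data stays on the
    # GPU until .get() copies it back explicitly.
    x_gpu = gpuarray.to_gpu(x)
    y_gpu = gpuarray.to_gpu(y)

    n = x.size
    block = 256
    grid = (n + block - 1) // block
    axpy(np.float32(a), x_gpu.gpudata, y_gpu.gpudata, np.int32(n),
         block=(block, 1, 1), grid=(grid, 1))

    # gpuarray.sum is one of the pre-made reductions.
    total = float(gpuarray.sum(y_gpu).get())
    return y_gpu.get(), total


def axpy_numba(a, x, y):
    """Same computation with the kernel written in Python via Numba."""
    from numba import cuda

    @cuda.jit
    def axpy(a, x, y):
        i = cuda.grid(1)  # same global index as computed by hand in CUDA C
        if i < x.size:
            y[i] = a * x[i] + y[i]

    # Assumption: explicit to_device/copy_to_host calls are used here so
    # that no implicit host/device copies are involved.
    x_dev = cuda.to_device(x)
    y_dev = cuda.to_device(y)

    block = 256
    grid = (x.size + block - 1) // block
    axpy[grid, block](np.float32(a), x_dev, y_dev)
    return y_dev.copy_to_host()
```

Both GPU functions expect float32 NumPy arrays of equal length, and comparing their output against `axpy_reference` with `np.allclose` is an easy sanity check. The index computation and the bounds check are the same in both kernels, which is what I mean by "roughly equivalent modulo spelling".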
Hope that helps,
Andreas