Discussion:
[PyCUDA] PyCuda Example question
Keith Brown
2015-11-10 16:09:07 UTC
I am trying to learn from the example at
http://wiki.tiker.net/PyCuda/Examples/MatrixmulSimple and it is working
so far, but only for small matrices. When I increase the size of
the matrix, the CPU and GPU values diverge by as much as 5.9e+01.

I suspect it's due to the block and grid parameters I need to pass to
matrixmul(). Is that correct? How can I pick optimal values?

Or is there something else I should be considering?

My matrix size is 10000x3.
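
To make the block/grid question concrete, here is a minimal sketch of how launch dimensions are commonly derived from an output matrix shape. The helper name launch_config and the 16x16 tile size are illustrative assumptions and not part of the wiki example. It also assumes one thread per output element and a limit of 1024 threads per block, which should be checked against the actual device.

```python
def launch_config(rows, cols, tile=16):
    """Return (block, grid) covering a rows x cols output matrix.

    Hypothetical helper: one thread per output element, square
    tile x tile thread blocks, grid sized by ceiling division.
    """
    # Assumed per-block thread limit; verify on the actual device.
    if tile * tile > 1024:
        raise ValueError("tile * tile exceeds the assumed 1024-thread limit")
    block = (tile, tile, 1)
    # Ceiling division so partially filled tiles at the edges are still covered.
    grid = ((cols + tile - 1) // tile, (rows + tile - 1) // tile)
    return block, grid


# Treating 10000x3 as the output shape purely for illustration.
block, grid = launch_config(10000, 3)
print(block, grid)  # (16, 16, 1) (1, 625)
```

These values would then presumably be passed as the block= and grid= keyword arguments of matrixmul(). Because the grid usually overshoots the matrix edges, the kernel would also need a bounds check on the row and column indices, and would have to compute those indices from blockIdx as well as threadIdx.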
