Keith Brown
2015-11-10 16:09:07 UTC
I am trying to learn from the
http://wiki.tiker.net/PyCuda/Examples/MatrixmulSimple example, and it is
working so far, but only for small matrices. When I increase the size of
the matrix, the CPU and GPU results diverge by as much as 5.9e+01.
I suspect this is due to the block and grid parameters I need to pass to
matrixmul(). Is that correct? How can I pick the optimal values?
Or is there something else I should be considering?
My matrix size is 10000x3.
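
To make the block and grid question concrete, here is a minimal sketch of how launch dimensions are commonly derived from an output matrix shape. It is an illustration under stated assumptions, not the wiki example's code: the helper name launch_config, the 16x16 block shape, and the 1024 threads-per-block limit are all assumptions (the real limit depends on the GPU). It only computes the tuples and does not touch the GPU.

```python
# Hypothetical helper, not part of PyCUDA or the wiki example.
# Assumed limit: many CUDA GPUs allow at most 1024 threads per block;
# check the actual device attributes before relying on this number.
MAX_THREADS_PER_BLOCK = 1024


def launch_config(rows, cols, block_x=16, block_y=16):
    """Return (block, grid) tuples that cover a rows x cols output matrix."""
    if block_x * block_y > MAX_THREADS_PER_BLOCK:
        raise ValueError("block exceeds the assumed per-block thread limit")
    # Ceiling division: enough blocks to cover every column and row,
    # even when the size is not a multiple of the block dimension.
    grid_x = (cols + block_x - 1) // block_x
    grid_y = (rows + block_y - 1) // block_y
    return (block_x, block_y, 1), (grid_x, grid_y)


block, grid = launch_config(10000, 3)
print(block, grid)  # (16, 16, 1) (1, 625)
```

With more than one block, the kernel itself has to compute global indices as blockIdx * blockDim + threadIdx and skip threads that fall outside the matrix. A kernel that indexes only by threadIdx covers just one block's worth of elements, whatever grid is passed.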