Simon Perkins
2014-10-10 12:45:43 UTC
Hi there
Would it be possible to add an allocator keyword argument to
ReductionKernel.__call__ and gpuarray.sum etc.?
At the moment we have:

krnl = ReductionKernel(...)
result = krnl(a, stream=stream)
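With an allocator keyword, the call could instead look something like this (just a sketch of the proposed API, nothing that exists in PyCUDA today; the allocator= keyword is what is being requested, for gpuarray.sum as well):

pool = pycuda.tools.DeviceMemoryPool()
result = krnl(a, stream=stream, allocator=pool.allocate)
total = gpuarray.sum(a, allocator=pool.allocate)

That way the reduction's temporaries and its return value would come out of the pool regardless of how a itself was allocated.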
Now __call__() uses a.allocator to make device allocations, but unless a
has been allocated from a DeviceMemoryPool, a fresh device allocation and
deallocation occur for the returned value. Additionally, this serialises
asynchronous stream calls, since cudaMalloc/cudaFree synchronise with the
device. One possible workaround is:
import pycuda.tools

pool = pycuda.tools.DeviceMemoryPool()

tmp_alloc = a.allocator        # remember the original allocator
a.allocator = pool.allocate    # temporarily swap in the pool
result = krnl(a, stream=stream)
a.allocator = tmp_alloc        # restore the original allocator
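The swap can be made a little more robust with try/finally so the original allocator is restored even if the call throws (same idea, just a sketch):

pool = pycuda.tools.DeviceMemoryPool()
tmp_alloc = a.allocator
a.allocator = pool.allocate
try:
    result = krnl(a, stream=stream)
finally:
    a.allocator = tmp_alloc

(When one controls where a is created, allocating it from the pool in the first place, e.g. via the allocator argument to gpuarray.to_gpu if I'm reading that API right, avoids the swap entirely, but that isn't always possible.)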
thanks!
Simon