Post by Rasmus Diederichsen
Is it possible to use Reduction operations to reduce a 2-d array to a
1-d one, by e.g. computing the rowwise sum or some other operations?
So far I haven't been successful.

No, ReductionKernel is not meant for that. Its role is to do global
reductions when there is *no* other source of concurrency available. In
your situation, you can still parallelize over the non-summed axis,
which will lead to vastly more efficient code. As a downside, there
isn't really canned code to do that. But check out
It can help you write that kernel.
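
To make "parallelize over the non-summed axis" concrete, here is a minimal sketch of a hand-written row-sum kernel, not canned PyCUDA code. It assumes a non-empty, C-contiguous float32 array and assigns one thread per row. The names row_sum, ROW_SUM_KERNEL_SRC, row_sums_reference and row_sums_gpu are made up for illustration, and the block size of 128 is an arbitrary choice. The pure-Python function mirrors the kernel logic so the result can be checked without a GPU.

```python
# CUDA C source for a hypothetical kernel: one thread sums one row.
ROW_SUM_KERNEL_SRC = """
__global__ void row_sum(const float *a, float *out, int n_rows, int n_cols)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n_rows)
        return;  // guard threads past the last row

    float acc = 0.0f;
    for (int col = 0; col < n_cols; ++col)
        acc += a[row * n_cols + col];  // row-major (C-order) layout assumed
    out[row] = acc;
}
"""


def row_sums_reference(matrix):
    """Pure-Python mirror of the kernel: each outer iteration plays one thread."""
    out = [0.0] * len(matrix)
    for row in range(len(matrix)):
        acc = 0.0
        for value in matrix[row]:
            acc += value
        out[row] = acc
    return out


def row_sums_gpu(host_matrix, block_size=128):
    """Launch the kernel via PyCUDA (needs a CUDA device; assumes >= 1 row)."""
    # Imports are local so the reference function works without PyCUDA.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    a = np.ascontiguousarray(host_matrix, dtype=np.float32)
    n_rows, n_cols = a.shape
    a_gpu = gpuarray.to_gpu(a)
    out_gpu = gpuarray.empty(n_rows, dtype=np.float32)

    kernel = SourceModule(ROW_SUM_KERNEL_SRC).get_function("row_sum")
    n_blocks = (n_rows + block_size - 1) // block_size  # ceil division
    kernel(a_gpu, out_gpu, np.int32(n_rows), np.int32(n_cols),
           block=(block_size, 1, 1), grid=(n_blocks, 1))
    return out_gpu.get()
```

With a row-major array, neighboring threads in this sketch read addresses n_cols elements apart, so it is probably not the fastest layout. Storing the array transposed, or using one block per row with a small in-block reduction, may be worth trying if this turns out to be a bottleneck.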