C++ AMP, for loops to parallel_for_each loop
- by user1430335
I'm converting an algorithm to make use of the massive acceleration that C++ AMP provides. The stage I'm at is moving the for loops into a parallel_for_each loop.
Normally this should be a straightforward task, but it appears more complex than I first thought. It's a nested loop in which both counters advance in steps of 4 per iteration:
for (int j = 0; j < height; j += 4, data += width * 4 * 4)
{
    for (int i = 0; i < width; i += 4)
    {
        // ... loop body ...
    }
}
The trouble I'm having is with the index. I can't find a way to fit this stepping properly into the parallel_for_each loop. Using an index of rank 2 seems to be the way to go, but manipulating it via branching (for example, skipping every index that is not a multiple of 4) would eat into the performance gain.
I found a similar post: Controlling the index variables in C++ AMP. It also deals with index manipulation, but it doesn't cover the increment aspect of my issue.
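
One possible way to handle the stride-4 indexing described above without branching is to shrink the iteration space instead of filtering it: launch one thread per 4x4 block and multiply the index by 4 inside the kernel. The sketch below is plain C++ so that it compiles anywhere, and it only checks that the dense index space visits the same (j, i) origins as the original loops. The helper names `block_count`, `strided_origins` and `dense_origins` are hypothetical, and the C++ AMP form in the comment is an assumption about how the real kernel might look, not tested code.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: number of step-sized blocks needed to cover n elements.
inline int block_count(int n, int step = 4) { return (n + step - 1) / step; }

// Reference: the (j, i) origins visited by the original strided loops.
std::vector<std::pair<int, int>> strided_origins(int height, int width) {
    std::vector<std::pair<int, int>> out;
    for (int j = 0; j < height; j += 4)
        for (int i = 0; i < width; i += 4)
            out.emplace_back(j, i);
    return out;
}

// Dense iteration space: one entry per 4x4 block, scaled back up by 4.
// In C++ AMP the two loops below would (as an assumption) become:
//
//   concurrency::extent<2> ext(block_count(height), block_count(width));
//   concurrency::parallel_for_each(ext,
//       [=](concurrency::index<2> idx) restrict(amp) {
//           const int j = idx[0] * 4;   // row origin of this block
//           const int i = idx[1] * 4;   // column origin of this block
//           // The original "data += width * 4 * 4" per outer step
//           // corresponds to an offset of idx[0] * width * 4 * 4.
//       });
std::vector<std::pair<int, int>> dense_origins(int height, int width) {
    std::vector<std::pair<int, int>> out;
    const int rows = block_count(height);
    const int cols = block_count(width);
    for (int bj = 0; bj < rows; ++bj)
        for (int bi = 0; bi < cols; ++bi)
            out.emplace_back(bj * 4, bi * 4);
    return out;
}
```

Because every index in the reduced extent maps to exactly one block, no thread has to test whether its index is a multiple of 4. If width or height is not a multiple of 4, the last block in a row or column is partial and the loop body would still need its own bounds handling.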
With kind regards,
Forcecast