Hi,
I have a situation like:
#pragma omp parallel for private(i, j, k, val, p, l)
for (i = 0; i < num1; i++)
{
for (j = 0; j < num2; j++)
{
for (k = 0; k < num3; k++)
{
val = m[i + j*somenum + k*2]
if (val != 0)
for (l = start; l <= end; l++)
{
someFunctionThatWritesIntoGlobalArray((i + l), j, k, (someFunctionThatGetsValueFromAnotherArray((i + l), j, k) * val));
}
}
}
for (p = 0; p < num4; p++)
{
m[p] = 0;
}
}
Thanks for reading, phew! Well I am noticing a very minor difference in the results (0.999967[omp] against 1[serial]), when I use the above (which is 3 times faster) against the serial implementation. Now I know I am doing a mistake here...especially the connection between loops is evident.
Is it possible to parallelize this using omp sections? I tried some options like making shared(p) {doing this, I got correct values, as in the serial form}, but there was no speedup then.
Any general advice on handling openmp pragmas over a slew of for loops would also be great for me!