CUDA Kernel Not Updating Global Variable
- by Taher Khokhawala
I am facing the following problem in a CUDA kernel. There is an array "cu_fx" in global memory. Each thread has a unique identifier jj, a local loop variable ii, and a local float variable temp.
The following code does not work: it never changes cu_fx[jj]. At the end of the loop, cu_fx[jj] is still 0.
    ii = 0;
    cu_fx[jj] = 0;
    while (ii < l)
    {
        if (cu_y[ii] > 0)
            cu_fx[jj] += (cu_mu[ii] * cu_Kernel[(jj - start_row) * Kernel_w + ii]);
        else
            cu_fx[jj] -= (cu_mu[ii] * cu_Kernel[(jj - start_row) * Kernel_w + ii]);
        ii++;
    }
But when I rewrite it to accumulate into the temporary variable temp and write to cu_fx[jj] only once, after the loop, it works fine.
    ii = 0;
    temp = 0;
    while (ii < l)
    {
        if (cu_y[ii] > 0)
            temp += (cu_mu[ii] * cu_Kernel[(jj - start_row) * Kernel_w + ii]);
        else
            temp -= (cu_mu[ii] * cu_Kernel[(jj - start_row) * Kernel_w + ii]);
        ii++;
    }
    cu_fx[jj] = temp;
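For anyone who wants to reproduce this, below is a minimal sketch of how the first (non-working) loop might sit inside a complete program. Only the loop body is taken from the snippets above. The kernel name compute_fx, the way jj is derived from the thread index, the end_row bound, the array sizes, the input values, and the CUDA_CHECK helper are all assumptions made for illustration, not the actual code. The sketch also checks for launch and execution errors, since a kernel that fails to launch or aborts would leave the output untouched; whether that is what happens here is not established.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original code): abort on any CUDA error.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Assumed kernel wrapper; only the loop body comes from the question.
__global__ void compute_fx(float *cu_fx, const float *cu_y,
                           const float *cu_mu, const float *cu_Kernel,
                           int l, int Kernel_w, int start_row, int end_row)
{
    // Assumption: one thread per row, jj offset by start_row.
    int jj = start_row + blockIdx.x * blockDim.x + threadIdx.x;
    if (jj >= end_row) return;  // skip threads past the last row

    int ii = 0;
    cu_fx[jj] = 0;
    while (ii < l)
    {
        if (cu_y[ii] > 0)
            cu_fx[jj] += (cu_mu[ii] * cu_Kernel[(jj - start_row) * Kernel_w + ii]);
        else
            cu_fx[jj] -= (cu_mu[ii] * cu_Kernel[(jj - start_row) * Kernel_w + ii]);
        ii++;
    }
}

int main()
{
    // Made-up sizes and inputs, chosen only so the result is easy to check.
    const int l = 4, Kernel_w = 4, start_row = 0, end_row = 8;
    const int rows = end_row - start_row;
    std::vector<float> y = {1.0f, 1.0f, 1.0f, -1.0f};
    std::vector<float> mu(l, 0.5f);
    std::vector<float> K(rows * Kernel_w, 2.0f);
    std::vector<float> fx(end_row, -1.0f);  // sentinel: -1 means "never written"

    float *d_fx, *d_y, *d_mu, *d_K;
    CUDA_CHECK(cudaMalloc(&d_fx, fx.size() * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_y, y.size() * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_mu, mu.size() * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_K, K.size() * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_fx, fx.data(), fx.size() * sizeof(float), cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_y, y.data(), y.size() * sizeof(float), cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_mu, mu.data(), mu.size() * sizeof(float), cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_K, K.data(), K.size() * sizeof(float), cudaMemcpyHostToDevice));

    compute_fx<<<1, 32>>>(d_fx, d_y, d_mu, d_K, l, Kernel_w, start_row, end_row);
    CUDA_CHECK(cudaGetLastError());       // reports launch failures
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(fx.data(), d_fx, fx.size() * sizeof(float), cudaMemcpyDeviceToHost));
    for (int jj = start_row; jj < end_row; ++jj)
        printf("cu_fx[%d] = %f\n", jj, fx[jj]);

    CUDA_CHECK(cudaFree(d_fx));
    CUDA_CHECK(cudaFree(d_y));
    CUDA_CHECK(cudaFree(d_mu));
    CUDA_CHECK(cudaFree(d_K));
    return 0;
}
```

With these made-up inputs, each product cu_mu[ii] * cu_Kernel[...] is 1, added three times and subtracted once, so a correct run gives 2 for every cu_fx[jj]. A value of -1 would mean the kernel never wrote to that element, and a value of 0 would match the behavior described above.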
Can somebody please help with this problem? Thanks in advance.