Trouble calculating offset index into 3D array

Posted by Derek on Stack Overflow
Published on 2011-01-29T23:24:15Z Indexed on 2011/01/29 23:25 UTC

Hello,

I am writing a CUDA kernel to create a 3x3 covariance matrix for each location in the rows*cols main matrix. The resulting 3D array is rows*cols*9 elements in size, which I allocated with a single malloc. I need to access it through a single flat index value.

The 9 values of each 3x3 covariance matrix are set from the appropriate row r and column c of some other 2D arrays.

In other words, I need to calculate three things: the index of each of the 9 elements within a 3x3 covariance matrix, the row and column offset into the 2D matrices that supply the input values, and the corresponding index into the storage array.
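The layout described above can be sketched on the host side as follows. This is a minimal C++ illustration rather than the code from the post: it assumes the 9 covariance values for each (r, c) are stored contiguously and that pixels are laid out in row-major order, which the post does not state explicitly. The names `kCovSize`, `inputOffset`, and `covIndex` are hypothetical.

```cpp
#include <cstddef>

// Assumption: each pixel owns 9 contiguous slots (one 3x3 matrix).
constexpr std::size_t kCovSize = 9;

// Offset of pixel (r, c) in a row-major rows x cols 2D input array.
constexpr std::size_t inputOffset(std::size_t r, std::size_t c,
                                  std::size_t cols) {
    return r * cols + c;
}

// Flat index of element k (0..8) of the 3x3 block belonging to pixel (r, c).
constexpr std::size_t covIndex(std::size_t r, std::size_t c, std::size_t k,
                               std::size_t cols) {
    return inputOffset(r, c, cols) * kCovSize + k;
}
```

With rows = 4 and cols = 5, `covIndex(3, 4, 8, 5)` is 179, the final slot of a 4*5*9 = 180-element allocation, and neighbouring pixels in a row are exactly 9 slots apart.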

I have tried to simplify it down to the following:

   // I am calling this kernel with 1D blocks that are 512 columns x 1 row. TILE_WIDTH = 512
   int bx = blockIdx.x;
   int by = blockIdx.y;
   int tx = threadIdx.x;
   int ty = threadIdx.y;
   int r = by + ty; 
   int c = bx*TILE_WIDTH + tx;
   int offset = r*cols+c; 
   int ndx = r*cols*rows + c*cols;


   if((r < rows) && (c < cols)){ // this guard is meant to skip threads that fall outside my original array when a thread block overshoots it; not sure if it is correct

      d_cov[ndx + 0] = otherArray[offset];
      d_cov[ndx + 1] = otherArray[offset];
      d_cov[ndx + 2] = otherArray[offset];
      d_cov[ndx + 3] = otherArray[offset];
      d_cov[ndx + 4] = otherArray[offset];
      d_cov[ndx + 5] = otherArray[offset];
      d_cov[ndx + 6] = otherArray[offset];
      d_cov[ndx + 7] = otherArray[offset];
      d_cov[ndx + 8] = otherArray[offset];
   }
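The bounds guard can be checked on the host with a small emulation of the launch geometry. This is a sketch under assumptions the post does not confirm: a grid of ceil(cols / TILE_WIDTH) blocks in x and one block per row in y, with 512 x 1 threads per block so that threadIdx.y is always 0. The function names are hypothetical, and the loops inside a constexpr function need C++14 or later.

```cpp
// Assumed tile width, matching TILE_WIDTH in the kernel above.
constexpr int TILE_WIDTH = 512;

// Blocks needed along x to cover `cols` columns (ceiling division).
constexpr int blocksForCols(int cols) {
    return (cols + TILE_WIDTH - 1) / TILE_WIDTH;
}

// Counts emulated threads that pass the same guard as the kernel.
constexpr long countActiveThreads(int rows, int cols) {
    long active = 0;
    for (int by = 0; by < rows; ++by) {  // assumed: gridDim.y == rows
        for (int bx = 0; bx < blocksForCols(cols); ++bx) {
            for (int tx = 0; tx < TILE_WIDTH; ++tx) {
                const int r = by;                    // by + ty with ty == 0
                const int c = bx * TILE_WIDTH + tx;  // same as in the kernel
                if ((r < rows) && (c < cols)) {
                    ++active;
                }
            }
        }
    }
    return active;
}
```

With rows = 3 and cols = 700, each row launches two blocks (1024 threads), of which 700 pass the guard, so exactly rows*cols = 2100 threads write. Under these assumptions the guard does what the comment intends: it masks off the padding threads in the last block of each row.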

When I check this array against the values calculated on the CPU, which loops over i (rows), j (cols), and k = 1..9, the results do not match up.

In other words, d_cov[i*rows*cols + j*cols + k] != correctAnswer[i][j][k]
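One way to test whether the mismatch is an indexing problem is to compare the base index used in the kernel with the offset that a C array declared as correctAnswer[rows][cols][9] uses for element [i][j][k], namely (i*cols + j)*9 + k. The sketch below is illustrative only: it assumes d_cov is meant to share that layout with zero-based k, and the function names are hypothetical.

```cpp
#include <cstddef>

// Base index as computed in the kernel above: r*cols*rows + c*cols.
constexpr std::size_t postedBase(std::size_t r, std::size_t c,
                                 std::size_t rows, std::size_t cols) {
    return r * cols * rows + c * cols;
}

// Base index for a row-major [rows][cols][9] layout (assumed target layout).
constexpr std::size_t packedBase(std::size_t r, std::size_t c,
                                 std::size_t cols) {
    return (r * cols + c) * 9;
}

// True if the 9-element ranges starting at bases a and b share any slot.
constexpr bool blocksOverlap(std::size_t a, std::size_t b) {
    return (a < b ? b - a : a - b) < 9;
}
```

For rows = 16 and cols = 8, the posted expression puts pixels (0, 0) and (0, 1) at bases 0 and 8, so their 9-element blocks share slot 8, and the last pixel's block ends at index 1984, past the end of the 1152-element allocation. The packed form keeps every block disjoint and ends exactly at index 1151.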

Can anyone give me any tips on how to solve this problem? Is it an indexing problem, or some other logic error?
