Improve C function performance with cache locality?
- by Christoper Hans
I have to find a diagonal difference in a matrix represented as 2d array and the function prototype is
int diagonal_diff(int x[512][512])
I have to use a 2d array, and the data is 512x512. This is tested on a SPARC machine: my current timing is 6ms but I need to be under 2ms.
Sample data:
[3][4][5][9]
[2][8][9][4]
[6][9][7][3]
[5][8][8][2]
The difference is:
|4-2| + |5-6| + |9-5| + |9-9| + |4-8| + |3-8| = 2 + 1 + 4 + 0 + 4 + 5 = 16
In order to do that, I use the following algorithm:
int i,j,result=0;
for(i=0; i<4; i++)
for(j=0; j<4; j++)
result+=abs(array[i][j]-[j][i]);
return result;
But this algorithm keeps accessing the column, row, column, row, etc which make inefficient use of cache.
Is there a way to improve my function?