Search Results

Search found 1 results on 1 pages for 'user878944'.

Page 1/1 | 1 

  • Parallelize code using CUDA [migrated]

    - by user878944
    If I have a code which takes struct variable as input and manipulate it's elements, how can I parallelize this using CUDA? void BackpropagateLayer(NET* Net, LAYER* Upper, LAYER* Lower) { INT i,j; REAL Out, Err; for (i=1; i<=Lower->Units; i++) { Out = Lower->Output[i]; Err = 0; for (j=1; j<=Upper->Units; j++) { Err += Upper->Weight[j][i] * Upper->Error[j]; } Lower->Error[i] = Net->Gain * Out * (1-Out) * Err; } } Where NET and LAYER are structs defined as: typedef struct { /* A LAYER OF A NET: */ INT Units; /* - number of units in this layer */ REAL* Output; /* - output of ith unit */ REAL* Error; /* - error term of ith unit */ REAL** Weight; /* - connection weights to ith unit */ REAL** WeightSave; /* - saved weights for stopped training */ REAL** dWeight; /* - last weight deltas for momentum */ } LAYER; typedef struct { /* A NET: */ LAYER** Layer; /* - layers of this net */ LAYER* InputLayer; /* - input layer */ LAYER* OutputLayer; /* - output layer */ REAL Alpha; /* - momentum factor */ REAL Eta; /* - learning rate */ REAL Gain; /* - gain of sigmoid function */ REAL Error; /* - total net error */ } NET; What I could think of is to first convert the 2d Weight into 1d. And then send it to kernel to take the product or just use the CUBLAS library. Any suggestions?

    Read the article

1