Counting FLOPS/GFLOPS in program - CUDA

Posted by msx on Stack Overflow
Published on 2010-05-08T19:30:07Z Indexed on 2010/05/08 19:38 UTC

I have finished my application, which multiplies a CRS matrix by a vector (SpMV), and the only thing left to do is count the FLOPs it performs. In my opinion it is really hard to estimate the number of floating-point operations for sparse matrix-vector multiplication, because the number of multiplications per row is really "jumpy": it varies a lot from one row to the next.

So far I have only measured the execution time using "cudaprof" (available in the ./CUDA/bin directory), and that works fine.
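For reference, here is a minimal host-side sketch of one common counting convention, offered as an assumption rather than a definitive method: each stored nonzero contributes one multiply and one add, so y = A*x costs about 2 * nnz FLOPs no matter how unevenly the nonzeros are spread across rows. The function names `crs_nnz`, `spmv_flops`, and `spmv_gflops` and the `row_ptr` array name are hypothetical. The sketch also assumes the usual CRS layout with `n_rows + 1` row pointers, and that `seconds` is the kernel time measured separately (for example, the value reported by the profiler).

```c
#include <assert.h>
#include <stddef.h>

/* Number of stored nonzeros in a CRS matrix.
   Assumes row_ptr has n_rows + 1 entries and that row i owns the
   entries row_ptr[i] .. row_ptr[i + 1] - 1. */
size_t crs_nnz(const int *row_ptr, size_t n_rows)
{
    return (size_t)(row_ptr[n_rows] - row_ptr[0]);
}

/* Estimated FLOPs for y = A * x under the assumed convention:
   one multiply and one add per stored nonzero. */
double spmv_flops(const int *row_ptr, size_t n_rows)
{
    return 2.0 * (double)crs_nnz(row_ptr, n_rows);
}

/* GFLOPS achieved, given the measured kernel time in seconds. */
double spmv_gflops(const int *row_ptr, size_t n_rows, double seconds)
{
    assert(seconds > 0.0);
    return spmv_flops(row_ptr, n_rows) / seconds / 1e9;
}
```

For a 3-row matrix with row pointers {0, 2, 3, 7} there are 7 nonzeros, so this convention gives 14 FLOPs per multiplication. Some people instead count 2 * nnz - n_rows, leaving out the first addition in each row. Either is fine as long as the convention is stated next to the reported number.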

Any suggestions and code snippets are appreciated!

© Stack Overflow or respective owner
