Counting FLOPS/GFLOPS in program - CUDA
Posted
by msx
on Stack Overflow
Published on 2010-05-08T19:30:07Z
I have already finished my application, which multiplies a CRS matrix by a vector (SpMV), and the only thing left to do is to count the FLOPs it performed. In my opinion it is really hard to estimate the number of floating-point operations for sparse matrix-vector multiplication, because the number of multiplications per row varies a lot from one row to the next.
So far I have only measured the execution time using "cudaprof" (available in the ./CUDA/bin directory), and that works fine.
Any suggestions and code snippets are appreciated!
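For reference, one common convention (an assumption here, not something a profiler reports) is to count one multiply and one add per stored nonzero, so the per-row variation does not matter and the total is simply 2 * nnz. In CRS, nnz can be read from the row-pointer array. The sketch below is host-side C++ with hypothetical names (spmv_flops, gflops, row_ptr, elapsed_ms); the elapsed time is assumed to be the kernel time in milliseconds taken from whatever timer is in use.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: FLOP count for y = A * x with A stored in CRS.
// Assumption: one multiply and one add per stored nonzero (2 * nnz).
// row_ptr has (rows + 1) entries; its last entry minus its first is nnz.
long long spmv_flops(const std::vector<int>& row_ptr) {
    if (row_ptr.empty()) {
        return 0;
    }
    long long nnz = static_cast<long long>(row_ptr.back()) - row_ptr.front();
    return 2 * nnz;
}

// Hypothetical helper: convert a FLOP count and a measured kernel time
// (assumed to be in milliseconds) into GFLOPS.
// flops / (ms * 1e-3 s) / 1e9  ==  flops / (ms * 1e6)
double gflops(long long flops, double elapsed_ms) {
    assert(elapsed_ms > 0.0);
    return static_cast<double>(flops) / (elapsed_ms * 1.0e6);
}
```

With a row-pointer array of {0, 2, 3, 7} (3 rows, 7 nonzeros) this gives 14 floating-point operations. Some people instead count 2 * nnz minus the number of non-empty rows, since the first product in a row needs no addition; either way, the convention should be stated alongside the reported GFLOPS figure.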
© Stack Overflow or respective owner