Read vector into CUDA shared memory
Posted by Ben on Stack Overflow
Published on 2010-05-31T07:21:33Z
Tags: cuda
I am new to CUDA and programming GPUs. I need each thread in my block to use a vector of length ndim, so I thought I might do something like this:
extern __shared__ float* smem[];
...
if (threadIdx.x == 0) {
    for (int d = 0; d < ndim; ++d) {
        smem[d] = vector[d];
    }
}
__syncthreads();
...
This works fine. However, it seems wasteful that a single thread should do all the loading, so I changed the code to:
if (threadIdx.x < ndim) {
    smem[threadIdx.x] = vector[threadIdx.x];
}
__syncthreads();
which does not work. Why? It gives different results from the first version, even when ndim << blockDim.x.
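For context, here is a minimal, self-contained sketch of the kind of kernel described above, with the cooperative load written as a strided loop (so it also covers ndim > blockDim.x) and with the dynamic shared-memory size passed as the third launch parameter. The kernel name, launch configuration, and data are illustrative, not from the question; note also that it declares the buffer as `float smem[]` (an array of floats) rather than `float* smem[]` (an array of pointers) as in the snippets above.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical kernel: each block cooperatively copies `vector`
// (length ndim, in global memory) into dynamic shared memory.
__global__ void useVector(const float* vector, int ndim) {
    extern __shared__ float smem[];  // float, not float*

    // Strided cooperative load: works for any ndim vs. blockDim.x.
    for (int d = threadIdx.x; d < ndim; d += blockDim.x) {
        smem[d] = vector[d];
    }
    __syncthreads();  // all of smem[0..ndim-1] is now visible to the block

    // ... every thread may now read smem[0..ndim-1] ...
}

int main(void) {
    const int ndim = 8;
    float h_vec[ndim] = {0, 1, 2, 3, 4, 5, 6, 7};

    float* d_vec;
    cudaMalloc(&d_vec, ndim * sizeof(float));
    cudaMemcpy(d_vec, h_vec, ndim * sizeof(float), cudaMemcpyHostToDevice);

    // Third launch parameter: dynamic shared-memory size in bytes.
    useVector<<<1, 128, ndim * sizeof(float)>>>(d_vec, ndim);
    cudaDeviceSynchronize();

    cudaFree(d_vec);
    return 0;
}
```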