Give a session on C++ AMP – here is how

Posted by Daniel Moth
Published on Thu, 22 Sep 2011 01:53:49 GMT

Filed under: GPGPU | ParallelComputing

Ever since presenting on C++ AMP at the AMD Fusion conference in June, the Gamefest conference in August, and the BUILD conference in September, I've had numerous requests for my material from folks who want to re-deliver the same session. The C++ AMP session I put together evolved over those three presentations into the final form I used at BUILD, so that is the one I recommend you base yours on.

Please get the slides and the recording from Channel 9 (I'll refer to slide numbers and recording timestamps below).

This is how I've been presenting the C++ AMP session:

Context

(slide 3, 04:18-08:18) Start with a demo, on my dual-GPU machine. I've been using the N-Body sample (for VS 11 Developer Preview).
(slide 4) Use an NVIDIA slide with additional examples of the performance improvements that customers get from heterogeneous computing.
(slide 5) Talk a bit about today's differences between CPU and GPU hardware, leading to the point that the two will continue to co-exist and that GPUs are great for data-parallel algorithms but not much else today. The CPU is a jack of all trades; the GPU is a number cruncher.
(slide 6) Use the APU example from AMD as one indication that the hardware space is still in motion, emphasizing that C++ AMP is a data-parallel API, not a GPU API. It has a future-proof design for hardware we have yet to see.
(slide 7) Provide more metadata about C++ AMP, as I blogged about when I first introduced it.

Code

(slides 9-11) Introduce C++ AMP coding with a simple array-addition algorithm – the slides speak for themselves.
(slides 12-13) index<N>, extent<N>, and grid<N>.
(slides 14-16) array<T,N>, array_view<T,N>, and a comparison between them.
(slide 17) parallel_for_each.
(slides 18, 21) restrict.
(slides 19-20) The actual restrictions of restrict(direct3d) – the slides speak for themselves.
(slide 22) Bring it all together with a matrix multiplication example.
(slides 23-24) accelerator and accelerator_view.
(slides 26-29) Introduce tiling, including tiled matrix multiplication [tiling probably deserves a whole session instead of 6 minutes!].
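
If you want something to run while rehearsing the Code section, the array-addition, matrix-multiplication, and tiling steps above can be sketched in standard C++ on the CPU. This is not C++ AMP code and it is not the code from the slides: the function names (add_arrays, multiply_naive, multiply_tiled) and the row-major std::vector layout are assumptions made purely for illustration. In C++ AMP, the loop bodies would roughly correspond to the lambda you pass to parallel_for_each.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise sum of two equally sized arrays.
// Every iteration is independent of the others, which is the property a
// data-parallel API exploits.
std::vector<int> add_arrays(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> sum(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum[i] = a[i] + b[i];
    }
    return sum;
}

// Hypothetical helper: product of two n x n row-major matrices, computing
// one output element per (row, col) pair.
std::vector<float> multiply_naive(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t row = 0; row < n; ++row) {
        for (std::size_t col = 0; col < n; ++col) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                acc += a[row * n + k] * b[k * n + col];
            }
            c[row * n + col] = acc;
        }
    }
    return c;
}

// Hypothetical helper: the same product computed tile by tile. Each
// (row0, col0) output tile is built up from tile-sized slices of a and b,
// so the working set at any moment is small. The last tile in each
// dimension is clipped when n is not a multiple of the tile size.
std::vector<float> multiply_tiled(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t n, std::size_t tile) {
    assert(tile > 0 && a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t row0 = 0; row0 < n; row0 += tile) {
        for (std::size_t col0 = 0; col0 < n; col0 += tile) {
            for (std::size_t k0 = 0; k0 < n; k0 += tile) {
                const std::size_t row_end = std::min(row0 + tile, n);
                const std::size_t col_end = std::min(col0 + tile, n);
                const std::size_t k_end = std::min(k0 + tile, n);
                for (std::size_t row = row0; row < row_end; ++row) {
                    for (std::size_t col = col0; col < col_end; ++col) {
                        // Continue the partial sum left by earlier k-tiles.
                        float acc = c[row * n + col];
                        for (std::size_t k = k0; k < k_end; ++k) {
                            acc += a[row * n + k] * b[k * n + col];
                        }
                        c[row * n + col] = acc;
                    }
                }
            }
        }
    }
    return c;
}
```

Calling multiply_tiled(a, b, n, tile) should give the same matrix as multiply_naive(a, b, n); the tile size only changes the order in which memory is visited. That is a handy talking point when you reach the tiling slides, because it shows that tiling restructures the loop nest rather than changing the result.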

IDE

(slides 34, 37) Briefly touch on the Concurrency Visualizer. It supports GPU profiling, but we hope the enhancements specific to C++ AMP will arrive in the Beta timeframe, which is when I'll spend more time talking about it.
(slides 35-36, 51:54-59:16) Demonstrate the GPU debugging experience in VS 11.

Summary

(slide 39) Reiterate some of the points from slide 7, and add that the C++ AMP spec will be open for other compiler vendors to implement, even on other platforms (in fact, Microsoft is actively working on that).
(slide 40) Links to content – see slide – including where all your questions should go: http://social.msdn.microsoft.com/Forums/en/parallelcppnative/threads.

"But I don't have time for a full-blown session, I only need 2 (or just 1, or 3) C++ AMP slides to use in my session on related topic X"

If all you want is a small number of slides, you can take some from the session above and customize them. But because I am so nice, I have created some slides for you, including talking points in the notes section. Download them here.

