NVIDIA CUDA Tutorial 10: Blocking with Shared Memory

In this tute we'll use a technique called blocking to finally fulfill Porky Water's tall order!

Blocking is a technique where blocks of data are copied from global memory into shared memory, and the threads then work on the data there. This greatly reduces the amount of traffic on the global memory bus, because threads use the much faster shared memory for most of the calculations.
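
The tutorial's own kernel is not reproduced in this description, so what follows is only a minimal sketch of the blocking pattern, assuming the brute-force problem is an all-pairs nearest-neighbour search over 3D points. The names `nearestNeighborTiled`, `countMismatches` and `TILE`, the tile size of 256, and the test data are all hypothetical choices made for illustration.

```cuda
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

#define TILE 256  // assumed: threads per block == points per shared-memory tile

// For each point i, find the index of the nearest other point (brute force, O(n^2)).
// Must be launched with blockDim.x == TILE.
__global__ void nearestNeighborTiled(const float3 *points, int *nearest, int n)
{
    __shared__ float3 tile[TILE];

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Out-of-range threads still help load tiles and must reach every barrier.
    float3 me = (i < n) ? points[i] : make_float3(0.0f, 0.0f, 0.0f);
    float best = FLT_MAX;
    int bestIdx = -1;

    for (int base = 0; base < n; base += TILE) {
        // Each thread copies one point of the current block from global memory.
        int j = base + threadIdx.x;
        if (j < n) tile[threadIdx.x] = points[j];
        __syncthreads();  // the tile is fully loaded

        // All comparisons for this tile now read shared memory only.
        int count = min(TILE, n - base);
        for (int k = 0; k < count; ++k) {
            float dx = me.x - tile[k].x;
            float dy = me.y - tile[k].y;
            float dz = me.z - tile[k].z;
            float d = dx * dx + dy * dy + dz * dz;
            if (base + k != i && d < best) { best = d; bestIdx = base + k; }
        }
        __syncthreads();  // everyone is done before the tile is overwritten
    }
    if (i < n) nearest[i] = bestIdx;
}

// Hypothetical host helper: runs the kernel on n made-up points and returns how
// many results differ from a plain CPU search (0 means they agree, -1 means a
// CUDA call failed).
int countMismatches(int n)
{
    // Integer-valued coordinates keep the float arithmetic exact on CPU and GPU.
    std::vector<float3> h(n);
    for (int i = 0; i < n; ++i)
        h[i] = make_float3((float)(i * 37 % 101), (float)(i * 53 % 103),
                           (float)(i * 71 % 107));

    float3 *dPts = nullptr;
    int *dNear = nullptr;
    std::vector<int> got(n, -2);
    bool ok = cudaMalloc(&dPts, n * sizeof(float3)) == cudaSuccess
           && cudaMalloc(&dNear, n * sizeof(int)) == cudaSuccess
           && cudaMemcpy(dPts, h.data(), n * sizeof(float3),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        nearestNeighborTiled<<<(n + TILE - 1) / TILE, TILE>>>(dPts, dNear, n);
        ok = cudaMemcpy(got.data(), dNear, n * sizeof(int),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dPts);
    cudaFree(dNear);
    if (!ok) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        float best = FLT_MAX;
        int bestIdx = -1;
        for (int j = 0; j < n; ++j) {
            float dx = h[i].x - h[j].x, dy = h[i].y - h[j].y, dz = h[i].z - h[j].z;
            float d = dx * dx + dy * dy + dz * dz;
            if (j != i && d < best) { best = d; bestIdx = j; }
        }
        if (bestIdx != got[i]) ++mismatches;
    }
    return mismatches;
}
```

Calling `countMismatches(1000)` from a `main` function exercises the partial last tile, since 1000 is not a multiple of 256. In this sketch each point is read from global memory once per thread block rather than once per thread, which is where the reduction in global memory traffic comes from; the two `__syncthreads()` calls keep any thread from reading a tile before it is loaded or overwriting it while others are still using it.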

Blocking with shared memory gives us a great speedup here and easily fulfills Porky's boss's request for a 10x speedup. There are some small changes that could make the code run a little quicker, but if it had to run much faster, a complete change of algorithm would be far more useful than tweaking this brute-force one.
