4.5x Faster CUDA C with just Two Variable Changes || Episode 3: Memory Coalescing

2,615 plays
Memory coalescing for efficient global memory transfers in CUDA C.

Video Notes: https://0mean1sigma.com/chapter-4-memory-coalescing-and-tiled-matrix-multiplication/
Code Repository: https://github.com/tgautam03/CUDA-C/tree/master/04_sq_mat_mul
Animations: https://github.com/tgautam03/0Mean1Sigma/tree/master/CUDA_03

00:00 - Introduction
00:52 - Global Memory in GPUs
02:00 - Coalesced Memory Access
03:07 - Uncoalesced Memory Access
04:03 - FLOP Analysis
05:43 - Conclusion
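To make the coalesced vs. uncoalesced distinction concrete, here is a minimal host-side sketch in plain C. It is not taken from the video or its repository. It assumes a row-major n x n float matrix, a warp of 32 threads with consecutive threadIdx.x values, and 128-byte memory segments. The function name `segments_touched` and the flag `row_from_x` are hypothetical, introduced only for illustration. It counts how many distinct segments one warp touches under the two possible index mappings; fewer segments means the hardware can serve the warp with fewer memory transactions.

```c
#define WARP_SIZE 32
#define SEGMENT_BYTES 128 /* assumed size of one global memory segment */

/* Count the distinct segments touched when each of the 32 lanes in a warp
 * reads one float from a row-major n x n matrix.
 * row_from_x == 0: col = threadIdx.x, row = threadIdx.y (consecutive addresses)
 * row_from_x != 0: row = threadIdx.x, col = threadIdx.y (stride of n floats)
 * Hypothetical helper for illustration only. */
int segments_touched(int n, int row_from_x)
{
    long seen[WARP_SIZE];
    int count = 0;
    int ty = 0; /* every lane of this warp shares the same threadIdx.y */

    for (int tx = 0; tx < WARP_SIZE; tx++) {
        int row = row_from_x ? tx : ty;
        int col = row_from_x ? ty : tx;

        /* Byte offset of element (row, col), then the segment it falls in. */
        long byte_offset = ((long)row * n + col) * (long)sizeof(float);
        long segment = byte_offset / SEGMENT_BYTES;

        /* Record the segment if this warp has not touched it yet. */
        int found = 0;
        for (int i = 0; i < count; i++) {
            if (seen[i] == segment) {
                found = 1;
                break;
            }
        }
        if (!found) {
            seen[count++] = segment;
        }
    }
    return count;
}
```

Under these assumptions, `segments_touched(1024, 0)` returns 1 and `segments_touched(1024, 1)` returns 32: swapping which thread index drives the row and which drives the column is the whole difference between one segment per warp and one segment per thread. Whether this is exactly the two-variable change made in the video is an assumption; the linked code repository is the authoritative source.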