Memory coalescing for efficient global memory transfers in CUDA C.
Video Notes: https://0mean1sigma.com/chapter-4-memory-coalescing-and-tiled-matrix-multiplication/
Code Repository: https://github.com/tgautam03/CUDA-C/tree/master/04_sq_mat_mul
Animations: https://github.com/tgautam03/0Mean1Sigma/tree/master/CUDA_03
00:00 - Introduction
00:52 - Global Memory in GPUs
02:00 - Coalesced Memory Access
03:07 - Uncoalesced Memory Access
04:03 - FLOP Analysis
05:43 - Conclusion