Writing CUDA kernels in Python with Numba

On February 15th (21:00 MSK, UTC+3), we talked about writing CUDA kernels in Python with Numba.

Abstract: Numba is a just-in-time compiler that enables users to write their own CUDA kernels in Python. Although it is possible to mix CUDA C/C++ code with Python through various means (e.g. PyCUDA), Python programmers' productivity typically falls drastically when they need to write kernels in another language. This talk provides an introduction to Numba and some of the applications it is used in. In many cases, Numba is just one tool in the CUDA-accelerated Python toolbox, used in pipelines alongside other libraries including CuPy (an array library), cuSignal (for signal processing), RAPIDS (for data science / AI / ML), PyTorch, TensorFlow, JAX, etc. This enables users to quickly build applications using standard functionality from domain-specific libraries, while retaining the flexibility to implement custom GPU kernels for functionality that is novel to their application. After watching this talk, attendees should be able to use Numba in their own Python projects to implement custom kernels alongside other Python CUDA libraries, or standalone to implement algorithms from scratch.

Speaker: Graham Markall is a Senior Software Engineer in the RAPIDS team at NVIDIA, where he maintains Numba's CUDA target and supports its use in the RAPIDS libraries. His interests lie at the intersection of compilers, high-performance computing, and numerical methods.

Twitter: https://twitter.com/gmarkall
LinkedIn: https://www.linkedin.com/in/graham-markall-0087a215/
GitHub: https://github.com/gmarkall/
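
To give a flavour of what the abstract describes, here is a minimal sketch of a Numba CUDA kernel written entirely in Python. This is an illustrative example, not code from the talk: the kernel name, array sizes, and launch configuration are all arbitrary choices.

```python
from numba import cuda
import numpy as np

@cuda.jit
def add_kernel(x, y, out):
    # Absolute position of this thread in a 1D launch grid.
    i = cuda.grid(1)
    # Guard: the grid may contain more threads than elements.
    if i < x.size:
        out[i] = x[i] + y[i]

n = 100_000
x = np.arange(n, dtype=np.float32)
y = 2 * x
out = np.zeros_like(x)

# Choose a launch configuration with enough blocks to cover n elements.
threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block

# NumPy arrays passed directly are copied to the device and back
# automatically (convenient for a demo, though explicit transfers
# are preferable for performance).
add_kernel[blocks, threads_per_block](x, y, out)
```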
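And a sketch of the "one tool in the toolbox" interoperability the abstract mentions: because CuPy arrays implement the CUDA Array Interface, a Numba kernel can operate on them directly with no host round-trip. Again an illustrative example (assuming CuPy is installed), not code from the talk.

```python
import cupy as cp
from numba import cuda

@cuda.jit
def scale_kernel(arr, factor):
    i = cuda.grid(1)
    if i < arr.size:
        arr[i] *= factor

a = cp.arange(1024, dtype=cp.float32)  # allocated on the GPU by CuPy
threads = 128
blocks = (a.size + threads - 1) // threads

# The CuPy array is passed straight to the Numba kernel; the data
# stays on the device throughout.
scale_kernel[blocks, threads](a, 2.0)
print(a[:5])  # still a CuPy array on the device
```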