Fundamentals of Accelerated Computing with CUDA Python

September 05, 9:00 am - 12:15 pm (CEST)

Speaker: Karen Bradshaw

Tutorial website:

Agenda: Fundamentals of Accelerated Computing with CUDA Python

This tutorial teaches the fundamental tools and techniques for running GPU-accelerated Python applications using CUDA® and the Numba compiler. Participants work through dozens of hands-on coding exercises and, at the end of the training, implement a new workflow to accelerate an example program originally designed for CPUs and observe the resulting performance gains.

On completion of this tutorial, participants will understand the fundamental tools and techniques for GPU-accelerated Python applications with CUDA and Numba, and will be able to:

  • GPU-accelerate NumPy ufuncs with a few lines of code
  • Configure code parallelization using the CUDA thread hierarchy
  • Write custom CUDA device kernels for maximum performance and flexibility
  • Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth
  • Generate random numbers on the GPU
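
As a rough illustration of the thread-hierarchy item above, the sketch below emulates a one-dimensional kernel launch on the CPU in plain Python. It is not Numba code and is not taken from the tutorial materials; the names launch_config and emulate_add_one and the default block size of 128 are hypothetical choices made for this example. It shows how a launch configuration is typically sized and how each thread derives its global index from its block index, block size, and thread index.

```python
def launch_config(n_elements, threads_per_block=128):
    """Return (blocks_per_grid, threads_per_block) covering n_elements.

    Integer ceiling division rounds the block count up, so the grid
    always holds at least n_elements threads.
    """
    blocks_per_grid = (n_elements + threads_per_block - 1) // threads_per_block
    return blocks_per_grid, threads_per_block


def emulate_add_one(data, threads_per_block=128):
    """CPU emulation of a 1D launch in which each thread adds 1 to one element.

    On a GPU the iterations of the two loops would run as parallel threads;
    here they run sequentially, purely to show the index arithmetic.
    """
    n = len(data)
    blocks_per_grid, block_dim = launch_config(n, threads_per_block)
    out = list(data)
    for block_idx in range(blocks_per_grid):      # plays the role of blockIdx.x
        for thread_idx in range(block_dim):       # plays the role of threadIdx.x
            i = block_idx * block_dim + thread_idx  # global thread index
            if i < n:  # guard: the last block may have surplus threads
                out[i] = data[i] + 1
    return out


blocks, threads = launch_config(1000, 128)
result = emulate_add_one(list(range(1000)))
```

In a Numba CUDA kernel, the same global index is what cuda.grid(1) returns, and the bounds guard is still needed whenever the data size is not a multiple of the block size.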

Karen Bradshaw is an Associate Professor in the Department of Computer Science, Rhodes University, South Africa, and an NVIDIA Deep Learning Institute (DLI) Certified Instructor in CUDA C/C++ and CUDA Python. PhD in Computer Science from Cambridge University, UK.
Research interests: Distributed and parallel programming (including GPGPU), programming languages, simulation and modeling.
For more information, see the departmental website.