colesmcintosh/pycuda-numpy-vector-ops

Accelerating NumPy Vector Operations with PyCUDA

This notebook demonstrates how to accelerate large-scale NumPy operations using GPU programming in Python via PyCUDA.

We compare the CPU-based NumPy implementation of an elementwise multiply-add with a GPU-accelerated fused multiply-add (FMA) kernel.

The operation is defined as $c[i] = a[i] \times b[i] + d[i]$ for every index $i$ of the input vectors $a$, $b$, and $d$.
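As a point of reference, the CPU side of the comparison can be sketched in a few lines of NumPy. This is a minimal sketch and not necessarily the notebook's code: the function name `fma_cpu`, the vector length `n`, the `float32` dtype, and the wall-clock timing with `time.perf_counter` are all assumptions made for illustration.

```python
import time

import numpy as np


def fma_cpu(a, b, d):
    """NumPy reference: elementwise a * b + d (two separate roundings)."""
    return a * b + d


# Hypothetical problem size and dtype; the notebook may use different values.
n = 1_000_000
rng = np.random.default_rng(0)
a, b, d = (rng.random(n, dtype=np.float32) for _ in range(3))

# Wall-clock timing of the CPU computation.
t0 = time.perf_counter()
c_ref = fma_cpu(a, b, d)
cpu_ms = (time.perf_counter() - t0) * 1e3

print(f"NumPy: {cpu_ms:.3f} ms for {n} elements")
```

NumPy evaluates `a * b` into a temporary array and then adds `d`, so each element is rounded twice. A hardware FMA rounds once, which is why a GPU result is normally compared against this reference with a tolerance rather than for exact equality.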

The notebook uses:

  • Pinned (page-locked) memory for faster host-device transfers
  • CUDA streams for asynchronous execution
  • Event timing for accurate benchmarks
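The three techniques listed above could fit together roughly as follows. This is a hedged sketch under stated assumptions, not the notebook's actual implementation: the kernel name `fma_kernel`, the helpers `grid_size` and `fma_gpu`, the block size of 256, the `float32` dtype, and the use of the CUDA `fmaf` intrinsic are all illustrative choices. The PyCUDA imports sit inside `fma_gpu` so that the file can still be imported on a machine without a GPU.

```python
import numpy as np

# Assumed kernel: one thread per element, using the fmaf intrinsic.
KERNEL_SRC = r"""
__global__ void fma_kernel(const float *a, const float *b,
                           const float *d, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = fmaf(a[i], b[i], d[i]);  // a*b + d with a single rounding
    }
}
"""


def grid_size(n, block=256):
    """Number of blocks needed to cover n elements (ceiling division)."""
    return (n + block - 1) // block


def fma_gpu(a, b, d, block=256):
    """Return (c, elapsed_ms) for c = a * b + d computed on the GPU."""
    import pycuda.autoinit  # noqa: F401  (creates a context on the default device)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    n = a.size
    kernel = SourceModule(KERNEL_SRC).get_function("fma_kernel")

    # Pinned (page-locked) host buffers, required for truly async copies.
    host_in = []
    for x in (a, b, d):
        h = cuda.pagelocked_empty(n, np.float32)
        h[:] = x
        host_in.append(h)
    h_c = cuda.pagelocked_empty(n, np.float32)

    dev_in = [cuda.mem_alloc(h.nbytes) for h in host_in]
    d_c = cuda.mem_alloc(h_c.nbytes)

    # Copies and kernel launch are queued on one stream and bracketed by events.
    stream = cuda.Stream()
    start, stop = cuda.Event(), cuda.Event()

    start.record(stream)
    for dev, h in zip(dev_in, host_in):
        cuda.memcpy_htod_async(dev, h, stream)
    kernel(*dev_in, d_c, np.int32(n),
           block=(block, 1, 1), grid=(grid_size(n, block), 1), stream=stream)
    cuda.memcpy_dtoh_async(h_c, d_c, stream)
    stop.record(stream)
    stop.synchronize()  # wait until everything queued before `stop` has finished

    return h_c.copy(), start.time_till(stop)  # elapsed time in milliseconds
```

On a machine with a CUDA-capable GPU, `c, ms = fma_gpu(a, b, d)` returns the result and the elapsed time. The events in this sketch bracket both transfers and the kernel, so the reported time includes host-device copies; recording them around the kernel call alone would isolate compute time. Because the fused operation rounds once per element, the output is best checked against a NumPy reference such as `a * b + d` with `np.allclose` rather than exact equality.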

The result is a validated performance comparison: the GPU output is checked against the NumPy result before the timings of the two are compared.
