Commit 2742157

2 parents d0c66a2 + 3382816 commit 2742157

File tree

12 files changed

+213
-258
lines changed

README.md

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
1+
2+
3+
<div align="center">
4+
<img width="500px" src="doc/images/KernelTuner-logo.png"/>
5+
</div>
6+
7+
---
8+
[![Build Status](https://github.com/KernelTuner/kernel_tuner/actions/workflows/build-test-python-package.yml/badge.svg)](https://github.com/KernelTuner/kernel_tuner/actions/workflows/build-test-python-package.yml)
9+
[![CodeCov Badge](https://codecov.io/gh/KernelTuner/kernel_tuner/branch/master/graph/badge.svg)](https://codecov.io/gh/KernelTuner/kernel_tuner)
10+
[![PyPi Badge](https://img.shields.io/pypi/v/kernel_tuner.svg?colorB=blue)](https://pypi.python.org/pypi/kernel_tuner/)
11+
[![Zenodo Badge](https://zenodo.org/badge/54894320.svg)](https://zenodo.org/badge/latestdoi/54894320)
12+
[![SonarCloud Badge](https://sonarcloud.io/api/project_badges/measure?project=KernelTuner_kernel_tuner&metric=alert_status)](https://sonarcloud.io/dashboard?id=KernelTuner_kernel_tuner)
13+
[![OpenSSF Badge](https://bestpractices.coreinfrastructure.org/projects/6573/badge)](https://bestpractices.coreinfrastructure.org/projects/6573)
14+
[![FairSoftware Badge](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F-green)](https://fair-software.eu)
15+
---
16+
17+
18+
Create optimized GPU applications in any mainstream GPU
19+
programming language (CUDA, HIP, OpenCL, OpenACC).
20+
21+
What Kernel Tuner does:
22+
23+
- Works as an external tool to benchmark and optimize GPU kernels in isolation
24+
- Can be used directly on existing kernel code without extensive changes
25+
- Can be used with applications in any host programming language
26+
- Blazing fast search space construction
27+
- More than 20 [optimization algorithms](https://kerneltuner.github.io/kernel_tuner/stable/optimization.html) to speed up tuning
28+
- Energy measurements and optimizations [(power capping, clock frequency tuning)](https://arxiv.org/abs/2211.07260)
29+
- ... and much more! For example, [caching](https://kerneltuner.github.io/kernel_tuner/stable/cache_files.html), [output verification](https://kerneltuner.github.io/kernel_tuner/stable/correctness.html), [tuning host and device code](https://kerneltuner.github.io/kernel_tuner/stable/hostcode.html), [user defined metrics](https://kerneltuner.github.io/kernel_tuner/stable/metrics.html), see [the full documentation](https://kerneltuner.github.io/kernel_tuner/stable/index.html).
30+
31+
32+
33+
## Installation
34+
35+
- First, make sure you have your [CUDA](https://kerneltuner.github.io/kernel_tuner/stable/install.html#cuda-and-pycuda), [OpenCL](https://kerneltuner.github.io/kernel_tuner/stable/install.html#opencl-and-pyopencl), or [HIP](https://kerneltuner.github.io/kernel_tuner/stable/install.html#hip-and-pyhipl) compiler installed
36+
- Then type: `pip install kernel_tuner[cuda]`, `pip install kernel_tuner[opencl]`, or `pip install kernel_tuner[hip]`
37+
- or why not all of them: `pip install kernel_tuner[cuda,opencl,hip]`
38+
39+
More information on installation, including for other languages, is available in the [installation guide](http://kerneltuner.github.io/kernel_tuner/stable/install.html).
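
To check which of these backends are visible to the current Python environment after installing, a small sketch like the following may help. It only tests whether the binding modules can be found; it does not verify that the compilers or drivers work. The module names `pycuda` and `pyopencl` follow the installation guide links above, while `pyhip` is an assumption about the HIP binding's import name, and `available_backends` is a hypothetical helper, not part of Kernel Tuner.

```python
import importlib.util

# Backend label -> Python binding module.
# "pyhip" is an assumed import name; adjust it to whatever your HIP binding provides.
BACKEND_MODULES = {"cuda": "pycuda", "opencl": "pyopencl", "hip": "pyhip"}


def available_backends(modules=BACKEND_MODULES):
    """Return {backend: bool} telling whether each binding module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    for backend, found in available_backends().items():
        print(f"{backend}: {'found' if found else 'not found'}")
```

A backend reported as not found usually means the matching `pip install kernel_tuner[...]` extra has not been installed in this environment.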
40+
41+
## Example
42+
43+
```python
44+
import numpy as np
45+
from kernel_tuner import tune_kernel
46+
47+
kernel_string = """
48+
__global__ void vector_add(float *c, float *a, float *b, int n) {
49+
int i = blockIdx.x * block_size_x + threadIdx.x;
50+
if (i<n) {
51+
c[i] = a[i] + b[i];
52+
}
53+
}
54+
"""
55+
56+
n = np.int32(10000000)
57+
58+
a = np.random.randn(n).astype(np.float32)
59+
b = np.random.randn(n).astype(np.float32)
60+
c = np.zeros_like(a)
61+
62+
args = [c, a, b, n]
63+
64+
tune_params = {"block_size_x": [32, 64, 128, 256, 512]}
65+
66+
tune_kernel("vector_add", kernel_string, n, args, tune_params)
67+
```
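
As far as the documentation describes the default behavior, Kernel Tuner derives the number of thread blocks by dividing the problem size (`n` here) by `block_size_x` and rounding up. The sketch below reproduces that arithmetic in plain Python for the values in `tune_params`, which also shows why the kernel needs the `if (i<n)` guard. `grid_size_x` is a hypothetical helper written for illustration, not a Kernel Tuner function.

```python
n = 10_000_000
tune_params = {"block_size_x": [32, 64, 128, 256, 512]}


def grid_size_x(problem_size, block_size):
    """Thread blocks needed to cover problem_size elements (ceiling division)."""
    return (problem_size + block_size - 1) // block_size


for block_size_x in tune_params["block_size_x"]:
    blocks = grid_size_x(n, block_size_x)
    launched_threads = blocks * block_size_x
    # Threads whose index i is >= n do no work; the `if (i<n)` guard masks them out.
    idle_threads = launched_threads - n
    print(f"block_size_x={block_size_x} blocks={blocks} idle_threads={idle_threads}")
```

Whenever `n` is not a multiple of `block_size_x`, the last block contains idle threads, so removing the bounds check would produce out-of-range accesses for those configurations.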
68+
69+
More [examples here](https://kerneltuner.github.io/kernel_tuner/stable/examples.html).
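
The example above only measures time. To also check correctness, a reference result can be computed on the host and handed to the tuner. The sketch below builds such a reference with NumPy for `vector_add`, using a smaller `n` to keep it quick. The commented-out call assumes the `answer` keyword described in the output verification documentation, with `None` marking arguments that are not checked.

```python
import numpy as np

n = np.int32(1000)  # smaller than in the example above, for a quick host-side check

a = np.random.randn(n).astype(np.float32)
b = np.random.randn(n).astype(np.float32)
c = np.zeros_like(a)

args = [c, a, b, n]

# Reference output computed on the host, one entry per kernel argument.
# Only c is an output, so the inputs a, b, and n are marked None.
answer = [a + b, None, None, None]

# With a GPU backend installed, the reference could be passed along like this
# (assumes kernel_string and tune_params from the example above):
# tune_kernel("vector_add", kernel_string, n, args, tune_params, answer=answer)
```

Keeping `answer` the same length as `args` makes it clear which kernel argument each reference array belongs to.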
70+
71+
## Resources
72+
73+
- [Full documentation](https://kerneltuner.github.io/kernel_tuner/stable/)
74+
- Guides:
75+
- [Getting Started](https://kerneltuner.github.io/kernel_tuner/stable/quickstart.html)
76+
- [Convolution](https://kerneltuner.github.io/kernel_tuner/stable/convolution.html)
77+
- [Diffusion](https://kerneltuner.github.io/kernel_tuner/stable/diffusion.html)
78+
- [Matrix Multiplication](https://kerneltuner.github.io/kernel_tuner/stable/matrix_multiplication.html)
79+
- Features & Use cases:
80+
- [Full list of examples](https://kerneltuner.github.io/kernel_tuner/stable/examples.html)
81+
- [Output verification](https://kerneltuner.github.io/kernel_tuner/stable/correctness.html)
82+
- [Test GPU code from Python](https://github.com/KernelTuner/kernel_tuner/blob/master/examples/cuda/test_vector_add.py)
83+
- [Tune both host and device code](https://kerneltuner.github.io/kernel_tuner/stable/hostcode.html)
84+
- [Optimization algorithms](https://kerneltuner.github.io/kernel_tuner/stable/optimization.html)
85+
- [Mixed-precision & Accuracy tuning](https://github.com/KernelTuner/kernel_tuner/blob/master/examples/cuda/accuracy.py)
86+
- [Custom metrics & tuning objectives](https://kerneltuner.github.io/kernel_tuner/stable/metrics.html)
87+
- **Kernel Tuner Tutorial** slides [[PDF]](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/slides/2022_SURF/SURF22-Kernel-Tuner-Tutorial.pdf), hands-on:
88+
- Vector add example [[.ipynb](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/hands-on/cuda/00_Kernel_Tuner_Introduction.ipynb)] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KernelTuner/kernel_tuner_tutorial/blob/master/hands-on/cuda/00_Kernel_Tuner_Introduction.ipynb)
89+
- Tuning thread block dimensions [[.ipynb](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/hands-on/cuda/01_Kernel_Tuner_Getting_Started.ipynb)] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KernelTuner/kernel_tuner_tutorial/blob/master/hands-on/cuda/01_Kernel_Tuner_Getting_Started.ipynb)
90+
- Search space restrictions & output verification [[.ipynb](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/hands-on/cuda/02_Kernel_Tuner_Intermediate.ipynb)] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KernelTuner/kernel_tuner_tutorial/blob/master/hands-on/cuda/02_Kernel_Tuner_Intermediate.ipynb)
91+
- Visualization & search space optimization [[.ipynb](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/hands-on/cuda/03_Kernel_Tuner_Advanced.ipynb)] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KernelTuner/kernel_tuner_tutorial/blob/master/hands-on/cuda/03_Kernel_Tuner_Advanced.ipynb)
92+
- **Energy Efficient GPU Computing** tutorial slides [[PDF]](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/slides/2023_Supercomputing/SC23.pdf), hands-on:
93+
- Kernel Tuner for GPU energy measurements [[.ipynb](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/energy/00_Kernel_Tuner_Introduction.ipynb)] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KernelTuner/kernel_tuner_tutorial/blob/master/energy/00_Kernel_Tuner_Introduction.ipynb)
94+
- Code optimizations for energy [[.ipynb](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/energy/01_Code_Optimizations_for_Energy.ipynb)] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KernelTuner/kernel_tuner_tutorial/blob/master/energy/01_Code_Optimizations_for_Energy.ipynb)
95+
- Mixed precision and accuracy tuning [[.ipynb](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/energy/02_Mixed_precision_programming.ipynb)] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KernelTuner/kernel_tuner_tutorial/blob/master/energy/02_Mixed_precision_programming.ipynb)
96+
- Optimizing for time vs. for energy [[.ipynb](https://github.com/KernelTuner/kernel_tuner_tutorial/blob/master/energy/03_energy_efficient_computing.ipynb)] [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KernelTuner/kernel_tuner_tutorial/blob/master/energy/03_energy_efficient_computing.ipynb)
97+
98+
99+
## Kernel Tuner ecosystem
100+
101+
<img width="250px" src="doc/images/kernel_launcher.png"/><br />C++ magic to integrate auto-tuned kernels into C++ applications
102+
103+
<img width="250px" src="doc/images/kernel_float.png"/><br />C++ data types for mixed-precision CUDA kernel programming
104+
105+
<img width="275px" src="doc/images/kernel_dashboard.png"/><br />Monitor, analyze, and visualize auto-tuning runs
106+
107+
108+
## Communication & Contribution
109+
110+
- GitHub [Issues](https://github.com/KernelTuner/kernel_tuner/issues): Bug reports, install issues, feature requests, work in progress
111+
- GitHub [Discussion group](https://github.com/orgs/KernelTuner/discussions): General questions, Q&A, thoughts
112+
113+
Contributions are welcome! For feature requests, bug reports, or usage problems, please feel free to create an issue.
114+
For more extensive contributions, check the [contribution guide](http://kerneltuner.github.io/kernel_tuner/stable/contributing.html).
115+
116+
## Citation
117+
118+
If you use Kernel Tuner in research or research software, please cite the most relevant among the [publications on Kernel
119+
Tuner](https://kerneltuner.github.io/kernel_tuner/stable/#citation). To refer to the project as a whole, please cite:
120+
121+
```latex
122+
@article{kerneltuner,
123+
author = {Ben van Werkhoven},
124+
title = {Kernel Tuner: A search-optimizing GPU code auto-tuner},
125+
journal = {Future Generation Computer Systems},
126+
year = {2019},
127+
volume = {90},
128+
pages = {347-358},
129+
url = {https://www.sciencedirect.com/science/article/pii/S0167739X18313359},
130+
doi = {10.1016/j.future.2018.08.004}
131+
}
132+
```
133+

README.rst

Lines changed: 0 additions & 209 deletions
This file was deleted.

doc/images/KernelTuner-logo.png

50.5 KB

doc/images/dashboard.png

304 KB

doc/images/kernel_dashboard.png

63.5 KB

doc/images/kernel_float.png

50 KB

doc/images/kernel_launcher.png

67.4 KB
