Commit 919ce38

Add CUDA non-contiguous Unary ops implementation
1 parent c4ecdef commit 919ce38

4 files changed

+52343
-6554
lines changed

docs/ops.md

Lines changed: 14 additions & 14 deletions
@@ -9,7 +9,7 @@ Legend:
 
 | Operation | BLAS | CPU | CUDA | Metal |
 |-----------|------|------|------|------|
-| ABS ||| 🟡 ||
+| ABS ||| ||
 | ACC |||||
 | ADD |||| 🟡 |
 | ADD1 |||||
@@ -31,20 +31,20 @@ Legend:
 | DIV |||| 🟡 |
 | DUP ||| 🟡 | 🟡 |
 | ELU |||| 🟡 |
-| EXP ||| 🟡 ||
+| EXP ||| ||
 | FLASH_ATTN_EXT ||| 🟡 | 🟡 |
 | GATED_LINEAR_ATTN |||||
 | GEGLU |||| 🟡 |
 | GEGLU_ERF |||| 🟡 |
 | GEGLU_QUICK |||| 🟡 |
-| GELU ||| 🟡 | 🟡 |
-| GELU_ERF ||| 🟡 | 🟡 |
-| GELU_QUICK ||| 🟡 | 🟡 |
+| GELU ||| | 🟡 |
+| GELU_ERF ||| | 🟡 |
+| GELU_QUICK ||| | 🟡 |
 | GET_ROWS ||| 🟡 ||
 | GET_ROWS_BACK || 🟡 | 🟡 ||
 | GROUP_NORM |||||
-| HARDSIGMOID ||| 🟡 ||
-| HARDSWISH ||| 🟡 ||
+| HARDSIGMOID ||| ||
+| HARDSWISH ||| ||
 | IM2COL |||| 🟡 |
 | L2_NORM |||||
 | LEAKY_RELU |||||
@@ -53,15 +53,15 @@ Legend:
 | MUL |||| 🟡 |
 | MUL_MAT | 🟡 | 🟡 | 🟡 | 🟡 |
 | MUL_MAT_ID |||||
-| NEG ||| 🟡 | 🟡 |
+| NEG ||| | 🟡 |
 | NORM |||| 🟡 |
 | OPT_STEP_ADAMW |||||
 | OUT_PROD | 🟡 | 🟡 | 🟡 ||
 | PAD |||||
 | PAD_REFLECT_1D |||||
 | POOL_2D |||||
 | REGLU |||| 🟡 |
-| RELU ||| 🟡 | 🟡 |
+| RELU ||| | 🟡 |
 | REPEAT ||| 🟡 ||
 | REPEAT_BACK |||||
 | RMS_NORM |||| 🟡 |
@@ -74,9 +74,9 @@ Legend:
 | SCALE |||||
 | SET |||||
 | SET_ROWS || 🟡 || 🟡 |
-| SGN ||| 🟡 ||
-| SIGMOID ||| 🟡 | 🟡 |
-| SILU ||| 🟡 | 🟡 |
+| SGN ||| ||
+| SIGMOID ||| | 🟡 |
+| SILU ||| | 🟡 |
 | SILU_BACK |||||
 | SIN |||| 🟡 |
 | SOFT_MAX |||||
@@ -85,11 +85,11 @@ Legend:
 | SQRT |||| 🟡 |
 | SSM_CONV |||||
 | SSM_SCAN |||||
-| STEP ||| 🟡 ||
+| STEP ||| ||
 | SUB |||| 🟡 |
 | SUM |||||
 | SUM_ROWS |||||
 | SWIGLU |||| 🟡 |
-| TANH ||| 🟡 | 🟡 |
+| TANH ||| | 🟡 |
 | TIMESTEP_EMBEDDING |||||
 | UPSCALE |||| 🟡 |
