NVIDIA / cutlass Public

Issues: NVIDIA/cutlass

265 Open 1,092 Closed

Issues list

[QST] Will cutlass add MLA implementation for non Blackwell?

#2264 opened Apr 25, 2025 by ghostplant

[QST] How to use Grouped_gemm when MNK is not a regular shape

Labels: ? - Needs Triage, question

#2263 opened Apr 25, 2025 by chenhongyu2048

[FEA] Has CUTLASS considered supporting Zero-points and block-wise scaling in Hopper Mixed Grouped Gemm recently?

Labels: ? - Needs Triage, feature request

#2261 opened Apr 24, 2025 by mengsoso

[QST] Why only N={64, 128, 192, 256} is supported for mxf8f6f4 GeMM?

Labels: ? - Needs Triage, question

#2260 opened Apr 24, 2025 by Algy

[QST] When do we use as_position_independent_swizzle_tensor()?

Labels: ? - Needs Triage, question

#2259 opened Apr 24, 2025 by flytigerw

[QST] How to write column major output epilogue in cutlass2.x

Labels: ? - Needs Triage, question

#2258 opened Apr 23, 2025 by tlogn

[FEA] CVT F32 -> TF32 PTX for sm80

Labels: ? - Needs Triage, feature request

#2254 opened Apr 19, 2025 by osayamenja

[BUG] Compile warning/error - -Werror=unused-but-set-variable unused curr_stride in cute/layout.hpp via flash attention.

Labels: ? - Needs Triage, bug

#2253 opened Apr 19, 2025 by Skylion007

[QST] how to use vectorized copy when size is not perfectly aligned?

Labels: ? - Needs Triage, question

#2252 opened Apr 19, 2025 by botbw

[QST] what's the difference between OpMultiplyAddFastAccum and OpMultiplyAdd tag?

Labels: ? - Needs Triage, question

#2246 opened Apr 17, 2025 by danielhua23

[FEA][torchinductor-EVT] tensor construction API that takes in shape + stride directly

Labels: ? - Needs Triage, feature request

#2245 opened Apr 17, 2025 by mlazos

[FEA][torchinductor-EVT] Allow function source code to be passed directly to EVT tracer

Labels: ? - Needs Triage, feature request

#2244 opened Apr 17, 2025 by mlazos

[BUG][torchinductor-EVT] cutlass/python/cutlass has extra dependencies that aren't needed for EVT

Labels: ? - Needs Triage, bug

#2243 opened Apr 16, 2025 by mlazos

[QST][CUTE] How to systematically and easily compose the cute layout algebra operations to get custom layout

Labels: ? - Needs Triage, question

#2241 opened Apr 16, 2025 by danielhua23

[QST] why use AutoVectorizingCopy rather than LDSM in hopper mixed gemm?

Labels: ? - Needs Triage, question

#2239 opened Apr 13, 2025 by danielhua23

[FEA] Blackwell support for python libraries.

Labels: ? - Needs Triage, feature request

#2237 opened Apr 12, 2025 by eliphatfs

[QST] Do we need to tune cutlass gemm to use it for all shapes?

Labels: ? - Needs Triage, question

#2235 opened Apr 11, 2025 by sleepwalker2017

[QST] How does make_tiled_copy_A determine the source address for copying?

Labels: ? - Needs Triage, question

#2232 opened Apr 10, 2025 by ezioliao

[BUG] No TMEM allocation in Blackwell CuTe tutorial examples

Labels: ? - Needs Triage, bug

#2230 opened Apr 8, 2025 by allispaul

[QST] What's the difference between make_fragment_like and make_tensor_like?

Labels: ? - Needs Triage, question

#2228 opened Apr 7, 2025 by iryanin

[DOC] all images in cutlass/media/docs/cpp/cute cannot display

Labels: ? - Needs Triage, documentation

#2227 opened Apr 7, 2025 by hlyix

[DOC] Building cutlass_profiler

Labels: ? - Needs Triage, documentation

#2226 opened Apr 7, 2025 by ll000x

[FEA] examples/cute/tutorial/tiled_copy.cu parameterization?

Labels: ? - Needs Triage, feature request

#2225 opened Apr 6, 2025 by ziyuhuang123

[BUG] Blackwell MLA perf for split-k

Labels: ? - Needs Triage, bug

#2222 opened Apr 4, 2025 by divchenko

[QST] How to pack int4 tensor correctly in PyTorch

Labels: ? - Needs Triage, question

#2218 opened Apr 3, 2025 by yicwang
