1 parent ba14df1 commit 2763eba
fbgemm_gpu/experimental/gen_ai/CMakeLists.txt
@@ -149,7 +149,18 @@ gpu_cpp_library(
  ${experimental_gen_ai_cpp_source_files_hip}
  TORCH_LIBS
  # Used when building as part of PyTorch
- ${FBGEMM_GENAI_TORCH_LIBS})
+ ${FBGEMM_GENAI_TORCH_LIBS}
+ HIPCC_FLAGS
+ # Below flags are required for strong CK performance
+ # on certain kernel instances
+ -mllvm
+ # Reduce register spillage on certain kernels
+ -amdgpu-coerce-illegal-types=1
+
+ -enable-post-misched=0
+
+ -greedy-reverse-local-assignment=1
+ -fhip-new-launch-api)
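Reviewer's sketch, not part of this commit: with Clang-based compilers such as hipcc, each `-mllvm` forwards exactly one following argument to LLVM, so every LLVM option needs its own `-mllvm` in front of it. In plain CMake, `target_compile_options` de-duplicates repeated options, and the `SHELL:` prefix is one way to keep each pair together. The target name `my_hip_kernels` and the source file `kernels.cpp` below are hypothetical, and this assumes the C++ compiler is a HIP-capable Clang that accepts these flags. Whether `gpu_cpp_library` forwards `HIPCC_FLAGS` in this way is not shown in the diff.

```cmake
cmake_minimum_required(VERSION 3.12)  # SHELL: prefix requires CMake 3.12+
project(hip_flags_sketch LANGUAGES CXX)

# Hypothetical target and source file, for illustration only.
add_library(my_hip_kernels STATIC kernels.cpp)

target_compile_options(my_hip_kernels PRIVATE
  # Each "SHELL:" group is kept intact, so the repeated -mllvm
  # is not collapsed by CMake's option de-duplication.
  "SHELL:-mllvm -amdgpu-coerce-illegal-types=1"
  "SHELL:-mllvm -enable-post-misched=0"
  "SHELL:-mllvm -greedy-reverse-local-assignment=1"
  # Plain Clang driver flag, no -mllvm needed.
  -fhip-new-launch-api)
```

Grouping each LLVM option with its `-mllvm` also makes it easier to drop or bisect a single flag if a kernel regresses.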