Commit 28cb35a

Author: Michael Kesper
make : add LLAMA_HIP_UMA option (#4587)
NB: setting LLAMA_HIP_UMA=1 (or any other non-empty value) adds -DGGML_HIP_UMA to MK_CPPFLAGS
1 parent f31b984 commit 28cb35a

2 files changed, 10 additions and 1 deletion

Makefile

Lines changed: 3 additions & 0 deletions
@@ -452,6 +452,9 @@ ifdef LLAMA_HIPBLAS
 LLAMA_CUDA_MMV_Y ?= 1
 LLAMA_CUDA_KQUANTS_ITER ?= 2
 MK_CPPFLAGS += -DGGML_USE_HIPBLAS -DGGML_USE_CUBLAS
+ifdef LLAMA_HIP_UMA
+MK_CPPFLAGS += -DGGML_HIP_UMA
+endif # LLAMA_HIP_UMA
 MK_LDFLAGS += -L$(ROCM_PATH)/lib -Wl,-rpath=$(ROCM_PATH)/lib
 MK_LDFLAGS += -lhipblas -lamdhip64 -lrocblas
 HIPFLAGS += $(addprefix --offload-arch=,$(GPU_TARGETS))
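
The `ifdef` guard in this hunk keys on whether the variable has a non-empty value, not on what that value is, which is why the commit message says "or any value". A minimal sketch of that behaviour, assuming GNU make is on the PATH; the throwaway makefile, the `show` target, and the `flags_for` helper are illustrative and not part of llama.cpp:

```shell
#!/bin/sh
# Make sure an inherited environment variable does not skew the "unset" case.
unset LLAMA_HIP_UMA

# Throwaway makefile mirroring the ifdef pattern from the Makefile hunk.
# MK_CPPFLAGS is reset with := here so the output is deterministic.
tmp_mk=$(mktemp)
cat > "$tmp_mk" <<'EOF'
MK_CPPFLAGS := -DGGML_USE_HIPBLAS -DGGML_USE_CUBLAS
ifdef LLAMA_HIP_UMA
MK_CPPFLAGS += -DGGML_HIP_UMA
endif # LLAMA_HIP_UMA
show: ; @echo "MK_CPPFLAGS=$(MK_CPPFLAGS)"
EOF

# Hypothetical helper: run the illustrative `show` target, forwarding any
# extra command-line variable assignments to make.
flags_for() {
    make -s -f "$tmp_mk" show "$@"
}

unset_out=$(flags_for)                   # variable not defined at all
one_out=$(flags_for LLAMA_HIP_UMA=1)     # the documented usage
zero_out=$(flags_for LLAMA_HIP_UMA=0)    # non-empty, so ifdef is still true
empty_out=$(flags_for LLAMA_HIP_UMA=)    # empty value, so ifdef is false

printf 'unset: %s\n' "$unset_out"
printf '=1:    %s\n' "$one_out"
printf '=0:    %s\n' "$zero_out"
printf 'empty: %s\n' "$empty_out"

rm -f "$tmp_mk"
```

With GNU make, `LLAMA_HIP_UMA=0` should therefore still add `-DGGML_HIP_UMA`; to build without the define, leave the variable unset.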

README.md

Lines changed: 7 additions & 1 deletion
@@ -440,7 +440,13 @@ Building the program with BLAS support may lead to some performance improvements
 && cmake --build build -- -j 16
 ```
 On Linux it is also possible to use unified memory architecture (UMA) to share main memory between the CPU and integrated GPU by setting `-DLLAMA_HIP_UMA=ON`.
-However, this hurts performance for non-integrated GPUs.
+However, this hurts performance for non-integrated GPUs (but enables working with integrated GPUs).
+
+- Using `make` (example for target gfx1030, build with 16 CPU threads):
+```bash
+make -j16 LLAMA_HIPBLAS=1 LLAMA_HIP_UMA=1 AMDGPU_TARGETS=gfx1030
+```
+
 - Using `CMake` for Windows (using x64 Native Tools Command Prompt for VS, and assuming a gfx1100-compatible AMD GPU):
 ```bash
 set PATH=%HIP_PATH%\bin;%PATH%
