Commit 2de12be

[ROCm] [AITER] [Bugfix] Patch for AITER commit 648764942e552a8bb5fe16026703716a81f05374 (#18990)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
1 parent 83ca9ae commit 2de12be

2 files changed: +4, -3 lines


docker/Dockerfile.rocm_base

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ ARG PYTORCH_REPO="https://github.com/pytorch/pytorch.git"
 ARG PYTORCH_VISION_REPO="https://github.com/pytorch/vision.git"
 ARG FA_BRANCH="1a7f4dfa"
 ARG FA_REPO="https://github.com/Dao-AILab/flash-attention.git"
-ARG AITER_BRANCH="c1debd8"
+ARG AITER_BRANCH="6487649"
 ARG AITER_REPO="https://github.com/ROCm/aiter.git"
 
 FROM ${BASE_IMAGE} AS base
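
The new AITER_BRANCH value is simply the 7-character abbreviation of the AITER commit named in the commit title. A quick sanity-check sketch in plain Python (both strings are taken from this page; nothing else is assumed):

    # From the commit title and the Dockerfile hunk above.
    FULL_AITER_COMMIT = "648764942e552a8bb5fe16026703716a81f05374"
    PINNED = "6487649"  # new AITER_BRANCH value in Dockerfile.rocm_base

    # The Dockerfile pins the short hash; it should be a prefix of the full SHA.
    assert FULL_AITER_COMMIT.startswith(PINNED)
    print(f"AITER pinned at {PINNED} -> {FULL_AITER_COMMIT}")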

vllm/model_executor/layers/fused_moe/rocm_aiter_fused_moe.py

Lines changed: 3 additions & 2 deletions
@@ -22,8 +22,9 @@ class QuantMethod(IntEnum):
     NO = 0  # a16w16
     PER_TENSOR = 1  # w8a8 (pre_Tensor)
     PER_TOKEN = 2  # w8a8/w8a4 (per_Token)
-    BLOCK_1X128 = 3  # block quantized w8a8 (per_1x128)
-    BLOCK_128x128 = 4  # block quantized w8a8 (per_128x128)
+    BLOCK_1X32 = 3  # fp4x2
+    BLOCK_1X128 = 4  # block quantized w8a8 (per_1x128)
+    BLOCK_128x128 = 5  # block quantized w8a8 (per_128x128)
 
 
 class ActivationMethod(IntEnum):
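
For reference, a minimal sketch of QuantMethod as it reads after this patch, reconstructed only from the hunk above (only the members visible in the diff are shown). The new BLOCK_1X32 member takes value 3, so the two block-quantized w8a8 entries shift up by one; presumably this keeps vLLM's integer mapping in step with the AITER commit pinned in the Dockerfile.

    from enum import IntEnum

    class QuantMethod(IntEnum):
        # Values after the patch (reconstructed from the diff above).
        NO = 0              # a16w16
        PER_TENSOR = 1      # w8a8 (pre_Tensor)
        PER_TOKEN = 2       # w8a8/w8a4 (per_Token)
        BLOCK_1X32 = 3      # fp4x2 (added in this commit)
        BLOCK_1X128 = 4     # block quantized w8a8 (per_1x128), previously 3
        BLOCK_128x128 = 5   # block quantized w8a8 (per_128x128), previously 4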
