Replies: 6 comments 1 reply
-
Run log:
(base) root@lf:~/ktransformers# cat start.sh
(kt) root@lf:~/ktransformers# bash start.sh
-
Is this a DEBUG build you compiled?
-
Hello, when building kt on the Hygon DCU platform, did you ever hit the hipcc compiler error saying "neither the NVIDIA nor the AMD platform was specified"? Looking at the compile command, the flags do include "-D__HIP_PLATFORM_AMD__=1", yet the error claims that macro was never defined. Relevant log:
/opt/dtk/bin/hipcc -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/TH -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/THC -I/root/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/THH -I/opt/dtk/include -I/root/miniconda3/envs/kt/include/python3.11 -c ktransformers/ktransformers_ext/hip/custom_gguf/dequant.hip -o /tmp/tmpaup26x2z.build-temp/ktransformers/ktransformers_ext/hip/custom_gguf/dequant.o -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DHIPBLAS_V2 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -use_fast_math -Xcompiler -fPIC -DKTRANSFORMERS_USE_CUDA -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1014" -DTORCH_EXTENSION_NAME=KTransformersOps -D_GLIBCXX_USE_CXX11_ABI=1 --offload-arch=gfx906 --offload-arch=gfx926 --offload-arch=gfx928 --offload-arch=gfx936 -fno-gpu-rdc -std=c++17
In file included from ktransformers/ktransformers_ext/hip/custom_gguf/dequant.hip:12:
/opt/dtk/include/hip/hip_runtime.h:66:2: error: ("Must define exactly one of HIP_PLATFORM_AMD or HIP_PLATFORM_NVIDIA");
-
I haven't run into that. Note that I built version 0.23post1; is the version you're building perhaps newer? After finding the performance unsatisfactory, I didn't pursue this hardware platform further.
-
7975wx 32C
8x DDR5-5200 64GB
1x DCU K100AI
Performance ~2 tps (for comparison, a 4090D reaches 10+)
(In reply to: "Did you build from the official repository or from the south-ocean repository? And what are your hardware configuration and model performance?")
-
NVIDIA 30-series and later GPUs have marlin kernel support; ROCm/HIP and DTK do not have marlin yet, so a direct comparison is not possible.
-
AMD Ryzen Threadripper PRO 7975WX 32-Cores
DDR5-5200 64GB*8
Hygon DCU K100AI 64GB @ 400W
It runs, but very slowly; hoping for optimizations. GPU utilization stays at 100%.
Here is my installation (tinkering) process, based overall on doc/en/rocm.md:
Fresh install of Ubuntu 22.04 Server.
Install cuda-tool-kit 11.7; this step cannot be skipped, and some guides seem to get it wrong.
Install dtk.
Install the vendor's prebuilt DCU packages for pytorch/torchvision/torchaudio.
Go through requirements.txt by hand; where the vendor provides prebuilt packages, prefer those.
If the build errors out, it is likely missing google gflags and google glog; try installing them.
Once the build finally completes, verify with pip show ktransformers.
Try running the DeepSeek-R1-Q4 model; it complains about some missing packages, so pip install openai pytest.
On Hygon DCU, use --optimize_config_path ktransformers/optimize/optimize_rules/rocm/DeepSeek-V3-Chat.yaml
You may hit this bug: https://github.com/kvcache-ai/ktransformers/issues/983; comment out that line.
Then it finally runs.