Your current environment
vLLM v0.9.0.1
🐛 Describe the bug
I am using the Docker image with vLLM v0.9.0.1.
I have downloaded the model Qwen/Qwen3-235B-A22B-GPTQ-Int4 into the directory qwen3-gptq.
I have a node with 8 H100 GPUs and start the server with:
VLLM_USE_V1=0 vllm serve qwen3-gptq --tensor-parallel-size 8 --max-model-len 32000 --gpu-memory-utilization 0.9 --distributed-executor-backend mp
I get this error:
INFO 06-01 11:19:03 [__init__.py:243] Automatically detected platform cuda.
INFO 06-01 11:19:24 [__init__.py:31] Available plugins for group vllm.general_plugins:
INFO 06-01 11:19:24 [__init__.py:33] - lora_filesystem_resolver -> vllm.plugins.lora_resolvers.filesystem_resolver:register_filesystem_resolver
INFO 06-01 11:19:24 [__init__.py:36] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.
[registry.py:363] Error in inspecting model architecture 'Qwen3MoeForCausalLM'
[registry.py:363] Traceback (most recent call last):
[registry.py:363]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 594, in _run
[registry.py:363]     returned.check_returncode()
[registry.py:363]   File "/usr/lib/python3.10/subprocess.py", line 457, in check_returncode
[registry.py:363]     raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-m', 'vllm.model_executor.models.registry']'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 361, in _try_i
    return model.inspect_model_cls()
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 332, in inspec
    return _run_in_subprocess(
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 597, in _run_in_subprocess
    raise RuntimeError(f"Error raised in subprocess:\n"
RuntimeError: Error raised in subprocess:
[registry.py:363] /usr/lib/python3.10/runpy.py:126: RuntimeWarning: 'vllm.model_executor.models.registry' found in sys.modules after import of packag
or.models.registry'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
[registry.py:363] Traceback (most recent call last):
[registry.py:363]   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
[registry.py:363]     return _run_code(code, main_globals, None,
[registry.py:363]   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
[registry.py:363]     exec(code, run_globals)
[registry.py:363]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 618, in <module>
[registry.py:363]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 612, in _run
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/qwen3_moe.py", line 37, in <module>
    from vllm.model_executor.layers.fused_moe import FusedMoE
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/fused_moe/__init__.py", line 6, in <module>
    from vllm.model_executor.layers.fused_moe.layer import (
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/fused_moe/layer.py", line 34, in <module>
    from .fused_batched_moe import (BatchedPrepareAndFinalize,
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/fused_moe/fused_batched_moe.py", line 10, in <module>
    from vllm.model_executor.layers.fused_moe.fused_moe import (
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/fused_moe/fused_moe.py", line 986, in <module>
    def grouped_topk(
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 2543, in fn
    return compile(
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 2562, in compile
    from torch._inductor.compiler_bisector import CompilerBisector
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compiler_bisector.py", line 628, in <module>
    CompilerBisector.bisection_enabled = get_is_bisection_enabled()
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compiler_bisector.py", line 623, in get_is_bisection_enabled
    CompilerBisector.get_subsystem() is not None
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compiler_bisector.py", line 207, in get_subsystem
    file_path = os.path.join(cls.get_dir(), "bisect_status.txt")
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compiler_bisector.py", line 125, in get_dir
    return f"{cache_dir() if not cls.in_process_cache else cls.in_process_cache}/{SUBDIR_NAME}"
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/runtime/cache_dir_utils.py", line 13, in cache_dir
    os.environ["TORCHINDUCTOR_CACHE_DIR"] = cache_dir = default_cache_dir()
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/runtime/cache_dir_utils.py", line 19, in default_cache_dir
    sanitized_username = re.sub(r'[\\/:*?"<>|]', "_", getpass.getuser())
  File "/usr/lib/python3.10/getpass.py", line 169, in getuser
    return pwd.getpwuid(os.getuid())[0]
KeyError: 'getpwuid(): uid not found: 1001'
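
The last frames suggest that `getpass.getuser()` found no username in the environment and then failed to look up UID 1001 in the password database. This typically happens when a container runs under a UID that has no `/etc/passwd` entry. The snippet below is a minimal diagnostic sketch under that assumption, not a confirmed fix. The helper names (`uid_has_passwd_entry`, `username_source`, `needs_workaround`) are hypothetical. The environment-variable order follows what CPython 3.10's `getpass.getuser()` checks before falling back to `pwd.getpwuid`. The `TORCHINDUCTOR_CACHE_DIR` check is inferred from the `cache_dir` frame in the traceback.

```python
import os
import pwd

# Variables getpass.getuser() consults (in this order) on CPython 3.10
# before it falls back to the password database.
USER_ENV_VARS = ("LOGNAME", "USER", "LNAME", "USERNAME")


def uid_has_passwd_entry(uid: int) -> bool:
    """Return True if `uid` resolves to an entry in the password database."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def username_source(env: dict, uid: int) -> str:
    """Describe where a username would come from: 'env:<NAME>', 'passwd', or 'missing'."""
    for name in USER_ENV_VARS:
        if env.get(name):
            return f"env:{name}"
    return "passwd" if uid_has_passwd_entry(uid) else "missing"


def needs_workaround(env: dict, uid: int) -> bool:
    """Return True if the default inductor cache dir lookup would likely fail.

    Assumption (from the traceback): the username is only looked up when
    TORCHINDUCTOR_CACHE_DIR is not already set.
    """
    if env.get("TORCHINDUCTOR_CACHE_DIR"):
        return False
    return username_source(env, uid) == "missing"


if __name__ == "__main__":
    uid = os.getuid()
    print("uid:", uid)
    print("username source:", username_source(dict(os.environ), uid))
    print("needs workaround:", needs_workaround(dict(os.environ), uid))
```

If this reports that a workaround is needed, setting `USER` or `TORCHINDUCTOR_CACHE_DIR` in the container environment before running `vllm serve` may avoid the failing lookup. This has not been verified against this exact image.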
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.