
[Bug]: Qwen3-GPTQ | Error in inspecting model architecture 'Qwen3MoeForCausalLM' #19504

Open
@hahmad2008


Your current environment

vLLM v0.9.0.1

🐛 Describe the bug

I am using the Docker image with vLLM v0.9.0.1.
I have downloaded the model [Qwen/Qwen3-235B-A22B-GPTQ-Int4] to the directory qwen3-gptq.

I have a node with 8 H100 GPUs and start the server with:
VLLM_USE_V1=0 vllm serve qwen3-gptq --tensor-parallel-size 8 --max-model-len 32000 --gpu-memory-utilization 0.9 --distributed-executor-backend mp

I get this error:

INFO 06-01 11:19:03 [__init__.py:243] Automatically detected platform cuda.
INFO 06-01 11:19:24 [__init__.py:31] Available plugins for group vllm.general_plugins:
INFO 06-01 11:19:24 [__init__.py:33] - lora_filesystem_resolver -> vllm.plugins.lora_resolvers.filesystem_resolver:register_filesystem_resolver
INFO 06-01 11:19:24 [__init__.py:36] All plugins in this group will be loaded. Set `VLLM_PLUGINS` to control which plugins to load.

[registry.py:363] Error in inspecting model architecture 'Qwen3MoeForCausalLM'
[registry.py:363] Traceback (most recent call last):
[registry.py:363]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 594, in _run
[registry.py:363]     returned.check_returncode()
[registry.py:363]   File "/usr/lib/python3.10/subprocess.py", line 457, in check_returncode
[registry.py:363]     raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-m', 'vllm.model_executor.models.registry']'

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 361, in _try_i
    return model.inspect_model_cls()
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 332, in inspec
    return _run_in_subprocess(
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 597, in _run_in_subprocess
    raise RuntimeError(f"Error raised in subprocess:\n"
RuntimeError: Error raised in subprocess:
[registry.py:363] /usr/lib/python3.10/runpy.py:126: RuntimeWarning: 'vllm.model_executor.models.registry' found in sys.modules after import of packag
or.models.registry'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
[registry.py:363] Traceback (most recent call last):
[registry.py:363]   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
[registry.py:363]     return _run_code(code, main_globals, None,
[registry.py:363]   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
[registry.py:363]     exec(code, run_globals)
[registry.py:363]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 618, in <module>
[registry.py:363]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/registry.py", line 612, in _run
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/qwen3_moe.py", line 37, in <module>
    from vllm.model_executor.layers.fused_moe import FusedMoE
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/fused_moe/__init__.py", line 6, in <module>
    from vllm.model_executor.layers.fused_moe.layer import (
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/fused_moe/layer.py", line 34, in <module>
    from .fused_batched_moe import (BatchedPrepareAndFinalize,
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/fused_moe/fused_batched_moe.py", line 10, in <module>
    from vllm.model_executor.layers.fused_moe.fused_moe import (
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/fused_moe/fused_moe.py", line 986, in <module>
    def grouped_topk(
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 2543, in fn
    return compile(
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 2562, in compile
    from torch._inductor.compiler_bisector import CompilerBisector
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compiler_bisector.py", line 628, in <module>
    CompilerBisector.bisection_enabled = get_is_bisection_enabled()
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compiler_bisector.py", line 623, in get_is_bisection_enabled
    CompilerBisector.get_subsystem() is not None
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compiler_bisector.py", line 207, in get_subsystem
    file_path = os.path.join(cls.get_dir(), "bisect_status.txt")
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/compiler_bisector.py", line 125, in get_dir
    return f"{cache_dir() if not cls.in_process_cache else cls.in_process_cache}/{SUBDIR_NAME}"
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/runtime/cache_dir_utils.py", line 13, in cache_dir
    os.environ["TORCHINDUCTOR_CACHE_DIR"] = cache_dir = default_cache_dir()
  File "/usr/local/lib/python3.10/dist-packages/torch/_inductor/runtime/cache_dir_utils.py", line 19, in default_cache_dir
    sanitized_username = re.sub(r'[\\/:*?"<>|]', "_", getpass.getuser())
  File "/usr/lib/python3.10/getpass.py", line 169, in getuser
    return pwd.getpwuid(os.getuid())[0]

KeyError: 'getpwuid(): uid not found: 1001'
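
The final frames point at `getpass.getuser()`, which checks the environment variables `LOGNAME`, `USER`, `LNAME` and `USERNAME` in that order and only then falls back to the passwd database. The script below is a minimal diagnostic sketch, not part of vLLM or PyTorch, that can be run inside the container to see whether the current UID can be resolved either way. The helper names `describe_user_lookup` and `lookup_would_fail` are made up for illustration.

```python
import os

# Environment variables that getpass.getuser() consults, in order,
# before it falls back to the passwd database.
USER_ENV_VARS = ("LOGNAME", "USER", "LNAME", "USERNAME")


def describe_user_lookup(environ=None):
    """Report how the current UID could be resolved to a user name.

    Hypothetical helper for diagnosis only.
    """
    environ = os.environ if environ is None else environ
    # Only non-empty values count, mirroring getpass.getuser().
    env_hits = {name: environ[name] for name in USER_ENV_VARS if environ.get(name)}
    try:
        import pwd  # POSIX only

        passwd_entry = pwd.getpwuid(os.getuid()).pw_name
    except (ImportError, KeyError):
        # KeyError is what the traceback above shows for UID 1001.
        passwd_entry = None
    return {"uid": os.getuid(), "env": env_hits, "passwd_entry": passwd_entry}


def lookup_would_fail(report):
    """True when neither the environment nor the passwd database names the user."""
    return not report["env"] and report["passwd_entry"] is None


if __name__ == "__main__":
    report = describe_user_lookup()
    print(report)
    print("getpass.getuser() would fail:", lookup_would_fail(report))
```

If the script reports a failure, the container is probably running as a UID with no passwd entry and none of the four variables set. Two untested assumptions follow from the frames above: setting `USER` would let `getpass.getuser()` return early, and setting `TORCHINDUCTOR_CACHE_DIR` would keep `cache_dir()` from calling `default_cache_dir()` at all.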

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
