ENH: Worker env isolation #3362

Merged: 9 commits, May 12, 2025

Conversation

codingl2k1
Contributor

@XprobeBot XprobeBot added the enhancement New feature or request label Apr 30, 2025
@XprobeBot XprobeBot added this to the v1.x milestone Apr 30, 2025
@codingl2k1 codingl2k1 marked this pull request as ready for review May 6, 2025 19:33
@qinxuye
Contributor

qinxuye commented May 7, 2025

When I launch the model qwen2.5-instruct, I encounter an error:

0:28887, pid=32428] [Errno 2] No such file or directory: '/Users/xuyeqin/.xinference/virtualenv/qwen2.5-instruct/bin/python'
Traceback (most recent call last):
  File "/Users/xuyeqin/Workspace/inference/xinference/api/restful_api.py", line 1023, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/context.py", line 262, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/context.py", line 111, in _process_result_message
    raise message.as_instanceof_cause()
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/pool.py", line 689, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/pool.py", line 389, in _run_coro
    return await coro
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 564, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 527, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 532, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/inference/xinference/core/supervisor.py", line 1199, in launch_builtin_model
    await _launch_model()
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/inference/xinference/core/supervisor.py", line 1134, in _launch_model
    subpool_address = await _launch_one_model(
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/inference/xinference/core/supervisor.py", line 1088, in _launch_one_model
    subpool_address = await worker_ref.launch_builtin_model(
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/context.py", line 262, in send
    return self._process_result_message(result)
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/context.py", line 111, in _process_result_message
    raise message.as_instanceof_cause()
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/pool.py", line 689, in send
    result = await self._run_coro(message.message_id, coro)
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/pool.py", line 389, in _run_coro
    return await coro
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 564, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 527, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    ^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 532, in xoscar.core._BaseActor.__on_receive__
    result = await result
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/inference/xinference/core/utils.py", line 93, in wrapped
    ret = await func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/inference/xinference/core/worker.py", line 1011, in launch_builtin_model
    subpool_address, devices = await self._create_subpool(
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/inference/xinference/core/worker.py", line 613, in _create_subpool
    subpool_address = await self._main_pool.append_sub_pool(
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/indigen/pool.py", line 371, in append_sub_pool
    process, external_addresses = await self._create_sub_pool_from_parent(
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/indigen/pool.py", line 281, in _create_sub_pool_from_parent
    process = await create_subprocess_exec(
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/Workspace/xoscar/python/xoscar/backends/indigen/fate_sharing.py", line 206, in create_subprocess_exec
    process: asyncio.subprocess.Process = await asyncio.create_subprocess_exec(
      ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/miniconda3/lib/python3.11/asyncio/subprocess.py", line 223, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
      ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/miniconda3/lib/python3.11/asyncio/base_events.py", line 1708, in subprocess_exec
    transport = await self._make_subprocess_transport(
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/miniconda3/lib/python3.11/asyncio/unix_events.py", line 207, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/miniconda3/lib/python3.11/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/miniconda3/lib/python3.11/asyncio/unix_events.py", line 818, in _start
    self._proc = subprocess.Popen(
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/miniconda3/lib/python3.11/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
    ^^^^^^^^^^^^^^^^^
  File "/Users/xuyeqin/miniconda3/lib/python3.11/subprocess.py", line 1953, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
    ^^^^^^^^^^^^^^^^^
FileNotFoundError: [address=0.0.0.0:28887, pid=32428] [Errno 2] No such file or directory: '/Users/xuyeqin/.xinference/virtualenv/qwen2.5-instruct/bin/python'

This model does not actually configure a virtualenv, so this environment does not exist.
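
One way to guard against this failure mode is sketched below. It is only an illustration, not the change made in this PR: `resolve_python_executable` and its `virtualenv_dir` parameter are hypothetical names, and the `bin/python` layout is assumed from the path in the traceback (POSIX only). The idea is to fall back to the current interpreter whenever no virtualenv is configured or its interpreter is missing, so that the sub pool is never started with a path that does not exist.

```python
import os
import sys
from typing import Optional


def resolve_python_executable(virtualenv_dir: Optional[str]) -> str:
    """Pick the interpreter used to start a sub-process.

    Hypothetical helper (not part of xinference or xoscar): returns the
    virtualenv's interpreter only if it exists and is executable,
    otherwise the interpreter running the current process.
    """
    if virtualenv_dir:
        # Assumed POSIX layout, matching the path in the traceback above.
        candidate = os.path.join(virtualenv_dir, "bin", "python")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # No virtualenv configured, or its interpreter is missing.
    return sys.executable


if __name__ == "__main__":
    # A model without a configured virtualenv falls back to sys.executable.
    print(resolve_python_executable(None))
    print(resolve_python_executable("/path/that/does/not/exist"))
```

With a check like this, a model that declares no virtualenv would be launched with the worker's own interpreter instead of raising `FileNotFoundError` from `subprocess.Popen`.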

Contributor

@qinxuye qinxuye left a comment

LGTM

@qinxuye qinxuye force-pushed the enh/worker_env_isolation branch from 4a88d29 to a728151 Compare May 12, 2025 08:18
@qinxuye qinxuye merged commit edb7bb3 into xorbitsai:main May 12, 2025
9 of 13 checks passed
3 participants