Commit 0ff908d

Release 0.6.0.

This fixes self-re-execution thanks to Open WebUI having merged open-webui/open-webui#5511.
It also works around more permission issues due to procfs mounts.
Docs updated.

Fixes #11
Fixes #12
Updates #2
Updates #3
1 parent 65c2faf commit 0ff908d

4 files changed

+274
-60
lines changed

.vscode/settings.json

Lines changed: 2 additions & 0 deletions
@@ -22,6 +22,7 @@
2222
"maxsplit",
2323
"memfd",
2424
"mountinfo",
25+
"mtab",
2526
"newcgroup",
2627
"NEWNS",
2728
"preexec",
@@ -39,6 +40,7 @@
3940
"subcontainers",
4041
"subfile",
4142
"subfiles",
43+
"submounts",
4244
"syscall",
4345
"UNSTARTED",
4446
"urandom",

docs/setup.md

Lines changed: 14 additions & 5 deletions
@@ -52,6 +52,10 @@ The below is the minimal subset of changes that `--privileged=true` does that is
5252
* On **Docker**: Add `--mount=type=bind,source=/sys/fs/cgroup,target=/sys/fs/cgroup,readonly=false` to `docker run`.
5353
* On **Kubernetes**: Add a [`hostPath` volume](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath) with `path` set to `/sys/fs/cgroup`, then mount it in your container's `volumeMounts` with options `mountPath` set to `/sys/fs/cgroup` and `readOnly` set to `false`.
5454
* **Why**: This is needed so that gVisor can create child [cgroups](https://en.wikipedia.org/wiki/Cgroups), necessary to enforce per-sandbox resource usage limits.
55+
* **Mount `procfs` at `/proc2`**:
56+
* On **Docker**: Add `--mount=type=bind,source=/proc,target=/proc2,readonly=false,bind-recursive=disabled` to `docker run`.
57+
* On **Kubernetes**: Add a [`hostPath` volume](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath) with `path` set to `/proc`, then mount it in your container's `volumeMounts` with options `mountPath` set to `/proc2` and `readOnly` set to `false`.
58+
* **Why**: By default, in non-privileged mode, the container runtime will mask certain sub-paths of `/proc` inside the container by creating submounts of `/proc` (e.g. `/proc/bus`, `/proc/sys`, etc.). gVisor does not really care or use anything under these sub-mounts, but *does* need to be able to mount `procfs` in the chroot environment it isolates itself in. However, its ability to mount `procfs` requires having an existing unobstructed view of `procfs` (i.e. a mount of `procfs` with no submounts). Otherwise, such mount attempts will be denied by the kernel (see the explanation for "locked" mounts on [`mount_namespaces(8)`](https://www.man7.org/linux/man-pages/man7/mount_namespaces.7.html)). Therefore, exposing an unobstructed (non-recursive) view of `/proc` elsewhere in the container filesystem (such as `/proc2`) informs the kernel that it is OK for this container to be able to mount `procfs`.
5559
* Remove the container's default **AppArmor profile**:
5660
* On **Docker**: Add `--security-opt=apparmor=unconfined` to `docker run`.
5761
* On **Kubernetes**: Set [`spec.securityContext.appArmorProfile.type`](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-apparmor-profile-for-a-container) to `Unconfined`.
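The "unobstructed view of `procfs`" requirement described in the `/proc2` bullet above can be checked from inside a container by reading `/proc/self/mountinfo`. The following is a minimal sketch of that idea, not the code from this commit: the function name `unobstructed_procfs_mounts` and the sample mount table are hypothetical, and the parser only assumes the standard `mountinfo` field layout (root in field 4, mount point in field 5, filesystem type right after the `-` separator).

```python
def unobstructed_procfs_mounts(mountinfo_text: str) -> list[str]:
    """Return procfs mount points that have no other mount underneath them."""
    mounts = []  # (root within filesystem, mount point, filesystem type)
    for line in mountinfo_text.splitlines():
        fields = line.split(" ")
        if "-" not in fields:
            continue  # Blank or malformed line.
        separator = fields.index("-")
        if separator < 6 or separator + 1 >= len(fields):
            continue  # Not enough fields to be a valid mountinfo entry.
        mounts.append((fields[3], fields[4], fields[separator + 1]))

    # Full procfs views: filesystem type "proc", mounted from its root.
    procfs_roots = {
        mount_point
        for root, mount_point, fs_type in mounts
        if fs_type == "proc" and root == "/"
    }
    # A view is obstructed if any mount point lives strictly below it.
    obstructed = {
        procfs_root
        for procfs_root in procfs_roots
        for _, mount_point, _ in mounts
        if mount_point.startswith(procfs_root.rstrip("/") + "/")
    }
    return sorted(procfs_roots - obstructed)


# Hypothetical mount table: /proc is masked by submounts, /proc2 is not.
SAMPLE_MOUNTINFO = """\
22 21 0:20 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
23 22 0:20 /bus /proc/bus ro,nosuid,nodev,noexec,relatime - proc proc rw
25 22 0:5 /null /proc/kcore rw,nosuid - devtmpfs devtmpfs rw
24 21 0:20 / /proc2 rw,relatime - proc proc rw
"""
```

To check a real container, pass the contents of `/proc/self/mountinfo` to the function. An empty result suggests that the `/proc2` mount described above is missing.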
@@ -66,20 +70,22 @@ The below is the minimal subset of changes that `--privileged=true` does that is
6670

6771
## Self-test mode
6872

69-
To verify that your setup works, you can run the tool in self-test mode using `run_code.py`'s `--use-sample-code` flag.
73+
To verify that your setup works, you can run the function and the tool in self-test mode using the `--self_test` flag.
7074

7175
For example, here is a Docker invocation running the `run_code.py` script inside the Open WebUI container image with the above flags:
7276

7377
```shell
7478
$ git clone https://github.com/EtiennePerot/open-webui-code-execution && \
79+
cd open-webui-code-execution && \
7580
docker run --rm \
7681
--security-opt=seccomp=unconfined \
7782
--security-opt=apparmor=unconfined \
7883
--security-opt=label=type:container_engine_t \
7984
--mount=type=bind,source=/sys/fs/cgroup,target=/sys/fs/cgroup,readonly=false \
80-
--mount=type=bind,source="$(pwd)/open-webui-code-execution",target=/selftest \
85+
--mount=type=bind,source=/proc,target=/proc2,readonly=false,bind-recursive=disabled \
86+
--mount=type=bind,source="$(pwd)",target=/test \
8187
ghcr.io/open-webui/open-webui:main \
82-
python3 /selftest/open-webui/tools/run_code.py --self_test
88+
sh -c 'python3 /test/open-webui/tools/run_code.py --self_test && python3 /test/open-webui/functions/run_code.py --self_test'
8389
```
8490

8591
If all goes well, you should see:
@@ -97,10 +103,12 @@ If all goes well, you should see:
97103
✔ Self-test long_running_code passed.
98104
⏳ Running self-test: ram_hog
99105
✔ Self-test ram_hog passed.
100-
✅ All self-tests passed, good go to!
106+
✅ All tool self-tests passed, good go to!
107+
...
108+
✅ All function self-tests passed, good go to!
101109
```
102110

103-
If you get an error, try to add the `--debug` flag at the very end of this command (i.e. as a `run_code.py` flag) for extra information, then file a bug.
111+
If you get an error, try to add the `--debug` to each `run_code.py` invocation for extra information, then file a bug.
104112

105113
## Set valves
106114

@@ -114,6 +122,7 @@ The code execution tool and function have the following valves available:
114122
* Useful for multi-user setups to avoid denial-of-service.
115123
* **Auto Install**: Whether to automatically download and install gVisor if not present in the container.
116124
* If not installed, gVisor will be automatically installed in `/tmp`.
125+
* You can set the HTTPS proxy used for this download using the `HTTPS_PROXY` environment variable.
117126
* Useful for convenience, but should be disabled for production setups.
118127
* **Debug**: Whether to produce debug logs.
119128
* This should never be enabled in production setups as it produces a lot of information that isn't necessary for regular use.
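Before relying on **Auto Install** behind a proxy, it can help to confirm which proxy a Python process actually picks up from `HTTPS_PROXY`. The sketch below uses the standard library's `urllib.request.getproxies()`. Whether the tool's downloader goes through `urllib` is an assumption (the diff does not show the download code), and `https_proxy_from_environment` and the proxy URL are hypothetical.

```python
import os
import urllib.request


def https_proxy_from_environment():
    """Return the proxy URL urllib would use for https:// URLs, or None."""
    return urllib.request.getproxies().get("https")


if __name__ == "__main__":
    # Placeholder proxy address, for illustration only.
    os.environ["HTTPS_PROXY"] = "http://proxy.example:3128"
    print(https_proxy_from_environment())
```

Note that `urllib` lets a lowercase `https_proxy` variable take precedence over the uppercase one when both are set.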

open-webui/functions/run_code.py

Lines changed: 130 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@
55
author: EtiennePerot
66
author_url: https://github.com/EtiennePerot/open-webui-code-execution
77
funding_url: https://github.com/EtiennePerot/open-webui-code-execution
8-
version: 0.5.0
8+
version: 0.6.0
99
license: Apache-2.0
1010
"""
1111

@@ -35,6 +35,8 @@
3535
import asyncio
3636
import argparse
3737
import base64
38+
import ctypes
39+
import ctypes.util
3840
import contextlib
3941
import copy
4042
import fcntl
@@ -78,7 +80,7 @@ class Valves(pydantic.BaseModel):
7880
)
7981
AUTO_INSTALL: bool = pydantic.Field(
8082
default=True,
81-
description=f"Whether to automatically install gVisor if not installed on the system; may be overridden by environment variable {_VALVE_OVERRIDE_ENVIRONMENT_VARIABLE_NAME_PREFIX}AUTO_INSTALL.",
83+
description=f"Whether to automatically install gVisor if not installed on the system; may be overridden by environment variable {_VALVE_OVERRIDE_ENVIRONMENT_VARIABLE_NAME_PREFIX}AUTO_INSTALL. Use the 'HTTPS_PROXY' environment variable to control the proxy used for download.",
8284
)
8385
DEBUG: bool = pydantic.Field(
8486
default=False,
@@ -105,7 +107,7 @@ class Valves(pydantic.BaseModel):
105107
)
106108
WEB_ACCESSIBLE_DIRECTORY_URL: str = pydantic.Field(
107109
default="/cache/functions/run_code",
108-
description=f"URL corresponding to WEB_ACCESSIBLE_DIRECTORY_PATH. May start with '/' to make it relative to the Open WebUI serving domain. may be overridden by environment variable {_VALVE_OVERRIDE_ENVIRONMENT_VARIABLE_NAME_PREFIX}WEB_ACCESSIBLE_DIRECTORY_URL.",
110+
description=f"URL corresponding to WEB_ACCESSIBLE_DIRECTORY_PATH. May start with '/' to make it relative to the Open WebUI serving domain. May be overridden by environment variable {_VALVE_OVERRIDE_ENVIRONMENT_VARIABLE_NAME_PREFIX}WEB_ACCESSIBLE_DIRECTORY_URL.",
109111
)
110112

111113
def __init__(self, valves):
@@ -161,7 +163,6 @@ async def action(
161163

162164
async def _fail(error_message, status="SANDBOX_ERROR"):
163165
await emitter.fail(error_message)
164-
await emitter.code_execution_result(f"{status}: {error_message}")
165166
return json.dumps({"status": status, "output": error_message})
166167

167168
await emitter.status("Checking messages for code blocks...")
@@ -327,7 +328,6 @@ def _log(filename: str, log_line: str):
327328
print(f"[{filename}] {log_line}", file=sys.stderr)
328329

329330
sandbox.debug_logs(_log)
330-
await emitter.code_execution_result(output)
331331
if status == "OK":
332332
generated_files_output = ""
333333
if len(generated_files) > 0:
@@ -443,14 +443,6 @@ async def status(
443443
async def fail(self, description="Unknown error"):
444444
await self.status(description=description, status="error", done=True)
445445

446-
async def code_execution_result(self, output):
447-
await self._emit(
448-
"code_execution_result",
449-
{
450-
"output": output,
451-
},
452-
)
453-
454446
async def message(self, content):
455447
await self._emit(
456448
"message",
@@ -1097,12 +1089,57 @@ class Sandbox:
10971089
("id",),
10981090
("uname", "-a"),
10991091
("ls", "-l", "/proc/self/ns"),
1092+
("findmnt",),
11001093
(sys.executable, "--version"),
11011094
)
11021095

11031096
# Environment variable used to detect interpreter re-execution.
11041097
_MARKER_ENVIRONMENT_VARIABLE = "__CODE_EXECUTION_STAGE"
11051098

1099+
# Copy of this file's own contents, for re-execution.
1100+
# Must be populated at import time using `main`.
1101+
_SELF_FILE = None
1102+
1103+
# libc bindings.
1104+
# Populated using `_libc`.
1105+
_LIBC = None
1106+
1107+
class _Libc:
1108+
"""
1109+
Wrapper over libc functions.
1110+
"""
1111+
1112+
def __init__(self):
1113+
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
1114+
libc.mount.argtypes = (ctypes.c_char_p,)
1115+
self._libc = libc
1116+
1117+
def mount(self, source, target, fs, options):
1118+
if (
1119+
self._libc.mount(
1120+
source.encode("ascii"),
1121+
target.encode("ascii"),
1122+
fs.encode("ascii"),
1123+
0,
1124+
options.encode("ascii"),
1125+
)
1126+
< 0
1127+
):
1128+
errno = ctypes.get_errno()
1129+
raise OSError(
1130+
errno,
1131+
f"mount({source}, {target}, {fs}, {options}): {os.strerror(errno)}",
1132+
)
1133+
1134+
def umount(self, path):
1135+
if self._libc.umount(path.encode("ascii")) < 0:
1136+
errno = ctypes.get_errno()
1137+
raise OSError(errno, f"umount({path}): {os.strerror(errno)}")
1138+
1139+
def unshare(self, flags):
1140+
if self._libc.unshare(flags) < 0:
1141+
raise OSError(f"unshare({flags}) failed")
1142+
11061143
class _Switcheroo:
11071144
"""
11081145
Management of the switcheroo procedure for running in a usable cgroup namespace and node.
@@ -1115,7 +1152,8 @@ class _Switcheroo:
11151152
_CGROUP_SUPERVISOR_NAME = "supervisor"
11161153
_CGROUP_LEAF = "leaf"
11171154

1118-
def __init__(self, log_path, max_sandbox_ram_bytes):
1155+
def __init__(self, libc, log_path, max_sandbox_ram_bytes):
1156+
self._libc = libc
11191157
self._log_path = log_path
11201158
self._max_sandbox_ram_bytes = max_sandbox_ram_bytes
11211159
self._my_euid = None
@@ -1457,7 +1495,9 @@ def _find_self_in_cgroup_hierarchy(self):
14571495
for dirpath, _, subfiles in os.walk(
14581496
self._CGROUP_ROOT, onerror=None, followlinks=False
14591497
):
1460-
if not dirpath.startswith(cgroup_root_slash):
1498+
if dirpath != self._CGROUP_ROOT and not dirpath.startswith(
1499+
cgroup_root_slash
1500+
):
14611501
continue
14621502
if "cgroup.procs" not in subfiles:
14631503
continue
@@ -1983,8 +2023,56 @@ def cgroups_available(cls) -> bool:
19832023
else:
19842024
return True
19852025

1986-
@staticmethod
1987-
def unshare(flags):
2026+
@classmethod
2027+
def check_procfs(cls):
2028+
"""
2029+
Verifies that we have an unobstructed view of procfs.
2030+
2031+
:return: Nothing.
2032+
:raises EnvironmentNeedsSetupException: If procfs is obstructed.
2033+
"""
2034+
mount_infos = []
2035+
with open("/proc/self/mountinfo", "rb") as mountinfo_f:
2036+
for line in mountinfo_f:
2037+
line = line.decode("utf-8").strip()
2038+
if not line:
2039+
continue
2040+
mount_components = line.split(" ")
2041+
if len(mount_components) != 10:
2042+
continue
2043+
hyphen_index = mount_components.index("-")
2044+
if hyphen_index < 6:
2045+
continue
2046+
mount_info = {
2047+
"mount_path": mount_components[4],
2048+
"path_within_mount": mount_components[3],
2049+
"fs_type": mount_components[hyphen_index + 1],
2050+
}
2051+
mount_infos.append(mount_info)
2052+
procfs_mounts = frozenset(
2053+
m["mount_path"]
2054+
for m in mount_infos
2055+
if m["fs_type"] == "proc" and m["path_within_mount"] == "/"
2056+
)
2057+
if len(procfs_mounts) == 0:
2058+
raise cls.EnvironmentNeedsSetupException(
2059+
"procfs is not mounted; please mount it"
2060+
)
2061+
obstructed_procfs_mounts = set()
2062+
for mount_info in mount_infos:
2063+
for procfs_mount in procfs_mounts:
2064+
if mount_info["mount_path"].startswith(procfs_mount + os.sep):
2065+
obstructed_procfs_mounts.add(procfs_mount)
2066+
for procfs_mount in procfs_mounts:
2067+
if procfs_mount not in obstructed_procfs_mounts:
2068+
return # We have at least one unobstructed procfs view.
2069+
assert len(obstructed_procfs_mounts) > 0, "Logic error"
2070+
raise cls.EnvironmentNeedsSetupException(
2071+
"procfs is obstructed; please mount a new procfs mount somewhere in the container, e.g. /proc2 (`--mount=type=bind,source=/proc,target=/proc2,readonly=false`)"
2072+
)
2073+
2074+
@classmethod
2075+
def unshare(cls, flags):
19882076
"""
19892077
Implementation of `os.unshare` that works on Python < 3.12.
19902078
@@ -1995,13 +2083,7 @@ def unshare(flags):
19952083
return os.unshare(flags)
19962084

19972085
# Python <= 3.11:
1998-
import ctypes
1999-
2000-
libc = ctypes.CDLL(None)
2001-
libc.unshare.argtypes = [ctypes.c_int]
2002-
rc = libc.unshare(flags)
2003-
if rc == -1:
2004-
raise OSError(f"unshare({flags}) failed")
2086+
return cls._libc().unshare(flags)
20052087

20062088
@classmethod
20072089
def check_unshare(cls):
@@ -2109,18 +2191,29 @@ def check_setup(cls, language: str, auto_install_allowed: bool):
21092191
cls.check_platform()
21102192
cls.check_unshare()
21112193
cls.check_cgroups()
2194+
cls.check_procfs()
21122195
if not auto_install_allowed and cls.get_runsc_path() is None:
21132196
raise cls.GVisorNotInstalledException(
21142197
"gVisor is not installed (runsc binary not found in $PATH); please install it or enable AUTO_INSTALL valve for auto installation"
21152198
)
21162199

21172200
@classmethod
2118-
def maybe_main(cls):
2201+
def _libc(cls):
2202+
if cls._LIBC is None:
2203+
cls._LIBC = cls._Libc()
2204+
return cls._LIBC
2205+
2206+
@classmethod
2207+
def main(cls):
21192208
"""
2120-
Entry-point for re-execution.
2209+
Entry-point for (re-)execution.
2210+
Populates `cls._SELF_FILE`, so must be called during import.
21212211
May call `sys.exit` if this is intended to be a code evaluation re-execution.
21222212
"""
2123-
if os.environ.get(cls._MARKER_ENVIRONMENT_VARIABLE) is None:
2213+
if cls._SELF_FILE is None:
2214+
with open(__file__, "r") as self_f:
2215+
cls._SELF_FILE = self_f.read()
2216+
if cls._MARKER_ENVIRONMENT_VARIABLE not in os.environ:
21242217
return
21252218
directives = json.load(sys.stdin)
21262219
try:
@@ -2214,6 +2307,7 @@ def _init(self, settings):
22142307
self._persistent_home_dir = self._settings["persistent_home_dir"]
22152308
self._sandboxed_command = None
22162309
self._switcheroo = self._Switcheroo(
2310+
libc=self._libc(),
22172311
log_path=os.path.join(self._logs_path, "switcheroo.txt"),
22182312
max_sandbox_ram_bytes=self._max_ram_bytes,
22192313
)
@@ -2242,6 +2336,7 @@ def _setup_sandbox(self):
22422336
raise self.SandboxException(
22432337
f"Persistent home directory {self._persistent_home_dir} does not exist"
22442338
)
2339+
oci_config["root"]["path"] = rootfs_path
22452340

22462341
try:
22472342
self._switcheroo.do()
@@ -2253,8 +2348,6 @@ def _setup_sandbox(self):
22532348
else:
22542349
raise e.__class__(f"{e}; {switcheroo_status}")
22552350

2256-
oci_config["root"]["path"] = rootfs_path
2257-
22582351
# Locate the interpreter to use.
22592352
interpreter_path = sys.executable
22602353
if self._language == self.LANGUAGE_BASH:
@@ -2512,12 +2605,15 @@ def run(self) -> subprocess.CompletedProcess:
25122605
:raises Sandbox.InterruptedExecutionError: If the code interpreter died without providing a return code; usually due to running over resource limits.
25132606
:raises sandbox.CodeExecutionError: If the code interpreter failed to execute the given code. This does not represent a sandbox failure.
25142607
"""
2608+
reexec_path = os.path.join(self._tmp_dir, "self.py")
2609+
with open(reexec_path, "w") as reexec_f:
2610+
reexec_f.write(self._SELF_FILE)
25152611
new_env = os.environ.copy()
25162612
new_env[self._MARKER_ENVIRONMENT_VARIABLE] = "1"
25172613
data = json.dumps({"settings": self._settings})
25182614
try:
25192615
result = subprocess.run(
2520-
(sys.executable, os.path.abspath(__file__)),
2616+
(sys.executable, reexec_path),
25212617
env=new_env,
25222618
input=data,
25232619
text=True,
@@ -2862,6 +2958,7 @@ def _verify():
28622958
"code": (f"head -c{64 * 1024 * 1024} /dev/urandom > random_data.bin",),
28632959
"valves": {
28642960
"MAX_MEGABYTES_PER_USER": 32,
2961+
"MAX_RAM_MEGABYTES": 2048,
28652962
},
28662963
"status": "STORAGE_ERROR",
28672964
"post": _want_user_storage_num_files(16),
@@ -3052,17 +3149,17 @@ def _print_output(obj):
30523149
else:
30533150
print(f"✔️ Self-test {name} passed.", file=sys.stderr)
30543151
if success:
3055-
print("✅ All self-tests passed, good go to!", file=sys.stderr)
3152+
print("✅ All function self-tests passed, good go to!", file=sys.stderr)
30563153
sys.exit(0)
30573154
else:
3058-
print("☠️ One or more self-tests failed.", file=sys.stderr)
3155+
print("☠️ One or more function self-tests failed.", file=sys.stderr)
30593156
sys.exit(1)
30603157
assert False, "Unreachable"
30613158

30623159

3160+
Sandbox.main()
30633161
# Debug utility: Run code from stdin if running as a normal Python script.
30643162
if __name__ == "__main__":
3065-
Sandbox.maybe_main()
30663163
parser = argparse.ArgumentParser(
30673164
description="Run arbitrary code in a gVisor sandbox."
30683165
)
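
The re-execution change in this commit keeps a copy of the file's own source (`_SELF_FILE`), writes it to `self.py` in a temporary directory, and runs that copy with a marker environment variable set and the settings passed as JSON on stdin. The following is a simplified sketch of that pattern, not the commit's code: the marker name `__DEMO_REEXEC_STAGE`, the function `reexecute`, and the tiny child program are all hypothetical stand-ins.

```python
import json
import os
import subprocess
import sys
import tempfile

# Hypothetical marker; the real code uses its own environment variable name.
_MARKER = "__DEMO_REEXEC_STAGE"


def reexecute(source: str, payload: dict) -> dict:
    """Write `source` to a temporary file and run it as a marked child process."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        reexec_path = os.path.join(tmp_dir, "self.py")
        with open(reexec_path, "w") as reexec_f:
            reexec_f.write(source)
        new_env = os.environ.copy()
        new_env[_MARKER] = "1"  # Tells the child that it is a re-execution.
        result = subprocess.run(
            (sys.executable, reexec_path),
            env=new_env,
            input=json.dumps(payload),  # Directives travel over stdin as JSON.
            text=True,
            capture_output=True,
            check=True,
        )
    return json.loads(result.stdout)


# Stand-in for the module's own source: it only acts when the marker is set.
CHILD_SOURCE = """
import json, os, sys
if "__DEMO_REEXEC_STAGE" in os.environ:
    directives = json.load(sys.stdin)
    json.dump({"reexecuted": True, "settings": directives["settings"]}, sys.stdout)
"""
```

Calling `reexecute(CHILD_SOURCE, {"settings": {"language": "python"}})` returns the child's JSON reply. Writing the source to a fresh file means the child does not depend on `__file__` pointing at a path it can execute.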
