Skip to content

Commit b773862

Browse files
authored
Merge branch 'main' into main
2 parents a6e2f6a + 58c4e47 commit b773862

File tree

4 files changed

+319
-17
lines changed

4 files changed

+319
-17
lines changed

README.md

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -37,9 +37,11 @@ The system adopts a two-layer structure:
3737

3838
## Features
3939

40-
* Hierarchical agent collaboration for complex and dynamic task scenarios
41-
* Extensible agent system, allowing easy integration of additional specialized agents
42-
* Automated information analysis, research, and web interaction capabilities
40+
- Hierarchical agent collaboration for complex and dynamic task scenarios
41+
- Extensible agent system, allowing easy integration of additional specialized agents
42+
- Automated information analysis, research, and web interaction capabilities
43+
- Secure Python code execution environment for tools, featuring configurable import controls, restricted built-ins, attribute access limitations, and resource limits. (See [PythonInterpreterTool Sandboxing](./docs/python_interpreter_sandbox.md) for details).
44+
4345

4446
## Updates
4547

docs/python_interpreter_sandbox.md

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
# PythonInterpreterTool Sandboxing
2+
3+
The `PythonInterpreterTool` allows the agent to execute Python code in a controlled environment. To ensure safety and prevent malicious or unintended actions, the tool employs several sandboxing mechanisms.
4+
5+
## Core Sandboxing Principles
6+
7+
1. **Custom AST Evaluator:** Instead of using Python's `eval()` or `exec()` directly on user code, the tool parses the code into an Abstract Syntax Tree (AST) and then walks through this tree, evaluating nodes one by one in a controlled manner. This allows fine-grained interception and control over what operations are permitted.
8+
9+
2. **Import Control:**
10+
* **Allowlist:** The tool uses an allowlist (`authorized_imports`) to specify which Python modules can be imported. Attempts to import modules not on this list will be blocked.
11+
* **Granular Control:** The allowlist can be configured to allow specific submodules (e.g., `os.path`) without allowing the entire parent module (e.g., `os`).
12+
* **Safe Module Copying:** When a module is imported, a "safe copy" is created. This process inspects the module and its submodules, pruning any parts that are not explicitly authorized, to prevent indirect importing of disallowed code.
13+
* **Configuration:** The `authorized_imports` list is configurable when an instance of `PythonInterpreterTool` is created. The default list includes common safe modules like `math`.
14+
15+
3. **Restricted Built-ins and Functions:**
16+
* Only a curated list of Python built-in functions (`BASE_PYTHON_TOOLS`) are available by default (e.g., `len()`, `str()`, `range()`, math functions).
17+
* Known dangerous functions (e.g., `eval`, `exec`, `open`, `os.system`, `subprocess.call`) are explicitly blacklisted (`DANGEROUS_FUNCTIONS`) and cannot be called even if they are part of an allowed module (defense in depth via the `safer_eval` mechanism).
18+
19+
4. **Attribute Access Control:**
20+
* Direct access to "dunder" attributes (e.g., `object.__dict__`, `function.__globals__`, `object.__subclasses__`) is blocked. This helps prevent introspection and manipulation of internal states that could lead to sandbox escapes.
21+
22+
5. **Resource Limits:**
23+
* The interpreter imposes limits on the number of operations (`MAX_OPERATIONS`) and loop iterations (`MAX_WHILE_ITERATIONS`) to prevent denial-of-service attacks through infinite loops or overly complex computations.
24+
25+
6. **Unsupported Operations:**
26+
* Python features that are complex to sandbox or pose security risks (e.g., `global`, `nonlocal` keywords, direct memory manipulation via `ctypes` unless explicitly allowed) are generally not supported by the custom AST evaluator and will result in errors.
27+
28+
## Security Considerations for Developers
29+
30+
* **Review `authorized_imports`:** When using the `PythonInterpreterTool`, carefully consider which modules are truly necessary for the intended tasks and restrict the `authorized_imports` list accordingly.
31+
* **Least Privilege:** Grant only the minimum necessary permissions. If a task only needs `math.sqrt`, consider if authorizing the entire `math` module is acceptable or if more granular control is needed (though typically, standard library modules like `math` are safe if the functions themselves are not dangerous).
32+
* **Tool Output:** Be mindful that the output of the executed code (both return values and print statements) could potentially contain sensitive information if the code handles such data.
33+
34+
By combining these mechanisms, the `PythonInterpreterTool` aims to provide a reasonably safe environment for executing Python code generated by LLMs or other sources, while still offering significant computational capabilities.

src/tools/executor/local_python_executor.py

Lines changed: 61 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -129,6 +129,27 @@ def nodunder_getattr(obj, name, default=None):
129129
"socket",
130130
"subprocess",
131131
"sys",
132+
"ctypes",
133+
"fcntl",
134+
"grp",
135+
"pwd",
136+
# "resource", # Potentially too restrictive, can be added if specific issues arise
137+
"signal",
138+
# "syslog", # Application-specific, might be needed
139+
"termios",
140+
"tty",
141+
"select",
142+
"gc", # Added (can be used to inspect arbitrary objects)
143+
"_thread", # Added (low-level threading)
144+
"asyncio", # Added (can be used for I/O, networking, subprocesses)
145+
"marshal", # Added (can be used to create code objects)
146+
"msvcrt", # Added (Windows specific low-level routines)
147+
"pickle", # Added (can execute arbitrary code)
148+
"pipes", # Added (shell command pipelines)
149+
"posix", # Added (alias for os functions)
150+
"threading", # Added (while less dangerous than _thread, still needs caution)
151+
"wsgiref", # Added (can start web servers)
152+
"xmlrpc", # Added (can make network requests)
132153
]
133154

134155
DANGEROUS_FUNCTIONS = [
@@ -138,9 +159,29 @@ def nodunder_getattr(obj, name, default=None):
138159
"builtins.globals",
139160
"builtins.locals",
140161
"builtins.__import__",
162+
"builtins.open",
163+
"builtins.getattr",
164+
"builtins.setattr",
165+
"builtins.delattr",
166+
"builtins.vars",
141167
"os.popen",
142168
"os.system",
169+
"os.execl", "os.execle", "os.execlp", "os.execlpe", "os.execv", "os.execve", "os.execvp", "os.execvpe",
170+
"os.fork", "os.forkpty",
171+
"os.kill", "os.killpg",
172+
"os.plock",
173+
"os.putenv", "os.unsetenv",
174+
"os.spawnl", "os.spawnle", "os.spawnlp", "os.spawnlpe", "os.spawnv", "os.spawnve", "os.spawnvp", "os.spawnvpe",
143175
"posix.system",
176+
"subprocess.call", "subprocess.check_call", "subprocess.check_output", "subprocess.Popen", "subprocess.run",
177+
"sys.exit", "sys.gettrace", "sys.settrace", "sys.meta_path", "sys.path_hooks", "sys.path_importer_cache",
178+
"shutil.copy", "shutil.copy2", "shutil.copyfile", "shutil.copyfileobj", "shutil.copymode", "shutil.copystat", "shutil.copytree",
179+
"shutil.move", "shutil.rmtree",
180+
"socket.socket",
181+
"pickle.load", "pickle.loads",
182+
"ctypes.CDLL", "ctypes.PyDLL", "ctypes.WinDLL",
183+
"gc.get_objects", "gc.get_referrers", "gc.get_referents",
184+
# "object.__subclasses__", # Relies on nodunder_getattr
144185
]
145186

146187

@@ -233,14 +274,17 @@ def build_import_tree(authorized_imports: List[str]) -> Dict[str, Any]:
233274

234275

235276
def check_import_authorized(import_to_check: str, authorized_imports: list[str]) -> bool:
236-
current_node = build_import_tree(authorized_imports)
237-
for part in import_to_check.split("."):
277+
tree = build_import_tree(authorized_imports)
278+
current_node = tree
279+
parts = import_to_check.split(".")
280+
for i, part in enumerate(parts):
238281
if "*" in current_node:
239282
return True
240283
if part not in current_node:
241284
return False
242285
current_node = current_node[part]
243-
return True
286+
287+
return not current_node or "*" in current_node
244288

245289

246290
def safer_eval(func: Callable):
@@ -1097,39 +1141,42 @@ def evaluate_with(
10971141

10981142

10991143
def get_safe_module(raw_module, authorized_imports, visited=None):
1100-
"""Creates a safe copy of a module or returns the original if it's a function"""
1101-
# If it's a function or non-module object, return it directly
11021144
if not isinstance(raw_module, ModuleType):
11031145
return raw_module
11041146

1105-
# Handle circular references: Initialize visited set for the first call
11061147
if visited is None:
11071148
visited = set()
11081149

11091150
module_id = id(raw_module)
11101151
if module_id in visited:
1111-
return raw_module # Return original for circular refs
1152+
return raw_module
11121153

11131154
visited.add(module_id)
11141155

1115-
# Create new module for actual modules
1156+
# Check authorization for the module itself before proceeding
1157+
if not check_import_authorized(raw_module.__name__, authorized_imports):
1158+
raise InterpreterError(f"Import of module {raw_module.__name__} is not allowed.")
1159+
11161160
safe_module = ModuleType(raw_module.__name__)
11171161

1118-
# Copy all attributes by reference, recursively checking modules
11191162
for attr_name in dir(raw_module):
11201163
try:
11211164
attr_value = getattr(raw_module, attr_name)
11221165
except (ImportError, AttributeError) as e:
1123-
# lazy / dynamic loading module -> INFO log and skip
11241166
logger.info(
11251167
f"Skipping import error while copying {raw_module.__name__}.{attr_name}: {type(e).__name__} - {e}"
11261168
)
11271169
continue
1128-
# Recursively process nested modules, passing visited set
1129-
if isinstance(attr_value, ModuleType):
1130-
attr_value = get_safe_module(attr_value, authorized_imports, visited=visited)
11311170

1132-
setattr(safe_module, attr_name, attr_value)
1171+
if isinstance(attr_value, ModuleType):
1172+
submodule_full_name = f"{raw_module.__name__}.{attr_name}"
1173+
# Only add authorized submodules
1174+
if check_import_authorized(submodule_full_name, authorized_imports):
1175+
processed_attr_value = get_safe_module(attr_value, authorized_imports, visited=visited)
1176+
setattr(safe_module, attr_name, processed_attr_value)
1177+
# Else: unauthorized submodule, so we don't add it to safe_module
1178+
else:
1179+
setattr(safe_module, attr_name, attr_value)
11331180

11341181
return safe_module
11351182

0 commit comments

Comments
 (0)