Skip to content

Commit 58c4e47

Browse files
authored
Merge pull request #24 from imnotfancy/feat/strengthen-python-sandbox
feat: Strengthen PythonInterpreterTool sandboxing
2 parents 6eab0bb + f3d41e8 commit 58c4e47

File tree

4 files changed

+315
-14
lines changed

4 files changed

+315
-14
lines changed

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@ The system adopts a two-layer structure:
3232
- Hierarchical agent collaboration for complex and dynamic task scenarios
3333
- Extensible agent system, allowing easy integration of additional specialized agents
3434
- Automated information analysis, research, and web interaction capabilities
35+
- Secure Python code execution environment for tools, featuring configurable import controls, restricted built-ins, attribute access limitations, and resource limits. (See [PythonInterpreterTool Sandboxing](./docs/python_interpreter_sandbox.md) for details).
3536

3637

3738
## Updates

docs/python_interpreter_sandbox.md

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
# PythonInterpreterTool Sandboxing
2+
3+
The `PythonInterpreterTool` allows the agent to execute Python code in a controlled environment. To ensure safety and prevent malicious or unintended actions, the tool employs several sandboxing mechanisms.
4+
5+
## Core Sandboxing Principles
6+
7+
1. **Custom AST Evaluator:** Instead of using Python's `eval()` or `exec()` directly on user code, the tool parses the code into an Abstract Syntax Tree (AST) and then walks through this tree, evaluating nodes one by one in a controlled manner. This allows fine-grained interception and control over what operations are permitted.
8+
9+
2. **Import Control:**
10+
* **Allowlist:** The tool uses an allowlist (`authorized_imports`) to specify which Python modules can be imported. Attempts to import modules not on this list will be blocked.
11+
* **Granular Control:** The allowlist can be configured to allow specific submodules (e.g., `os.path`) without allowing the entire parent module (e.g., `os`).
12+
* **Safe Module Copying:** When a module is imported, a "safe copy" is created. This process inspects the module and its submodules, pruning any parts that are not explicitly authorized, to prevent indirect importing of disallowed code.
13+
* **Configuration:** The `authorized_imports` list is configurable when an instance of `PythonInterpreterTool` is created. The default list includes common safe modules like `math`.
14+
15+
3. **Restricted Built-ins and Functions:**
16+
* Only a curated list of Python built-in functions (`BASE_PYTHON_TOOLS`) are available by default (e.g., `len()`, `str()`, `range()`, math functions).
17+
* Known dangerous functions (e.g., `eval`, `exec`, `open`, `os.system`, `subprocess.call`) are explicitly blacklisted (`DANGEROUS_FUNCTIONS`) and cannot be called even if they are part of an allowed module (defense in depth via the `safer_eval` mechanism).
18+
19+
4. **Attribute Access Control:**
20+
* Direct access to "dunder" attributes (e.g., `object.__dict__`, `function.__globals__`, `object.__subclasses__`) is blocked. This helps prevent introspection and manipulation of internal states that could lead to sandbox escapes.
21+
22+
5. **Resource Limits:**
23+
* The interpreter imposes limits on the number of operations (`MAX_OPERATIONS`) and loop iterations (`MAX_WHILE_ITERATIONS`) to prevent denial-of-service attacks through infinite loops or overly complex computations.
24+
25+
6. **Unsupported Operations:**
26+
* Python features that are complex to sandbox or pose security risks (e.g., `global`, `nonlocal` keywords, direct memory manipulation via `ctypes` unless explicitly allowed) are generally not supported by the custom AST evaluator and will result in errors.
27+
28+
## Security Considerations for Developers
29+
30+
* **Review `authorized_imports`:** When using the `PythonInterpreterTool`, carefully consider which modules are truly necessary for the intended tasks and restrict the `authorized_imports` list accordingly.
31+
* **Least Privilege:** Grant only the minimum necessary permissions. If a task only needs `math.sqrt`, consider if authorizing the entire `math` module is acceptable or if more granular control is needed (though typically, standard library modules like `math` are safe if the functions themselves are not dangerous).
32+
* **Tool Output:** Be mindful that the output of the executed code (both return values and print statements) could potentially contain sensitive information if the code handles such data.
33+
34+
By combining these mechanisms, the `PythonInterpreterTool` aims to provide a reasonably safe environment for executing Python code generated by LLMs or other sources, while still offering significant computational capabilities.

src/tools/executor/local_python_executor.py

Lines changed: 61 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -129,6 +129,27 @@ def nodunder_getattr(obj, name, default=None):
129129
"socket",
130130
"subprocess",
131131
"sys",
132+
"ctypes",
133+
"fcntl",
134+
"grp",
135+
"pwd",
136+
# "resource", # Potentially too restrictive, can be added if specific issues arise
137+
"signal",
138+
# "syslog", # Application-specific, might be needed
139+
"termios",
140+
"tty",
141+
"select",
142+
"gc", # Added (can be used to inspect arbitrary objects)
143+
"_thread", # Added (low-level threading)
144+
"asyncio", # Added (can be used for I/O, networking, subprocesses)
145+
"marshal", # Added (can be used to create code objects)
146+
"msvcrt", # Added (Windows specific low-level routines)
147+
"pickle", # Added (can execute arbitrary code)
148+
"pipes", # Added (shell command pipelines)
149+
"posix", # Added (alias for os functions)
150+
"threading", # Added (while less dangerous than _thread, still needs caution)
151+
"wsgiref", # Added (can start web servers)
152+
"xmlrpc", # Added (can make network requests)
132153
]
133154

134155
DANGEROUS_FUNCTIONS = [
@@ -138,9 +159,29 @@ def nodunder_getattr(obj, name, default=None):
138159
"builtins.globals",
139160
"builtins.locals",
140161
"builtins.__import__",
162+
"builtins.open",
163+
"builtins.getattr",
164+
"builtins.setattr",
165+
"builtins.delattr",
166+
"builtins.vars",
141167
"os.popen",
142168
"os.system",
169+
"os.execl", "os.execle", "os.execlp", "os.execlpe", "os.execv", "os.execve", "os.execvp", "os.execvpe",
170+
"os.fork", "os.forkpty",
171+
"os.kill", "os.killpg",
172+
"os.plock",
173+
"os.putenv", "os.unsetenv",
174+
"os.spawnl", "os.spawnle", "os.spawnlp", "os.spawnlpe", "os.spawnv", "os.spawnve", "os.spawnvp", "os.spawnvpe",
143175
"posix.system",
176+
"subprocess.call", "subprocess.check_call", "subprocess.check_output", "subprocess.Popen", "subprocess.run",
177+
"sys.exit", "sys.gettrace", "sys.settrace", "sys.meta_path", "sys.path_hooks", "sys.path_importer_cache",
178+
"shutil.copy", "shutil.copy2", "shutil.copyfile", "shutil.copyfileobj", "shutil.copymode", "shutil.copystat", "shutil.copytree",
179+
"shutil.move", "shutil.rmtree",
180+
"socket.socket",
181+
"pickle.load", "pickle.loads",
182+
"ctypes.CDLL", "ctypes.PyDLL", "ctypes.WinDLL",
183+
"gc.get_objects", "gc.get_referrers", "gc.get_referents",
184+
# "object.__subclasses__", # Relies on nodunder_getattr
144185
]
145186

146187

@@ -233,14 +274,17 @@ def build_import_tree(authorized_imports: List[str]) -> Dict[str, Any]:
233274

234275

235276
def check_import_authorized(import_to_check: str, authorized_imports: list[str]) -> bool:
236-
current_node = build_import_tree(authorized_imports)
237-
for part in import_to_check.split("."):
277+
tree = build_import_tree(authorized_imports)
278+
current_node = tree
279+
parts = import_to_check.split(".")
280+
for i, part in enumerate(parts):
238281
if "*" in current_node:
239282
return True
240283
if part not in current_node:
241284
return False
242285
current_node = current_node[part]
243-
return True
286+
287+
return not current_node or "*" in current_node
244288

245289

246290
def safer_eval(func: Callable):
@@ -1097,39 +1141,42 @@ def evaluate_with(
10971141

10981142

10991143
def get_safe_module(raw_module, authorized_imports, visited=None):
1100-
"""Creates a safe copy of a module or returns the original if it's a function"""
1101-
# If it's a function or non-module object, return it directly
11021144
if not isinstance(raw_module, ModuleType):
11031145
return raw_module
11041146

1105-
# Handle circular references: Initialize visited set for the first call
11061147
if visited is None:
11071148
visited = set()
11081149

11091150
module_id = id(raw_module)
11101151
if module_id in visited:
1111-
return raw_module # Return original for circular refs
1152+
return raw_module
11121153

11131154
visited.add(module_id)
11141155

1115-
# Create new module for actual modules
1156+
# Check authorization for the module itself before proceeding
1157+
if not check_import_authorized(raw_module.__name__, authorized_imports):
1158+
raise InterpreterError(f"Import of module {raw_module.__name__} is not allowed.")
1159+
11161160
safe_module = ModuleType(raw_module.__name__)
11171161

1118-
# Copy all attributes by reference, recursively checking modules
11191162
for attr_name in dir(raw_module):
11201163
try:
11211164
attr_value = getattr(raw_module, attr_name)
11221165
except (ImportError, AttributeError) as e:
1123-
# lazy / dynamic loading module -> INFO log and skip
11241166
logger.info(
11251167
f"Skipping import error while copying {raw_module.__name__}.{attr_name}: {type(e).__name__} - {e}"
11261168
)
11271169
continue
1128-
# Recursively process nested modules, passing visited set
1129-
if isinstance(attr_value, ModuleType):
1130-
attr_value = get_safe_module(attr_value, authorized_imports, visited=visited)
11311170

1132-
setattr(safe_module, attr_name, attr_value)
1171+
if isinstance(attr_value, ModuleType):
1172+
submodule_full_name = f"{raw_module.__name__}.{attr_name}"
1173+
# Only add authorized submodules
1174+
if check_import_authorized(submodule_full_name, authorized_imports):
1175+
processed_attr_value = get_safe_module(attr_value, authorized_imports, visited=visited)
1176+
setattr(safe_module, attr_name, processed_attr_value)
1177+
# Else: unauthorized submodule, so we don't add it to safe_module
1178+
else:
1179+
setattr(safe_module, attr_name, attr_value)
11331180

11341181
return safe_module
11351182

tests/test_local_python_executor.py

Lines changed: 219 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,219 @@
1+
import unittest
2+
from src.tools.executor.local_python_executor import evaluate_python_code, InterpreterError, BASE_PYTHON_TOOLS, BASE_BUILTIN_MODULES, DEFAULT_MAX_LEN_OUTPUT
3+
4+
# It's good practice to define a small, fixed list for default authorized_imports in tests
5+
# unless a test specifically needs to modify it.
6+
TEST_DEFAULT_AUTHORIZED_IMPORTS = ["math"] # Example, can be empty if preferred for stricter tests
7+
8+
class TestPythonInterpreterSandbox(unittest.TestCase):
9+
10+
def setUp(self):
11+
# These are defaults for the tools/state available during evaluation.
12+
# Tests can override state or custom_tools if needed.
13+
self.static_tools = BASE_PYTHON_TOOLS.copy()
14+
self.custom_tools = {}
15+
# self.state is not defined here, as evaluate_python_code takes state as an argument
16+
# and it's better to pass a fresh state for each test call to avoid interference.
17+
18+
def _evaluate(self, code, authorized_imports=None, state=None):
19+
if authorized_imports is None:
20+
authorized_imports = list(TEST_DEFAULT_AUTHORIZED_IMPORTS) # Use a copy
21+
22+
current_state = state if state is not None else {}
23+
24+
# evaluate_python_code returns (result, is_final_answer)
25+
return evaluate_python_code(
26+
code,
27+
static_tools=self.static_tools,
28+
custom_tools=self.custom_tools, # Pass along self.custom_tools
29+
state=current_state, # Pass along current_state
30+
authorized_imports=authorized_imports,
31+
max_print_outputs_length=DEFAULT_MAX_LEN_OUTPUT
32+
)
33+
34+
# === Import Tests ===
35+
def test_import_disallowed_module_direct(self):
36+
with self.assertRaisesRegex(InterpreterError, "Import of os is not allowed"):
37+
self._evaluate("import os", authorized_imports=[])
38+
39+
def test_import_disallowed_module_from(self):
40+
with self.assertRaisesRegex(InterpreterError, "Import from os is not allowed"):
41+
self._evaluate("from os import path", authorized_imports=[])
42+
43+
def test_import_allowed_module(self):
44+
result, _ = self._evaluate("import math; x = math.sqrt(4)", authorized_imports=["math"])
45+
self.assertEqual(result, 2.0)
46+
47+
def test_import_submodule_allowed_implicitly(self):
48+
# If 'collections' is allowed, 'collections.abc' should be usable via attribute access.
49+
# The get_safe_module ensures submodules are also checked if they were explicitly imported.
50+
# This test checks if 'collections.abc' can be accessed if 'collections' is authorized.
51+
# The updated get_safe_module will try to check 'collections.abc' when 'collections' is processed.
52+
# So, 'collections.abc' must also be in authorized_imports or match a wildcard like 'collections.*'
53+
# For this test, let's authorize both specifically.
54+
result, _ = self._evaluate("import collections; c = collections.abc.Callable", authorized_imports=["collections", "collections.abc"])
55+
# Check that 'c' is indeed the Callable type from collections.abc
56+
import collections.abc as abc_module
57+
self.assertIs(result, abc_module.Callable)
58+
59+
60+
def test_import_only_specific_submodule_denies_parent_access(self):
61+
# Allow "os.path" but try to use "os.listdir()" -> should fail on "os" not being fully allowed for that.
62+
# This tests the precision of check_import_authorized.
63+
# If only "os.path" is authorized, "import os" should fail.
64+
with self.assertRaisesRegex(InterpreterError, "Import of os is not allowed"):
65+
self._evaluate("import os; os.listdir('.')", authorized_imports=["os.path"])
66+
67+
def test_import_authorized_submodule_directly(self):
68+
result, _ = self._evaluate("import os.path; x = os.path.basename('/a/b')", authorized_imports=["os.path"])
69+
self.assertEqual(result, "b")
70+
71+
def test_import_from_authorized_submodule(self):
72+
result, _ = self._evaluate("from os.path import basename; x = basename('/a/b')", authorized_imports=["os.path"])
73+
self.assertEqual(result, "b")
74+
75+
# === Dangerous Function Call Tests ===
76+
def test_call_dangerous_builtin_function_eval(self):
77+
with self.assertRaisesRegex(InterpreterError, "Forbidden access to function: eval"):
78+
self._evaluate("eval('1+1')")
79+
80+
def test_call_dangerous_builtin_function_exec(self):
81+
with self.assertRaisesRegex(InterpreterError, "Forbidden access to function: exec"):
82+
self._evaluate("exec('a=1')")
83+
84+
def test_call_dangerous_os_function_system_via_import(self):
85+
# This relies on 'os' module itself being blocked from import.
86+
with self.assertRaisesRegex(InterpreterError, "Import of os is not allowed"):
87+
self._evaluate("import os; os.system('echo hello')")
88+
89+
def test_call_dangerous_function_if_module_was_somehow_allowed(self):
90+
# If 'os' was authorized, safer_eval should still block 'os.system' if it's in DANGEROUS_FUNCTIONS
91+
# This tests the defense in depth of safer_eval.
92+
with self.assertRaisesRegex(InterpreterError, "Forbidden access to function: system"):
93+
self._evaluate("import os; os.system('echo hello')", authorized_imports=["os"])
94+
95+
96+
def test_call_allowed_builtin_function(self):
97+
result, _ = self._evaluate("len([1,2,3])")
98+
self.assertEqual(result, 3)
99+
100+
def test_call_function_returned_by_tool_if_dangerous(self):
101+
# Mocking state to contain a dangerous function
102+
current_state = {"my_dangerous_func": eval}
103+
with self.assertRaisesRegex(InterpreterError, "Forbidden access to function: eval"):
104+
self._evaluate("my_dangerous_func('1+1')", state=current_state, authorized_imports=[])
105+
106+
107+
# === Dunder Attribute Access Tests ===
108+
def test_access_disallowed_dunder_directly_on_dict(self):
109+
with self.assertRaisesRegex(InterpreterError, "Forbidden access to dunder attribute: __dict__"):
110+
self._evaluate("x = {}; x.__dict__")
111+
112+
def test_access_disallowed_dunder_directly_on_module(self):
113+
# math.__loader__ is an example.
114+
with self.assertRaisesRegex(InterpreterError, "Forbidden access to dunder attribute: __loader__"):
115+
self._evaluate("import math; math.__loader__", authorized_imports=["math"])
116+
117+
118+
def test_access_disallowed_dunder_via_getattr(self):
119+
# getattr is nodunder_getattr in BASE_PYTHON_TOOLS
120+
with self.assertRaisesRegex(InterpreterError, "Forbidden access to dunder attribute: __subclasses__"):
121+
self._evaluate("x = type(0); getattr(x, '__subclasses__')")
122+
123+
def test_allowed_dunder_method_indirectly_len(self):
124+
result, _ = self._evaluate("x = [1,2]; len(x)")
125+
self.assertEqual(result, 2)
126+
127+
def test_allowed_dunder_method_indirectly_getitem(self):
128+
result, _ = self._evaluate("x = [10,20]; x[1]")
129+
self.assertEqual(result, 20)
130+
131+
# === AST Node Behavior Tests ===
132+
def test_assign_to_static_tool_name_blocked(self):
133+
with self.assertRaisesRegex(InterpreterError, "Cannot assign to name 'len'"):
134+
self._evaluate("len = lambda x: x")
135+
136+
def test_lambda_executes_in_sandbox_blocks_import(self):
137+
with self.assertRaisesRegex(InterpreterError, "Import of sys is not allowed"):
138+
self._evaluate("f = lambda: __import__('sys'); f()", authorized_imports=[])
139+
140+
def test_def_function_executes_in_sandbox_blocks_import(self):
141+
code = """
142+
def my_func():
143+
import shutil # Disallowed
144+
return shutil.disk_usage('.')
145+
my_func()
146+
"""
147+
with self.assertRaisesRegex(InterpreterError, "Import of shutil is not allowed"):
148+
self._evaluate(code, authorized_imports=[])
149+
150+
def test_class_def_executes_in_sandbox_blocks_import_in_init(self):
151+
code = """
152+
class MyClass:
153+
def __init__(self):
154+
import subprocess # Disallowed
155+
self.name = subprocess.call('echo')
156+
def get_name(self):
157+
return self.name
158+
x = MyClass()
159+
x.get_name()
160+
"""
161+
with self.assertRaisesRegex(InterpreterError, "Import of subprocess is not allowed"):
162+
self._evaluate(code, authorized_imports=[])
163+
164+
def test_class_def_executes_in_sandbox_blocks_import_in_method(self):
165+
code = """
166+
class MyClassMethod:
167+
def do_bad_stuff(self):
168+
import _thread # Disallowed
169+
return _thread.get_ident()
170+
x = MyClassMethod()
171+
x.do_bad_stuff()
172+
"""
173+
with self.assertRaisesRegex(InterpreterError, "Import of _thread is not allowed"):
174+
self._evaluate(code, authorized_imports=[])
175+
176+
def test_unsupported_ast_node_global_keyword(self):
177+
code = """
178+
x = 0
179+
def f():
180+
global x # ast.Global node
181+
x = 1
182+
"""
183+
with self.assertRaisesRegex(InterpreterError, "Global is not supported"):
184+
self._evaluate(code)
185+
186+
def test_unsupported_ast_node_nonlocal_keyword(self):
187+
code = """
188+
def f():
189+
x = 1
190+
def g():
191+
nonlocal x # ast.Nonlocal node
192+
x = 2
193+
g()
194+
"""
195+
with self.assertRaisesRegex(InterpreterError, "Nonlocal is not supported"):
196+
self._evaluate(code)
197+
198+
def test_comprehension_sandbox_import(self):
199+
with self.assertRaisesRegex(InterpreterError, "Import of os is not allowed"):
200+
self._evaluate("[__import__('os') for i in range(1)]", authorized_imports=[])
201+
202+
def test_try_except_sandbox_import(self):
203+
code = """
204+
try:
205+
x = 1
206+
except Exception:
207+
import os
208+
else:
209+
import sys
210+
finally:
211+
import subprocess
212+
"""
213+
# The first import attempt (os) should be caught.
214+
with self.assertRaisesRegex(InterpreterError, "Import of os is not allowed"):
215+
self._evaluate(code, authorized_imports=[])
216+
217+
218+
if __name__ == "__main__":
219+
unittest.main()

0 commit comments

Comments
 (0)