-
Notifications
You must be signed in to change notification settings - Fork 84
move main pyscript execution to callback server and added and fixed test cases for the same #1933
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…est cases for the same
WalkthroughThis pull request introduces a major refactor to the callback and pyscript execution infrastructure. The monolithic Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Router as /main_pyscript/execute-python
participant CallbackUtility
participant PyscriptSharedUtility
participant CallbackScriptUtility
Client->>Router: POST /main_pyscript/execute-python (PyscriptPayload)
Router->>CallbackUtility: main_pyscript_handler({source_code, predefined_objects}, None)
CallbackUtility->>PyscriptSharedUtility: (data CRUD, schedule job, etc.)
CallbackUtility->>CallbackScriptUtility: (email, encryption, callback creation, PDF, etc.)
CallbackUtility-->>Router: result (success/failure)
Router-->>Client: JSON response (success, output or error)
sequenceDiagram
participant SharedActionsUtils
participant AsyncCallbackService as /main_pyscript/execute-python
participant CallbackUtility
SharedActionsUtils->>AsyncCallbackService: POST (source_code, predefined_objects)
AsyncCallbackService->>CallbackUtility: main_pyscript_handler(...)
CallbackUtility-->>AsyncCallbackService: result
AsyncCallbackService-->>SharedActionsUtils: JSON response
Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
🧹 Nitpick comments (11)
kairon/shared/pyscript/callback_pyscript_utils.py (3)
198-213
: Logging entirelocals()
risks leaking PII & large payloads
perform_cleanup
logs the rawlocal_vars
dict (line 199). When this helper is used inside user-provided pyscripts, it can accidentally serialize:
- Access-tokens / credentials present in the sandbox
- Large binary blobs (e.g. pandas DataFrames → log explosion)
Consider removing the first
logger.info(f"local_vars: ...")
or restrict it to debug builds.
263-289
: Useraise … from err
for better tracebacksAll three
except
blocks (lines 263-265, 287-289, 317-318) re-raise a newException(...)
, discarding the original stack. Ruff’s B904 hint recommends chaining the exception:-except Exception as e: - raise Exception(f"decryption failed-{str(e)}") +except Exception as e: + raise Exception(f"decryption failed: {e}") from eRepeat the pattern for encryption & PDF-save helpers.
🧰 Tools
🪛 Ruff (0.8.2)
264-264: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
288-288: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
308-318
: Incorrect error message & exception chaining
- The catch-all message says
"encryption failed"
while the method is saving a PDF.- Same traceback-chaining issue noted above.
- except Exception as e: - raise Exception(f"encryption failed-{str(e)}") + except Exception as e: + raise Exception(f"Failed to save markdown as PDF: {e}") from e🧰 Tools
🪛 Ruff (0.8.2)
318-318: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
kairon/async_callback/router/pyscript_callback.py (1)
1-2
: Remove unused imports.There are some unused imports that should be removed.
Apply this change:
-from typing import Optional, Dict, Any +from typing import Optional -from blacksheep import Router, Request, Response as BSResponse, TextContent, json +from blacksheep import Router, Request, Response as BSResponse, json🧰 Tools
🪛 Ruff (0.8.2)
1-1:
typing.Dict
imported but unusedRemove unused import
(F401)
1-1:
typing.Any
imported but unusedRemove unused import
(F401)
2-2:
blacksheep.TextContent
imported but unusedRemove unused import:
blacksheep.TextContent
(F401)
kairon/shared/pyscript/shared_pyscript_utils.py (3)
1-9
: Remove unused imports to keep the module lean
Dict
,Callable
, andList
are imported but never referenced, andUtility
is imported from the package root while every other module useskairon.shared.utils.Utility
. Cleaning this up avoids RuffF401
warnings and keeps the import path consistent.-from typing import Text, Dict, Callable, List +from typing import Text, Dict -from kairon import Utility +from kairon.shared.utils import Utility🧰 Tools
🪛 Ruff (0.8.2)
1-1:
typing.Dict
imported but unusedRemove unused import
(F401)
1-1:
typing.Callable
imported but unusedRemove unused import
(F401)
1-1:
typing.List
imported but unusedRemove unused import
(F401)
30-43
: Type hints & exception granularity
get_data
(and siblings) raise plainException
, losing the semantic benefit of custom exception classes already present in the codebase (AppException
). Switching toAppException
(or a new domain-specific error) improves debuggability and keeps error handling uniform.- if not bot: - raise Exception("Missing bot id") + if not bot: + raise AppException("Missing bot id")Repeat the same pattern for
add_data
,update_data
,delete_data
.
82-100
: Minor: uppercase HTTP verb may break internal comparison
ActionUtility.execute_http_request
lower-cases the verb before comparison with
{'post', 'put', 'delete'}
. Passing"DELETE"
still works, but converting
early keeps the call site explicit and avoids future surprises if the helper
removes thelower()
call.- f"{event_server}/api/events/{event_id}", - "DELETE") + f"{event_server}/api/events/{event_id}", + "delete")tests/unit_test/callback/pyscript_handler_test.py (2)
1428-1434
: Duplicatedummy_uuid7
definition shadowing
dummy_uuid7
is defined twice (lines 1330 and 1430).
The second definition (F811
Ruff warning) silently overwrites the first and can
confuse readers or future refactors.Please keep a single definition or give them distinct, meaningful names.
🧰 Tools
🪛 Ruff (0.8.2)
1430-1430: Redefinition of unused
dummy_uuid7
from line 1330(F811)
1476-1480
:result
is assigned but never usedThe local variable
result
(line 1478) is only created to hold the return value
ofadd_schedule_job
and is not referenced afterwards – RuffF841
.If the assertion on the DB insert suffices, drop the assignment:
- result = CallbackScriptUility.add_schedule_job(...) + CallbackScriptUility.add_schedule_job(...)🧰 Tools
🪛 Ruff (0.8.2)
1478-1478: Local variable
result
is assigned to but never usedRemove assignment to unused variable
result
(F841)
kairon/async_callback/utils.py (2)
18-19
: Typo in imported class name will propagate throughout the codebase
CallbackScriptUility
is missing a “t” (“Utility”). Apart from hurting readability, this makes grep-based discovery harder and raises the barrier for new contributors. If the misspelling is accidental, please rename the class inkairon/shared/pyscript/callback_pyscript_utils.py
and adjust all imports; otherwise consider adding an alias inside that module to avoid surprises.
115-128
:execute_main_pyscript
duplicates logic and omits safety wrappersThe new method re-implements
execute_script
with only two effective changes:
• No helper injection
• Same 60 s timeoutThis duplication will diverge over time. Consider DRYing it up:
-def execute_main_pyscript(source_code: Text, predefined_objects: Dict = None): - ... - script_variables = ActorOrchestrator.run( - ActorType.pyscript_runner.value, source_code=source_code, timeout=60, - predefined_objects=predefined_objects - ) - return script_variables +def execute_main_pyscript(source_code: Text, predefined_objects: Dict | None = None): + # Execute with a *minimal* sandbox: reuse execute_script with an empty helper set + return CallbackUtility.execute_script(source_code, predefined_objects or {})Besides reducing code, you automatically inherit the argument validation (once added per previous comment).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
kairon/actions/definitions/pyscript.py
(1 hunks)kairon/async_callback/router/pyscript_callback.py
(2 hunks)kairon/async_callback/utils.py
(3 hunks)kairon/shared/actions/utils.py
(1 hunks)kairon/shared/callback/data_objects.py
(2 hunks)kairon/shared/concurrency/actors/pyscript_runner.py
(4 hunks)kairon/shared/pyscript/callback_pyscript_utils.py
(1 hunks)kairon/shared/pyscript/shared_pyscript_utils.py
(1 hunks)system.yaml
(1 hunks)tests/integration_test/callback_service_test.py
(1 hunks)tests/unit_test/callback/pyscript_handler_test.py
(33 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
kairon/shared/actions/utils.py (1)
kairon/shared/utils.py (2)
Utility
(92-2277)execute_http_request
(1685-1781)
🪛 Ruff (0.8.2)
kairon/async_callback/router/pyscript_callback.py
1-1: typing.Dict
imported but unused
Remove unused import
(F401)
1-1: typing.Any
imported but unused
Remove unused import
(F401)
2-2: blacksheep.TextContent
imported but unused
Remove unused import: blacksheep.TextContent
(F401)
kairon/shared/pyscript/shared_pyscript_utils.py
1-1: typing.Dict
imported but unused
Remove unused import
(F401)
1-1: typing.Callable
imported but unused
Remove unused import
(F401)
1-1: typing.List
imported but unused
Remove unused import
(F401)
kairon/shared/pyscript/callback_pyscript_utils.py
264-264: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
288-288: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
318-318: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
tests/unit_test/callback/pyscript_handler_test.py
1430-1430: Redefinition of unused dummy_uuid7
from line 1330
(F811)
1478-1478: Local variable result
is assigned to but never used
Remove assignment to unused variable result
(F841)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Analyze (python)
- GitHub Check: Python CI
🔇 Additional comments (11)
system.yaml (1)
265-268
: Ensure consistency and fall-back handling for new URL configurationA new environment variable
TRIGGER_MAIN_PYSCRIPT_URL
is introduced (line 267). Double-check:
- All deployment environments define this variable, otherwise the default
http://localhost:5059/main_pyscript/execute-python
will be used in production.- The consumer code (
ActionUtility.run_pyscript
, async-callback router, etc.) gracefully handles mis-configurations (e.g. empty string, unreachable host) with meaningful error messages and retries if required.If you want a quick sanity-check across the codebase, consider adding a small validation step during start-up that warns when the URL still points to localhost.
kairon/actions/definitions/pyscript.py (1)
69-70
: Good catch – preserves default dispatch typeThe fallback to
dispatch_type
when'type'
is missing preventsNone
from propagating tohandle_utter_bot_response
. This small change averts silent runtime errors. 👍kairon/shared/callback/data_objects.py (1)
441-446
: LGTM! New data model for python script execution.The new
PyscriptPayload
Pydantic model provides a structured schema for payload validation when executing restricted Python scripts, with proper typing for both required source code and optional predefined objects.kairon/shared/concurrency/actors/pyscript_runner.py (4)
40-41
: Method dependencies updated to use new utility class.The import has been correctly updated to use the newly introduced
PyscriptSharedUtility
class instead of the monolithicCallbackUtility
.
49-53
: Refactored to use PyscriptSharedUtility for helper methods.The global safe dictionary now correctly uses the methods from
PyscriptSharedUtility
instead ofCallbackUtility
, maintaining the same functionality while following the new architecture.
60-60
: Function signature updated to include global_safe parameter.The
execute_script_with_timeout
function now accepts theglobal_safe
dictionary as a parameter, which is necessary for the updated execution flow.
75-75
: Updated to pass global_safe to execute_script_with_timeout.The function call now correctly passes the
global_safe
dictionary to the execution function, ensuring that the restricted Python environment has access to the necessary safe methods.kairon/async_callback/router/pyscript_callback.py (2)
92-93
: LGTM! Improved route handler for standalone action execution.The proper request method and parameters are passed to the process_router_message function.
94-102
: Well-structured new endpoint for Python script execution.The new endpoint correctly uses the PyscriptPayload model for request validation and handles both success and error cases appropriately.
tests/integration_test/callback_service_test.py (2)
616-637
: LGTM! Comprehensive test for successful script execution.This test verifies that the new endpoint correctly handles successful Python script execution, including:
- Proper mocking of the handler method
- Verification of response status
- Validation of success flag and output content
639-659
: LGTM! Comprehensive test for error handling.This test thoroughly validates the error handling path when script execution fails:
- Proper mocking of an exception case
- Verification of error status code (500)
- Validation that success flag is properly set to false
response_dict = { | ||
"decryptedBody": decrypted_data, | ||
"aesKeyBuffer": aes_key, | ||
"initialVectorBuffer": iv, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning raw AES key defeats the purpose of end-to-end encryption
decrypt_request
includes "aesKeyBuffer": aes_key
in the return (line 257). Down-stream code that accidentally logs or re-serialises this dict will leak the symmetric key, allowing message replay/decryption.
Expose only the decrypted payload & IV; keep the key strictly in memory:
- response_dict = {
- "decryptedBody": decrypted_data,
- "aesKeyBuffer": aes_key,
- "initialVectorBuffer": iv,
- }
+ response_dict = {
+ "decryptedBody": decrypted_data,
+ "_aes_key": aes_key, # leading underscore -> internal use
+ "_iv": iv,
+ }
…and ensure callers never log or return the underscored fields.
Committable suggestion skipped: line range outside the PR's diff.
@staticmethod | ||
def add_schedule_job(schedule_action: Text, date_time: datetime, data: Dict, timezone: Text, _id: Text = None, | ||
bot: Text = None, kwargs=None): | ||
if not bot: | ||
raise AppException("Missing bot id") | ||
|
||
if not _id: | ||
_id = uuid7().hex | ||
|
||
if not data: | ||
data = {} | ||
|
||
data['bot'] = bot | ||
data['event'] = _id | ||
|
||
callback_config = CallbackConfig.get_entry(bot=bot, name=schedule_action) | ||
|
||
script = callback_config.get('pyscript_code') | ||
|
||
func = obj_to_ref(ExecutorFactory.get_executor().execute_task) | ||
|
||
schedule_data = { | ||
'source_code': script, | ||
'predefined_objects': data | ||
} | ||
|
||
args = (func, "scheduler_evaluator", schedule_data,) | ||
kwargs = {'task_type': "Callback"} if kwargs is None else {**kwargs, 'task_type': "Callback"} | ||
trigger = DateTrigger(run_date=date_time, timezone=timezone) | ||
|
||
next_run_time = trigger.get_next_fire_time(None, datetime.now(astimezone(timezone) or get_localzone())) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timezone handling may raise at runtime
astimezone(timezone)
(line 83) expects a tzinfo
object, but a plain string can be passed to add_schedule_job
.
If a caller provides "Asia/Kolkata"
(str) you’ll get a ValueError: Invalid tzinfo argument
.
Safest option:
-from apscheduler.util import obj_to_ref, astimezone
+from apscheduler.util import obj_to_ref
+import pytz
...
- next_run_time = trigger.get_next_fire_time(None, datetime.now(astimezone(timezone) or get_localzone()))
+ tz = pytz.timezone(timezone) if isinstance(timezone, str) else timezone
+ next_run_time = trigger.get_next_fire_time(None, datetime.now(tz or get_localzone()))
Add pytz
to requirements if not present.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@staticmethod | |
def add_schedule_job(schedule_action: Text, date_time: datetime, data: Dict, timezone: Text, _id: Text = None, | |
bot: Text = None, kwargs=None): | |
if not bot: | |
raise AppException("Missing bot id") | |
if not _id: | |
_id = uuid7().hex | |
if not data: | |
data = {} | |
data['bot'] = bot | |
data['event'] = _id | |
callback_config = CallbackConfig.get_entry(bot=bot, name=schedule_action) | |
script = callback_config.get('pyscript_code') | |
func = obj_to_ref(ExecutorFactory.get_executor().execute_task) | |
schedule_data = { | |
'source_code': script, | |
'predefined_objects': data | |
} | |
args = (func, "scheduler_evaluator", schedule_data,) | |
kwargs = {'task_type': "Callback"} if kwargs is None else {**kwargs, 'task_type': "Callback"} | |
trigger = DateTrigger(run_date=date_time, timezone=timezone) | |
next_run_time = trigger.get_next_fire_time(None, datetime.now(astimezone(timezone) or get_localzone())) | |
# at top of file, replace this: | |
-from apscheduler.util import obj_to_ref, astimezone | |
+# remove astimezone and add pytz | |
+from apscheduler.util import obj_to_ref | |
+import pytz | |
@staticmethod | |
def add_schedule_job(schedule_action: Text, | |
date_time: datetime, | |
data: Dict, | |
timezone: Text, | |
_id: Text = None, | |
bot: Text = None, | |
kwargs=None): | |
if not bot: | |
raise AppException("Missing bot id") | |
if not _id: | |
_id = uuid7().hex | |
if not data: | |
data = {} | |
data['bot'] = bot | |
data['event'] = _id | |
callback_config = CallbackConfig.get_entry(bot=bot, name=schedule_action) | |
script = callback_config.get('pyscript_code') | |
func = obj_to_ref(ExecutorFactory.get_executor().execute_task) | |
schedule_data = { | |
'source_code': script, | |
'predefined_objects': data | |
} | |
args = (func, "scheduler_evaluator", schedule_data,) | |
kwargs = {'task_type': "Callback"} if kwargs is None else {**kwargs, 'task_type': "Callback"} | |
trigger = DateTrigger(run_date=date_time, timezone=timezone) | |
# convert a string timezone into a tzinfo, or pass through if already one | |
tz = pytz.timezone(timezone) if isinstance(timezone, str) else timezone | |
next_run_time = trigger.get_next_fire_time( | |
None, | |
datetime.now(tz or get_localzone()) | |
) |
callback_url=Utility.environment['async_callback_action']['pyscript']['url'] | ||
resp = Utility.execute_http_request( | ||
"POST", | ||
http_url=callback_url, | ||
request_body={ | ||
"source_code": source_code, | ||
"predefined_objects": context | ||
}, | ||
headers={"Content-Type": "application/json"} | ||
) | ||
if resp.get('statusCode') != 200: | ||
raise ActionFailure(f'Pyscript evaluation failed: {resp}') | ||
result = resp.get('data') | ||
result = resp.get('body') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider adding stronger validation & error-handling around the callback response
The new branch correctly swaps the synchronous evaluator call for an async-callback invocation, but a couple of corner-cases are left unhandled:
-
resp.get("body")
can be either a JSON string or an already-parsed dict (depending on how the callback service serialises the payload).
Down-stream (evaluate_pyscript
) expects a dict and will break if it receives an unparsed string. -
Only the HTTP 200 status code is validated. A 200 with
"success": False
(or missing fields) will silently flow through and surface later as obscureKeyError
s.
A small defensive layer eliminates both issues:
- if resp.get('statusCode') != 200:
- raise ActionFailure(f'Pyscript evaluation failed: {resp}')
- result = resp.get('body')
+ if resp.get("statusCode") != 200:
+ raise ActionFailure(f"Pyscript evaluation failed: {resp}")
+
+ # Parse `body` if it is a JSON string
+ result = resp.get("body")
+ if isinstance(result, str):
+ try:
+ result = json.loads(result)
+ except (TypeError, ValueError):
+ raise ActionFailure(f"Unable to parse callback body: {result!r}")
+
+ if not isinstance(result, dict):
+ raise ActionFailure(f"Unexpected callback body type: {type(result).__name__}")
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
callback_url=Utility.environment['async_callback_action']['pyscript']['url'] | |
resp = Utility.execute_http_request( | |
"POST", | |
http_url=callback_url, | |
request_body={ | |
"source_code": source_code, | |
"predefined_objects": context | |
}, | |
headers={"Content-Type": "application/json"} | |
) | |
if resp.get('statusCode') != 200: | |
raise ActionFailure(f'Pyscript evaluation failed: {resp}') | |
result = resp.get('data') | |
result = resp.get('body') | |
callback_url=Utility.environment['async_callback_action']['pyscript']['url'] | |
resp = Utility.execute_http_request( | |
"POST", | |
http_url=callback_url, | |
request_body={ | |
"source_code": source_code, | |
"predefined_objects": context | |
}, | |
headers={"Content-Type": "application/json"} | |
) | |
if resp.get("statusCode") != 200: | |
raise ActionFailure(f"Pyscript evaluation failed: {resp}") | |
# Parse `body` if it is a JSON string | |
result = resp.get("body") | |
if isinstance(result, str): | |
try: | |
result = json.loads(result) | |
except (TypeError, ValueError): | |
raise ActionFailure(f"Unable to parse callback body: {result!r}") | |
if not isinstance(result, dict): | |
raise ActionFailure(f"Unexpected callback body type: {type(result).__name__}") |
kairon/async_callback/utils.py
Outdated
predefined_objects['add_schedule_job'] = partial(CallbackScriptUility.add_schedule_job, bot=bot) | ||
predefined_objects['delete_schedule_job'] = partial(PyscriptSharedUtility.delete_schedule_job, bot=bot) | ||
predefined_objects['send_email'] = partial(CallbackScriptUility.send_email, bot=bot) | ||
predefined_objects['add_data'] = partial(PyscriptSharedUtility.add_data, bot=bot) | ||
predefined_objects['get_data'] = partial(PyscriptSharedUtility.get_data, bot=bot) | ||
predefined_objects['delete_data'] = partial(PyscriptSharedUtility.delete_data, bot=bot) | ||
predefined_objects['update_data'] = partial(PyscriptSharedUtility.update_data, bot=bot) | ||
predefined_objects["generate_id"] = CallbackScriptUility.generate_id | ||
predefined_objects["datetime_to_utc_timestamp"]=CallbackScriptUility.datetime_to_utc_timestamp | ||
predefined_objects['decrypt_request'] = CallbackScriptUility.decrypt_request | ||
predefined_objects['encrypt_response'] = CallbackScriptUility.encrypt_response | ||
predefined_objects['create_callback'] = partial(CallbackScriptUility.create_callback, | ||
bot=bot, | ||
sender_id=sender_id, | ||
channel=channel) | ||
predefined_objects['save_as_pdf'] = partial(CallbackUtility.save_as_pdf,bot=bot, sender_id=sender_id) | ||
|
||
predefined_objects['save_as_pdf'] = partial(CallbackScriptUility.save_as_pdf, | ||
bot=bot, sender_id=sender_id) | ||
script_variables = ActorOrchestrator.run( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Partial functions silently capture bot=None
and may surface as hard-to-trace errors
bot
, channel
, and sender_id
are fetched from predefined_objects
before you overwrite the dict.
If any of those keys are missing, their value becomes None
, yet the partials are still created:
predefined_objects['add_schedule_job'] = partial(CallbackScriptUility.add_schedule_job, bot=bot)
Down-stream utility calls that expect a valid bot
id will then fail much later with confusing NoneType
messages.
+if bot is None:
+ raise ValueError("predefined_objects must contain a non-empty 'bot' key")
In the same block you also expose powerful helpers (requests
, decrypt_request
, encrypt_response
, etc.) to user scripts.
Consider the security implications: a malicious pyscript could perform unrestricted HTTP calls or decrypt arbitrary payloads.
At minimum:
- Provide an allow-list of domains for
requests
. - Strip secrets from the log lines (see next comment).
- Audit the encryption helpers so the key material is never exposed to the sandbox.
kairon/async_callback/utils.py
Outdated
@staticmethod | ||
def decrypt_request(request_body, private_key_pem): | ||
try: | ||
encrypted_data_b64 = request_body.get("encrypted_flow_data") | ||
encrypted_aes_key_b64 = request_body.get("encrypted_aes_key") | ||
iv_b64 = request_body.get("initial_vector") | ||
|
||
if not (encrypted_data_b64 and encrypted_aes_key_b64 and iv_b64): | ||
raise ValueError("Missing required encrypted data fields") | ||
|
||
# Decode base64 inputs | ||
encrypted_aes_key = base64.b64decode(encrypted_aes_key_b64) | ||
encrypted_data = base64.b64decode(encrypted_data_b64) | ||
iv = base64.b64decode(iv_b64)[:16] # Ensure IV is exactly 16 bytes | ||
|
||
private_key = load_pem_private_key(private_key_pem.encode(), password=None) | ||
|
||
# Decrypt AES key using RSA and OAEP padding | ||
aes_key = private_key.decrypt( | ||
encrypted_aes_key, | ||
asym_padding.OAEP( | ||
mgf=asym_padding.MGF1(algorithm=hashes.SHA256()), | ||
algorithm=hashes.SHA256(), | ||
label=None, | ||
), | ||
) | ||
|
||
if len(aes_key) not in (16, 24, 32): | ||
raise ValueError(f"Invalid AES key size: {len(aes_key)} bytes") | ||
|
||
# Extract GCM tag (last 16 bytes) | ||
encrypted_body = encrypted_data[:-16] | ||
tag = encrypted_data[-16:] | ||
|
||
# Decrypt AES-GCM | ||
cipher = Cipher(algorithms.AES(aes_key), modes.GCM(iv, tag)) | ||
decryptor = cipher.decryptor() | ||
decrypted_bytes = decryptor.update(encrypted_body) + decryptor.finalize() | ||
decrypted_data = jsond.loads(decrypted_bytes.decode("utf-8")) | ||
|
||
response_dict = { | ||
"decryptedBody": decrypted_data, | ||
"aesKeyBuffer": aes_key, | ||
"initialVectorBuffer": iv, | ||
} | ||
|
||
return response_dict | ||
|
||
except Exception as e: | ||
raise Exception(f"decryption failed-{str(e)}") | ||
|
||
@staticmethod | ||
def encrypt_response(response_body, aes_key_buffer, initial_vector_buffer): | ||
data = event | ||
if isinstance(data, list): | ||
data = {item['name'].lower(): item['value'] for item in data} | ||
try: | ||
if aes_key_buffer is None: | ||
raise ValueError("AES key cannot be None") | ||
|
||
if initial_vector_buffer is None: | ||
raise ValueError("Initialization vector (IV) cannot be None") | ||
|
||
# Flip the IV | ||
flipped_iv = bytes(byte ^ 0xFF for byte in initial_vector_buffer) | ||
|
||
# Encrypt using AES-GCM | ||
encryptor = Cipher(algorithms.AES(aes_key_buffer), modes.GCM(flipped_iv)).encryptor() | ||
encrypted_bytes = encryptor.update(jsond.dumps(response_body).encode("utf-8")) + encryptor.finalize() | ||
encrypted_data_with_tag = encrypted_bytes + encryptor.tag | ||
|
||
# Encode result as base64 | ||
encoded_data = base64.b64encode(encrypted_data_with_tag).decode("utf-8") | ||
return encoded_data | ||
response = CallbackUtility.execute_main_pyscript(data['source_code'], data.get('predefined_objects')) | ||
output["body"] = response | ||
except Exception as e: | ||
raise Exception(f"encryption failed-{str(e)}") | ||
|
||
logger.exception(e) | ||
output["statusCode"] = 422 | ||
output["body"] = str(e) | ||
|
||
@staticmethod | ||
def create_callback(callback_name: str, metadata: dict, bot: str, sender_id: str, channel: str, name: str='callback_pyscript'): | ||
callback_url, identifier, standalone = CallbackData.create_entry( | ||
name=name, | ||
callback_config_name=callback_name, | ||
bot=bot, | ||
sender_id=sender_id, | ||
channel=channel, | ||
metadata=metadata, | ||
) | ||
if standalone: | ||
return identifier | ||
else: | ||
return callback_url | ||
logger.info(output) | ||
return output | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Handler duplicates pyscript_handler
and logs sensitive data
main_pyscript_handler
is almost identical to pyscript_handler
. Extract a private helper (_dispatch_handler) that accepts the executor function as an argument to eliminate copy-paste.
More importantly, logger.info(output)
can leak the entire script result—including decrypted payloads or PII—into application logs. Recommend downgrading to logger.debug
and redacting large or sensitive fields:
-output = { ... }
-...
-logger.info(output)
+sanitised = {**output, "body": "REDACTED" if output["statusCode"] == 200 else output["body"]}
+logger.debug("main_pyscript_handler response: {}", sanitised)
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
kairon/async_callback/router/pyscript_callback.py (2)
2-2
: Remove unused importTextContent
The static analysis tool has correctly identified that
TextContent
is imported but never used in this file. Remove it to keep your imports clean.- from blacksheep import Router, Request, Response as BSResponse, TextContent, json + from blacksheep import Router, Request, Response as BSResponse, json🧰 Tools
🪛 Ruff (0.8.2)
2-2:
blacksheep.TextContent
imported but unusedRemove unused import:
blacksheep.TextContent
(F401)
94-103
: Good implementation of the new pyscript execution endpointYou've successfully added a well-structured endpoint to handle the pyscript execution on the callback server, aligning with the PR objectives. The handler correctly processes the payload and returns appropriate responses.
A few observations:
Error responses return status 500 for all exceptions. Consider using more specific status codes for different error types (e.g., 400 for validation errors, 422 for execution errors).
The success response uses
**result
to include all fields from the result. Make sureresult
is always a dictionary and doesn't contain a key named "success" to avoid conflicts.Consider enhancing the error handling with more specific status codes:
@router.post("/main_pyscript/execute-python") async def trigger_restricted_python(payload: PyscriptPayload): try: result = CallbackUtility.main_pyscript_handler({ "source_code": payload.source_code, "predefined_objects": payload.predefined_objects or {} }, None) return {"success": True, **result} except ValueError as e: return json({"success": False, "error": str(e)}, status=400) except AppException as e: return json({"success": False, "error": str(e)}, status=422) except Exception as e: return json({"success": False, "error": str(e)}, status=500)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
kairon/async_callback/router/pyscript_callback.py
(2 hunks)kairon/shared/pyscript/shared_pyscript_utils.py
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- kairon/shared/pyscript/shared_pyscript_utils.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
kairon/async_callback/router/pyscript_callback.py (3)
kairon/async_callback/processor.py (2)
async_callback
(85-122)CallbackProcessor
(19-192)kairon/async_callback/utils.py (2)
CallbackUtility
(31-176)main_pyscript_handler
(130-152)kairon/shared/callback/data_objects.py (1)
PyscriptPayload
(441-446)
🪛 Ruff (0.8.2)
kairon/async_callback/router/pyscript_callback.py
2-2: blacksheep.TextContent
imported but unused
Remove unused import: blacksheep.TextContent
(F401)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Analyze (python)
- GitHub Check: Python CI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
tests/integration_test/action_service_test.py (4)
2521-2522
: Remove
print()
noise in tests pollutes CI logs and hinders parallel execution.
Use logging with an appropriate level if the information is genuinely useful, or drop them altogether.- print(events) - ... - print(response_json) + # debugging prints removed – rely on pytest failure diff or logging if neededAlso applies to: 2636-2637, 2649-2650
730-736
: Heavy duplication when mocking the callback endpointThe same
responses.add(... Utility.environment['async_callback_action']['pyscript']['url'] ...)
block (with minor JSON dict changes) is repeated across >10 tests.
Extract a helper to build the stub to keep tests DRY and improve maintainability:def stub_pyscript_callback(responses_mock, body, status_code=200, success=True): responses_mock.add( "POST", Utility.environment['async_callback_action']['pyscript']['url'], json={ "success": success, "body": body, "statusCode": status_code, "message": None, "error_code": 0, }, status=status_code, )Then call
stub_pyscript_callback(responses, {...})
in each test.Also applies to: 820-826, 904-910
2908-2910
: Patch-in-context pattern hides failures & bloats testsMultiple tests wrap large request payloads in a nested
with patch(...compose_response...)
block.
Instead, use a fixture that auto-applies the patch so every test benefits and the patch is undone automatically:@pytest.fixture def compose_resp_stub(monkeypatch): monkeypatch.setattr( "kairon.shared.actions.utils.ActionUtility.compose_response", lambda *_: ({"dummy": "resp"}, ["log"], 0.01), )
2522-2523
: Asserting giant dicts makes tests brittleThe equality assertion of the full
events
list will break as soon as a timestamp format, ordering, or an unrelated log field changes.
Prefer selective assertions:assert events[0]["type"] == "response" assert events[1]["type"] == "api_call" assert any(e["type"] == "filled_slots" for e in events)This keeps the intent clear and reduces maintenance overhead.
Also applies to: 2985-2990
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
kairon/shared/actions/utils.py
(1 hunks)tests/integration_test/action_service_test.py
(36 hunks)tests/testing_data/system.yaml
(1 hunks)tests/unit_test/action/action_test.py
(19 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/testing_data/system.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- kairon/shared/actions/utils.py
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/unit_test/action/action_test.py
1438-1438: Dictionary key literal "type"
repeated
Remove repeated key literal "type"
(F601)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Python CI
- GitHub Check: Analyze (python)
🔇 Additional comments (10)
tests/unit_test/action/action_test.py (10)
1108-1139
: URL endpoint and response structure updated for pyscript action testing.The test now uses the new async callback action URL (
Utility.environment["async_callback_action"]["pyscript"]["url"]
) instead of the previous evaluator endpoint. The response structure has also been modified to follow the new format where:
- The payload is nested under
"body"
key instead of"data"
- A
"statusCode"
field is now included- The overall response structure better aligns with the asynchronous nature of the refactored endpoint
1185-1194
: Updated mocked response format to match new callback API structure.Consistent with the previous changes, the test now correctly mocks the response from the asynchronous callback endpoint with the standardized structure including:
- Success flag
- Response body with the result
- HTTP status code
1246-1255
: Harmonized response structure across multiple test cases.The response structure has been consistently updated across all test cases involving the pyscript execution to use the new format from the async callback endpoint. This ensures consistent testing throughout the codebase.
Also applies to: 1307-1316, 1370-1379, 1432-1441
1538-1539
: Updated error message assertion for the new response format.The expected error message is now updated to properly reflect the new format when a pyscript evaluation fails. This ensures the test correctly validates error handling in the new asynchronous system.
1571-1572
: Updated exception format in log assertion.The log assertion for the pyscript evaluation failure has been updated to work with the new response format, now correctly expecting the error information to be placed within the standardized response structure.
1625-1632
: Modified error handling test verification for asynchronous execution.The test now correctly retrieves and checks logs for failures in the asynchronous pyscript execution model, ensuring that connection errors are properly propagated and logged within the new system architecture.
1764-1765
: Updated URL and response structure for dynamic parameter testing.Multiple test cases involving dynamic parameters and their evaluation through pyscript have been updated to use the new async callback URL and response structure, ensuring all test scenarios correctly validate the new workflow.
Also applies to: 1916-1917, 1972-1974, 2149-2150
2433-2441
: Improved response handling for script evaluation.The code now properly handles the response when the bot response is set to None, adapting to the newer async callback response format with standardized status codes and nested body structure.
3855-3869
: Updated script evaluation endpoint and response handling.The code now uses the new asynchronous endpoint for script evaluation, with properly structured response expectations for successful executions.
3872-3911
: Improved error testing for script evaluation using monkeypatching.This test has been refactored to use monkeypatching instead of mocking HTTP responses for simulating script evaluation failures. The new approach provides more direct testing of the error handling logic with more precise control over the error conditions.
Utility.environment.setdefault("async_callback_action", {})["pyscript"] = { | ||
"url": "http://dummy-callback-url" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Global Utility.environment
mutation makes tests order-dependent
Each test directly mutates the process-wide Utility.environment
dict.
When this suite is executed in parallel (pytest-xdist) or if another test relies on the previous value, it will create flakiness.
Consider isolating the change with a fixture/monkeypatch so that the key is reset after the test:
-Utility.environment.setdefault("async_callback_action", {})["pyscript"] = {
- "url": "http://dummy-callback-url"
-}
+@pytest.fixture(autouse=True)
+def _patch_async_callback_env(monkeypatch):
+ orig = Utility.environment.get("async_callback_action", {}).copy()
+ monkeypatch.setitem(
+ Utility.environment.setdefault("async_callback_action", {}),
+ "pyscript",
+ {"url": "http://dummy-callback-url"},
+ )
+ yield
+ Utility.environment["async_callback_action"] = orig
Committable suggestion skipped: line range outside the PR's diff.
"body": { | ||
"bot_response": "Successfully Evaluated the pyscript", | ||
"slots": "invalid slots values", "type": "text", | ||
"type":"text" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix duplicate dictionary key "type" in response object.
There is a duplicate "type"
key in the mock response JSON object which would cause the second value to override the first in actual execution.
Apply this diff to remove the duplicate key:
json={
"success": True,
"body": {
"bot_response": None,
"slots": {"param2": "param2value"},
"type":"data"
- },
- "statusCode":200
- },
+ },
+ "statusCode":200
+ },
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
1438-1438: Dictionary key literal "type"
repeated
Remove repeated key literal "type"
(F601)
msg["To"] = ",".join(email) | ||
msg.attach(body) | ||
smtp.sendmail(from_addr, email, msg.as_string()) | ||
smtp.quit() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
handle with try catch
…ng into training (digiteinfotech#1922) Co-authored-by: hghuge <hghuge@digite.com>
…igiteinfotech#1919) * - added new api for collection_data which supports timestamp filter * - added new api for collection_data which supports timestamp filter * - added new api for collection_data which supports timestamp filter --------- Co-authored-by: hghuge <hghuge@digite.com>
…est cases for the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🔭 Outside diff range comments (1)
kairon/async_callback/utils.py (1)
58-63
: 🛠️ Refactor suggestionAdd validation for bot ID when setting up predefined objects.
If
bot
is None in the predefined objects, the partial functions will silently accept it and fail later with confusing errors. Adding validation here would provide clearer error messages.bot = predefined_objects.get("bot") channel = predefined_objects.get("channel") sender_id = predefined_objects.get("sender_id") +if bot is None: + raise ValueError("predefined_objects must contain a non-empty 'bot' key") + predefined_objects['_getattr_'] = safer_getattr
♻️ Duplicate comments (4)
kairon/shared/pyscript/callback_pyscript_utils.py (2)
270-274
:⚠️ Potential issueSecurity risk: Returning raw AES key in
decrypt_request
Including the AES key in the return value could lead to accidental exposure of the symmetric key if the return value is logged or serialized elsewhere.
Protect the sensitive keys by using leading underscores to indicate they should not be exposed:
response_dict = { "decryptedBody": decrypted_data, - "aesKeyBuffer": aes_key, - "initialVectorBuffer": iv, + "_aes_key": aes_key, # leading underscore indicates internal use only + "_iv": iv, # leading underscore indicates internal use only }Then update all callers to use the new key names. This helps prevent accidental leakage of cryptographic material.
63-83
:⚠️ Potential issueTimezone handling in
add_schedule_job
may raise runtime exceptionsWhen a string timezone is passed to
astimezone()
, it will raise a ValueError as it expects a tzinfo object.Import pytz and safely handle string timezones:
-from apscheduler.util import obj_to_ref, astimezone +from apscheduler.util import obj_to_ref +import pytz +from tzlocal import get_localzone- next_run_time = trigger.get_next_fire_time(None, datetime.now(astimezone(timezone) or get_localzone())) + try: + tz = pytz.timezone(timezone) if isinstance(timezone, str) else timezone or get_localzone() + next_run_time = trigger.get_next_fire_time(None, datetime.now(tz)) + except Exception as e: + raise AppException(f"Invalid timezone: {e}")kairon/shared/pyscript/shared_pyscript_utils.py (1)
11-33
:⚠️ Potential issueAdd @staticmethod decorator to the fetch_collection_data method.
The method is defined with no
self
parameter but without@staticmethod
decorator, which will cause errors when called as a static method.class PyscriptSharedUtility: + @staticmethod def fetch_collection_data(query: dict): collection_data = CollectionData.objects(__raw__= query)
kairon/async_callback/utils.py (1)
114-153
: 🛠️ Refactor suggestionRefactor to reduce code duplication between handlers.
The
main_pyscript_handler
andpyscript_handler
methods are almost identical. Consider extracting a common private helper method that both can use to reduce duplication. Also, logging the entire output could potentially expose sensitive data.+@staticmethod +def _process_pyscript_event(event, executor_func): + output = { + "statusCode": 200, + "statusDescription": "200 OK", + "isBase64Encoded": False, + "headers": { + "Content-Type": "application/json; charset=utf-8" + }, + "body": None + } + data = event + if isinstance(data, list): + data = {item['name'].lower(): item['value'] for item in data} + try: + response = executor_func(data['source_code'], data.get('predefined_objects')) + output["body"] = response + except Exception as e: + logger.exception(e) + output["statusCode"] = 422 + output["body"] = str(e) + + # Log sanitized output to avoid exposing sensitive data + sanitized = {**output, "body": "REDACTED" if output["statusCode"] == 200 else output["body"]} + logger.debug(f"Pyscript handler response: {sanitized}") + return output @staticmethod def pyscript_handler(event, context): - output = { - "statusCode": 200, - "statusDescription": "200 OK", - "isBase64Encoded": False, - "headers": { - "Content-Type": "application/json; charset=utf-8" - }, - "body": None - } - data = event - if isinstance(data, list): - data = {item['name'].lower(): item['value'] for item in data} - try: - response = CallbackUtility.execute_script(data['source_code'], data.get('predefined_objects')) - output["body"] = response - except Exception as e: - logger.exception(e) - output["statusCode"] = 422 - output["body"] = str(e) - - logger.info(output) - return output + return CallbackUtility._process_pyscript_event(event, CallbackUtility.execute_script) @staticmethod def main_pyscript_handler(event, context): - output = { - "statusCode": 200, - "statusDescription": "200 OK", - "isBase64Encoded": False, - "headers": { - "Content-Type": "application/json; charset=utf-8" - }, - "body": None - } - data = event - if isinstance(data, list): - data = {item['name'].lower(): item['value'] for item in data} - try: - response = CallbackUtility.execute_main_pyscript(data['source_code'], data.get('predefined_objects')) - output["body"] = response - except Exception as e: - logger.exception(e) - output["statusCode"] = 422 - output["body"] = str(e) - - logger.info(output) - return output + return CallbackUtility._process_pyscript_event(event, CallbackUtility.execute_main_pyscript)
🧹 Nitpick comments (11)
kairon/shared/pyscript/callback_pyscript_utils.py (3)
154-179
: SMTP connection and error handling improvementsWhile the SMTP connection is properly closed in the finally block, error handling could be improved, and there's no validation of email addresses.
Consider these improvements:
@staticmethod def trigger_email( email: List[str], subject: str, body: str, smtp_url: str, smtp_port: int, sender_email: str, smtp_password: str, smtp_userid: str = None, tls: bool = False, content_type="html", ): """ This is a sync email trigger. Sends an email to the mail id of the recipient ... """ + # Validate email addresses + if not email or not all(isinstance(e, str) and '@' in e for e in email): + raise ValueError("Invalid email address(es)") + smtp = None try: smtp = SMTP(smtp_url, port=smtp_port, timeout=10) smtp.connect(smtp_url, smtp_port) if tls: smtp.starttls() smtp.login(smtp_userid if smtp_userid else sender_email, smtp_password) from_addr = sender_email mime_body = MIMEText(body, content_type) msg = MIMEMultipart("alternative") msg["Subject"] = subject msg["From"] = from_addr msg["To"] = ",".join(email) msg.attach(mime_body) smtp.sendmail(from_addr, email, msg.as_string()) except Exception as e: - print(f"Failed to send email: {e}") - raise + logger.error(f"Failed to send email: {e}") + raise AppException(f"Failed to send email: {e}") from e finally: if smtp: try: smtp.quit() except Exception as quit_error: - print(f"Failed to quit SMTP connection cleanly: {quit_error}") + logger.warning(f"Failed to quit SMTP connection cleanly: {quit_error}")
278-280
: Improve exception handling indecrypt_request
The current exception handling loses the original exception context, making debugging more difficult.
Use the
from
syntax to preserve the original exception:except Exception as e: - raise Exception(f"decryption failed-{str(e)}") + raise Exception(f"decryption failed-{str(e)}") from e🧰 Tools
🪛 Ruff (0.8.2)
279-279: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
302-304
: Improve exception handling inencrypt_response
Similar to the previous issue, this exception handling loses the original exception context.
Use the
from
syntax to preserve the original exception:except Exception as e: - raise Exception(f"encryption failed-{str(e)}") + raise Exception(f"encryption failed-{str(e)}") from e🧰 Tools
🪛 Ruff (0.8.2)
303-303: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
tests/integration_test/services_test.py (1)
4463-4487
: Well-structured test for the new filter functionality.The test properly validates the new collection data filtering endpoint with appropriate assertions. Good approach removing the
_id
field to handle dynamic values in the response comparison.Consider enhancing test coverage with additional test cases:
- Testing with multiple filter conditions
- Testing with non-existent filter values (empty result set)
- Testing with invalid filter formats to verify error handling
kairon/api/app/routers/bot/data.py (2)
284-299
: New collection filtering endpoint with timestamp support.The implementation enhances data querying capabilities, allowing clients to filter collection data by arbitrary JSON filters and timestamp ranges. This complements the existing endpoint and provides more flexibility.
A minor suggestion to improve function parameter consistency:
@router.get("/collection/{collection_name}/filter", response_model=Response) async def get_collection_data_with_timestamp( collection_name: str, - filters = Query(default='{}'), + filters: str = Query(default='{}'), start_time: str = Query(default=None), end_time: str = Query(default=None), current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS), ):This adds type annotation for the
filters
parameter, making it consistent with the other parameters.🧰 Tools
🪛 Ruff (0.8.2)
287-287: Do not perform function call
Query
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable(B008)
290-290: Do not perform function call
Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable(B008)
295-299
: Fix indentation in function arguments for readability.The indentation of the arguments in the
get_collection_data_with_timestamp
call is inconsistent.return {"data": list(cognition_processor.get_collection_data_with_timestamp(bot=current_user.get_bot(), - data_filter=filters, + data_filter=filters, collection_name=collection_name, - start_time=start_time, - end_time=end_time))} + start_time=start_time, + end_time=end_time))}tests/unit_test/data_processor/data_processor_test.py (1)
3519-3609
: Good test implementation with a few enhancement opportunities.This is a well-structured test with good coverage of filtering scenarios. I appreciate the thorough validation of the
get_collection_data_with_timestamp
method using multiple test cases.Some suggestions for improvement:
- Consider adding a cleanup step to remove the test data after the test completes to ensure test isolation.
- The test relies on real-time timestamps which could potentially make it flaky in very fast executions. Consider using fixed time differences instead of
datetime.utcnow()
.- The assertions verify data structure but don't explicitly verify that secure fields are properly handled (encrypted/decrypted). If that's within the scope of this test, consider adding specific assertions.
Here's a potential improvement for the cleanup:
# Add this at the end of the test def tearDown(self): # Clean up test data after test execution processor = CognitionDataProcessor() processor.delete_collection_data("user", "test_bot")kairon/shared/pyscript/shared_pyscript_utils.py (2)
1-9
: Remove unused imports to improve code cleanliness.Several imports are unused in this file, which can cause confusion and maintenance issues.
-from typing import Text, Dict, Callable, List +from typing import Text from typing import Text from loguru import logger from kairon import Utility from kairon.exceptions import AppException from kairon.shared.actions.utils import ActionUtility from kairon.shared.cognition.data_objects import CollectionData from kairon.api.app.routers.bot.data import CognitionDataProcessor🧰 Tools
🪛 Ruff (0.8.2)
1-1:
typing.Dict
imported but unusedRemove unused import
(F401)
1-1:
typing.Callable
imported but unusedRemove unused import
(F401)
1-1:
typing.List
imported but unusedRemove unused import
(F401)
2-2: Redefinition of unused
Text
from line 1Remove definition:
Text
(F811)
92-110
: Consider using try-except for HTTP request handling in delete_schedule_job.The method currently relies on checking the response status, but adding exception handling would make it more robust against network issues or timeouts.
@staticmethod def delete_schedule_job(event_id: Text, bot: Text): if not bot: raise AppException("Missing bot id") if not event_id: raise AppException("Missing event id") logger.info(f"event: {event_id}, bot: {bot}") event_server = Utility.environment['events']['server_url'] - http_response = ActionUtility.execute_http_request( - f"{event_server}/api/events/{event_id}", - "DELETE") - - if not http_response.get("success"): - raise AppException(http_response) - else: - logger.info(http_response) + try: + http_response = ActionUtility.execute_http_request( + f"{event_server}/api/events/{event_id}", + "DELETE") + + if not http_response.get("success"): + raise AppException(http_response) + else: + logger.info(http_response) + except Exception as e: + logger.error(f"Error deleting scheduled job: {e}") + raise AppException(f"Failed to delete scheduled job: {str(e)}")tests/unit_test/callback/pyscript_handler_test.py (2)
1433-1436
: Remove duplicate function definition.The
dummy_uuid7
function is already defined at line 1333. This duplicate definition is unnecessary.-def dummy_uuid7(): - class FakeUUID: - hex = "fixed_uuid" - return FakeUUID()🧰 Tools
🪛 Ruff (0.8.2)
1433-1433: Redefinition of unused
dummy_uuid7
from line 1333(F811)
1481-1481
: Unused variable assignment.The
result
variable is assigned but never used in the function.- result = CallbackScriptUtility.add_schedule_job(schedule_action, date_time, data, tz, _id=_id, bot=bot, kwargs=kwargs_in) + CallbackScriptUtility.add_schedule_job(schedule_action, date_time, data, tz, _id=_id, bot=bot, kwargs=kwargs_in)🧰 Tools
🪛 Ruff (0.8.2)
1481-1481: Local variable
result
is assigned to but never usedRemove assignment to unused variable
result
(F841)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
kairon/api/app/routers/bot/data.py
(1 hunks)kairon/async_callback/router/pyscript_callback.py
(2 hunks)kairon/async_callback/utils.py
(4 hunks)kairon/shared/cognition/processor.py
(2 hunks)kairon/shared/concurrency/actors/pyscript_runner.py
(4 hunks)kairon/shared/concurrency/actors/utils.py
(2 hunks)kairon/shared/data/processor.py
(1 hunks)kairon/shared/pyscript/callback_pyscript_utils.py
(1 hunks)kairon/shared/pyscript/shared_pyscript_utils.py
(1 hunks)tests/integration_test/services_test.py
(1 hunks)tests/unit_test/actors/actors_test.py
(2 hunks)tests/unit_test/callback/pyscript_handler_test.py
(35 hunks)tests/unit_test/data_processor/data_processor_test.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- kairon/shared/concurrency/actors/pyscript_runner.py
🧰 Additional context used
🧬 Code Graph Analysis (4)
kairon/api/app/routers/bot/data.py (4)
kairon/shared/data/data_models.py (1)
Response
(127-131)kairon/shared/cognition/processor.py (1)
get_collection_data_with_timestamp
(288-323)kairon/shared/models.py (2)
User
(68-92)get_bot
(81-82)kairon/shared/auth.py (2)
Authentication
(30-465)get_current_user_and_bot
(83-109)
kairon/shared/cognition/processor.py (2)
kairon/api/app/routers/bot/data.py (1)
get_collection_data_with_timestamp
(285-299)kairon/shared/cognition/data_objects.py (1)
CollectionData
(89-123)
kairon/shared/data/processor.py (1)
kairon/importer/validator/file_validator.py (1)
validate_actions
(1036-1047)
kairon/async_callback/router/pyscript_callback.py (3)
kairon/async_callback/processor.py (2)
async_callback
(85-122)CallbackProcessor
(19-192)kairon/async_callback/utils.py (2)
CallbackUtility
(31-176)main_pyscript_handler
(130-152)kairon/shared/callback/data_objects.py (1)
PyscriptPayload
(441-446)
🪛 Ruff (0.8.2)
kairon/api/app/routers/bot/data.py
287-287: Do not perform function call Query
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
290-290: Do not perform function call Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
kairon/shared/pyscript/shared_pyscript_utils.py
1-1: typing.Dict
imported but unused
Remove unused import
(F401)
1-1: typing.Callable
imported but unused
Remove unused import
(F401)
1-1: typing.List
imported but unused
Remove unused import
(F401)
2-2: Redefinition of unused Text
from line 1
Remove definition: Text
(F811)
kairon/shared/pyscript/callback_pyscript_utils.py
279-279: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
303-303: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
333-333: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
tests/unit_test/callback/pyscript_handler_test.py
1433-1433: Redefinition of unused dummy_uuid7
from line 1333
(F811)
1481-1481: Local variable result
is assigned to but never used
Remove assignment to unused variable result
(F841)
kairon/async_callback/router/pyscript_callback.py
1-1: typing.Dict
imported but unused
Remove unused import
(F401)
1-1: typing.Any
imported but unused
Remove unused import
(F401)
3-3: Redefinition of unused Optional
from line 1
Remove definition: Optional
(F811)
4-4: Redefinition of unused Router
from line 2
Remove definition: Router
(F811)
4-4: Redefinition of unused Request
from line 2
Remove definition: Request
(F811)
4-4: Redefinition of unused BSResponse
from line 2
Remove definition: BSResponse
(F811)
4-4: Redefinition of unused TextContent
from line 2
Remove definition: TextContent
(F811)
4-4: blacksheep.TextContent
imported but unused
Remove unused import: blacksheep.TextContent
(F401)
4-4: Redefinition of unused json
from line 2
Remove definition: json
(F811)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (6)
kairon/shared/data/processor.py (1)
1484-1486
: Improved handling of validation actions in model trainingThe change ensures that all actions prefixed with "validate_" are included in the training data, regardless of whether they appear in story graphs. This is a beneficial enhancement that guarantees validation actions are always available during model training.
kairon/async_callback/router/pyscript_callback.py (1)
96-105
:✅ Verification successful
New endpoint for restricted Python script execution looks good.
This new endpoint provides a clean interface for executing restricted Python code with optional predefined objects, leveraging the existing
CallbackUtility.main_pyscript_handler
. The error handling is appropriate, returning a 500 status with the error message when exceptions occur.To ensure this endpoint integrates well with the existing code, verify that the
CallbackUtility.main_pyscript_handler
method properly sanitizes and executes the Python code:
🏁 Script executed:
#!/bin/bash # Check the implementation of main_pyscript_handler ast-grep --pattern $'def main_pyscript_handler(event, context)'Length of output: 64
It looks like
ast-grep
didn’t locate the handler. Let’s search the entire repo for its definition:
🏁 Script executed:
#!/bin/bash # Locate definition and references of main_pyscript_handler to inspect its implementation rg --color=never -n 'main_pyscript_handler' .Length of output: 1929
Let’s inspect the full implementation of
main_pyscript_handler
to confirm it properly sanitizes and safely executes the payload:
🏁 Script executed:
#!/bin/bash # Show definition and body of main_pyscript_handler for review rg -n -A200 'def main_pyscript_handler' kairon/async_callback/utils.pyLength of output: 2005
Let’s inspect the sandboxing logic in
execute_main_pyscript
to ensure it restricts unsafe operations:
🏁 Script executed:
#!/bin/bash # Show the implementation of execute_main_pyscript rg -n -A200 'def execute_main_pyscript' kairon/async_callback/utils.pyLength of output: 2538
Let’s examine the sandbox runner to confirm it enforces restrictions. We need to locate the
ActorOrchestrator
implementation:
🏁 Script executed:
#!/bin/bash # Find the definition and implementation of ActorOrchestrator rg -n -A100 'class ActorOrchestrator' .Length of output: 1187
To ensure the Python execution is safely sandboxed, let’s locate the actor that runs the script:
🏁 Script executed:
#!/bin/bash # Find ActorFactory to see how pyscript_runner is wired up rg -n -A50 'class ActorFactory' kairon/shared/concurrency # Search for where pyscript_runner is implemented rg -n -A50 'pyscript_runner' .Length of output: 34131
Let’s inspect the sandboxing logic in PyScriptRunner to confirm it enforces import restrictions and safe execution:
🏁 Script executed:
#!/bin/bash # Show PyScriptRunner implementation and its execute method rg -n -A200 'class PyScriptRunner' kairon/shared/concurrency/actors/pyscript_runner.py rg -n -A200 'def execute' kairon/shared/concurrency/actors/pyscript_runner.pyLength of output: 6474
We need to confirm that
compile_restricted
andglobal_safe
are properly imported and initialized for safe execution. Let’s inspect the top ofpyscript_runner.py
:
🏁 Script executed:
#!/bin/bash # Check imports and global_safe definition in PyScriptRunner rg -n -A5 'compile_restricted' kairon/shared/concurrency/actors/pyscript_runner.py rg -n -A5 'global_safe' kairon/shared/concurrency/actors/pyscript_runner.pyLength of output: 3067
Verified: Restricted execution is properly sandboxed and safe
I’ve confirmed that
main_pyscript_handler
routes throughexecute_main_pyscript
intoPyScriptRunner.execute
- Execution uses
RestrictedPython.compile_restricted
with a whitelistedglobal_safe
environment (no unauthorized imports or built-ins)- Timeout is enforced via
timeout_decorator
- Cleanup strips out modules and callables from the returned locals
- Existing unit tests cover import restrictions, API calls, database actions, timeouts, and error handling
No further changes required.
tests/unit_test/callback/pyscript_handler_test.py (2)
2064-2097
: Good test coverage for main_pyscript_handler.The test effectively verifies that the main_pyscript_handler correctly handles requests without bot context and returns the expected error response.
2100-2140
: Comprehensive test for successful operation with bot context.This test successfully verifies the full execution path of the main_pyscript_handler with a valid bot context, ensuring that it processes the delete operation correctly and returns the expected success response.
kairon/async_callback/utils.py (2)
66-82
: LGTM! Improved code modularity by delegating to specialized utility classes.The refactoring that moves functionality to specialized utility classes (
CallbackScriptUtility
andPyscriptSharedUtility
) improves code organization and maintainability. Using partial functions to maintain the context is a clean approach.
96-96
: Content-Type updated to JSON format.Changing the Content-Type header from "text/html" to "application/json" better reflects the actual content being returned.
@staticmethod | ||
def send_waba_message(payload: dict, key: Text, bot: str, predefined_objects: dict): | ||
waba_url = "https://waba-v2.360dialog.io/messages" | ||
headers = {"D360-API-KEY": key, "Content-TYpe": "application/json"} | ||
return requests.post(url=waba_url, headers=headers, data=orjson.dumps(payload)).json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect return value for send_waba_message
The method returns requests.post(...).json
which accesses the json
attribute instead of calling the json()
method to parse the response. Also, there's a typo in the header name.
Apply this diff to fix the issues:
@staticmethod
def send_waba_message(payload: dict, key: Text, bot: str, predefined_objects: dict):
waba_url = "https://waba-v2.360dialog.io/messages"
- headers = {"D360-API-KEY": key, "Content-TYpe": "application/json"}
- return requests.post(url=waba_url, headers=headers, data=orjson.dumps(payload)).json
+ headers = {"D360-API-KEY": key, "Content-Type": "application/json"}
+ return requests.post(url=waba_url, headers=headers, data=orjson.dumps(payload)).json()
Additionally, consider adding error handling for the HTTP request and removing the unused predefined_objects
parameter if it's not needed.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@staticmethod | |
def send_waba_message(payload: dict, key: Text, bot: str, predefined_objects: dict): | |
waba_url = "https://waba-v2.360dialog.io/messages" | |
headers = {"D360-API-KEY": key, "Content-TYpe": "application/json"} | |
return requests.post(url=waba_url, headers=headers, data=orjson.dumps(payload)).json | |
@staticmethod | |
def send_waba_message(payload: dict, key: Text, bot: str, predefined_objects: dict): | |
waba_url = "https://waba-v2.360dialog.io/messages" | |
headers = {"D360-API-KEY": key, "Content-Type": "application/json"} | |
return requests.post(url=waba_url, headers=headers, data=orjson.dumps(payload)).json() |
def test_send_waba_message_success(): | ||
payload = {"to": "12345", "type": "text", "text": {"body": "hello"}} | ||
api_key = "test_api_key" | ||
bot_id = "bot123" | ||
predefined = {"some": "object"} | ||
fake_response = MagicMock() | ||
fake_response.json = {"messages": [{"id": "abc123"}]} | ||
with patch("kairon.shared.concurrency.actors.utils.requests.post", return_value=fake_response) as mock_post: | ||
result = PyscriptUtility.send_waba_message(payload, api_key, bot_id, predefined) | ||
assert result == {"messages": [{"id": "abc123"}]} | ||
|
||
mock_post.assert_called_once_with( | ||
url="https://waba-v2.360dialog.io/messages", | ||
headers={"D360-API-KEY": api_key, "Content-TYpe": "application/json"}, | ||
data=orjson.dumps(payload) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test matches implementation but perpetuates the issue
The test correctly mocks the implementation behavior, but it's testing a flawed design where .json
is treated as an attribute rather than a method.
Update the test to match the correct implementation:
def test_send_waba_message_success():
payload = {"to": "12345", "type": "text", "text": {"body": "hello"}}
api_key = "test_api_key"
bot_id = "bot123"
predefined = {"some": "object"}
fake_response = MagicMock()
- fake_response.json = {"messages": [{"id": "abc123"}]}
+ fake_response.json.return_value = {"messages": [{"id": "abc123"}]}
with patch("kairon.shared.concurrency.actors.utils.requests.post", return_value=fake_response) as mock_post:
result = PyscriptUtility.send_waba_message(payload, api_key, bot_id, predefined)
assert result == {"messages": [{"id": "abc123"}]}
mock_post.assert_called_once_with(
url="https://waba-v2.360dialog.io/messages",
- headers={"D360-API-KEY": api_key, "Content-TYpe": "application/json"},
+ headers={"D360-API-KEY": api_key, "Content-Type": "application/json"},
data=orjson.dumps(payload)
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def test_send_waba_message_success(): | |
payload = {"to": "12345", "type": "text", "text": {"body": "hello"}} | |
api_key = "test_api_key" | |
bot_id = "bot123" | |
predefined = {"some": "object"} | |
fake_response = MagicMock() | |
fake_response.json = {"messages": [{"id": "abc123"}]} | |
with patch("kairon.shared.concurrency.actors.utils.requests.post", return_value=fake_response) as mock_post: | |
result = PyscriptUtility.send_waba_message(payload, api_key, bot_id, predefined) | |
assert result == {"messages": [{"id": "abc123"}]} | |
mock_post.assert_called_once_with( | |
url="https://waba-v2.360dialog.io/messages", | |
headers={"D360-API-KEY": api_key, "Content-TYpe": "application/json"}, | |
data=orjson.dumps(payload) | |
) | |
def test_send_waba_message_success(): | |
payload = {"to": "12345", "type": "text", "text": {"body": "hello"}} | |
api_key = "test_api_key" | |
bot_id = "bot123" | |
predefined = {"some": "object"} | |
fake_response = MagicMock() | |
- fake_response.json = {"messages": [{"id": "abc123"}]} | |
+ fake_response.json.return_value = {"messages": [{"id": "abc123"}]} | |
with patch("kairon.shared.concurrency.actors.utils.requests.post", return_value=fake_response) as mock_post: | |
result = PyscriptUtility.send_waba_message(payload, api_key, bot_id, predefined) | |
assert result == {"messages": [{"id": "abc123"}]} | |
mock_post.assert_called_once_with( | |
url="https://waba-v2.360dialog.io/messages", | |
- headers={"D360-API-KEY": api_key, "Content-TYpe": "application/json"}, | |
+ headers={"D360-API-KEY": api_key, "Content-Type": "application/json"}, | |
data=orjson.dumps(payload) | |
) |
start_time = kwargs.pop("start_time", None) | ||
end_time = kwargs.pop("end_time", None) | ||
data_filter = kwargs.pop("data_filter", {}) if isinstance(kwargs.get("data_filter"), dict) else json.loads( | ||
kwargs.pop("data_filter", "{}")) | ||
|
||
query = {"bot": bot, "collection_name": collection_name} | ||
if start_time: | ||
query["timestamp__gte"] = start_time | ||
if end_time: | ||
query["timestamp__lte"] = end_time | ||
|
||
query.update({ | ||
f"data__{key}": value for key, value in data_filter.items() if key and value | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for JSON parsing and timestamp validation
The method handles string-to-JSON conversion without proper error handling and doesn't validate that timestamp strings are in the correct format.
Add proper error handling and validation:
collection_name = collection_name.lower()
start_time = kwargs.pop("start_time", None)
end_time = kwargs.pop("end_time", None)
- data_filter = kwargs.pop("data_filter", {}) if isinstance(kwargs.get("data_filter"), dict) else json.loads(
- kwargs.pop("data_filter", "{}"))
+ data_filter = kwargs.pop("data_filter", {})
+ if not isinstance(data_filter, dict):
+ try:
+ data_filter = json.loads(data_filter or "{}")
+ except json.JSONDecodeError:
+ raise AppException("Invalid JSON format for data_filter")
query = {"bot": bot, "collection_name": collection_name}
if start_time:
+ try:
+ # Assuming ISO format for timestamps (adjust based on your actual format)
+ start_time = datetime.fromisoformat(start_time)
+ except ValueError:
+ raise AppException("Invalid format for start_time")
query["timestamp__gte"] = start_time
if end_time:
+ try:
+ # Assuming ISO format for timestamps (adjust based on your actual format)
+ end_time = datetime.fromisoformat(end_time)
+ except ValueError:
+ raise AppException("Invalid format for end_time")
query["timestamp__lte"] = end_time
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
start_time = kwargs.pop("start_time", None) | |
end_time = kwargs.pop("end_time", None) | |
data_filter = kwargs.pop("data_filter", {}) if isinstance(kwargs.get("data_filter"), dict) else json.loads( | |
kwargs.pop("data_filter", "{}")) | |
query = {"bot": bot, "collection_name": collection_name} | |
if start_time: | |
query["timestamp__gte"] = start_time | |
if end_time: | |
query["timestamp__lte"] = end_time | |
query.update({ | |
f"data__{key}": value for key, value in data_filter.items() if key and value | |
}) | |
collection_name = collection_name.lower() | |
start_time = kwargs.pop("start_time", None) | |
end_time = kwargs.pop("end_time", None) | |
- data_filter = kwargs.pop("data_filter", {}) if isinstance(kwargs.get("data_filter"), dict) else json.loads( | |
- kwargs.pop("data_filter", "{}")) | |
+ data_filter = kwargs.pop("data_filter", {}) | |
+ if not isinstance(data_filter, dict): | |
+ try: | |
+ data_filter = json.loads(data_filter or "{}") | |
+ except json.JSONDecodeError: | |
+ raise AppException("Invalid JSON format for data_filter") | |
query = {"bot": bot, "collection_name": collection_name} | |
if start_time: | |
+ try: | |
+ # Assuming ISO format for timestamps (adjust based on your actual format) | |
+ start_time = datetime.fromisoformat(start_time) | |
+ except ValueError: | |
+ raise AppException("Invalid format for start_time") | |
query["timestamp__gte"] = start_time | |
if end_time: | |
+ try: | |
+ # Assuming ISO format for timestamps (adjust based on your actual format) | |
+ end_time = datetime.fromisoformat(end_time) | |
+ except ValueError: | |
+ raise AppException("Invalid format for end_time") | |
query["timestamp__lte"] = end_time | |
query.update({ | |
f"data__{key}": value for key, value in data_filter.items() if key and value | |
}) |
@staticmethod | ||
def save_as_pdf(text: str, bot: str, sender_id:str): | ||
try: | ||
_, media_id = UserMedia.save_markdown_as_pdf( | ||
bot=bot, | ||
sender_id=sender_id, | ||
text=text, | ||
filepath="report.pdf" | ||
) | ||
return media_id | ||
except Exception as e: | ||
raise Exception(f"encryption failed-{str(e)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix error message and exception handling in save_as_pdf
The exception message incorrectly says "encryption failed" but this method does PDF generation. Also, the exception handling doesn't preserve the original exception context.
Fix the error message and exception handling:
@staticmethod
def save_as_pdf(text: str, bot: str, sender_id:str):
try:
_, media_id = UserMedia.save_markdown_as_pdf(
bot=bot,
sender_id=sender_id,
text=text,
filepath="report.pdf"
)
return media_id
except Exception as e:
- raise Exception(f"encryption failed-{str(e)}")
+ raise Exception(f"PDF generation failed: {str(e)}") from e
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@staticmethod | |
def save_as_pdf(text: str, bot: str, sender_id:str): | |
try: | |
_, media_id = UserMedia.save_markdown_as_pdf( | |
bot=bot, | |
sender_id=sender_id, | |
text=text, | |
filepath="report.pdf" | |
) | |
return media_id | |
except Exception as e: | |
raise Exception(f"encryption failed-{str(e)}") | |
@staticmethod | |
def save_as_pdf(text: str, bot: str, sender_id:str): | |
try: | |
_, media_id = UserMedia.save_markdown_as_pdf( | |
bot=bot, | |
sender_id=sender_id, | |
text=text, | |
filepath="report.pdf" | |
) | |
return media_id | |
except Exception as e: | |
raise Exception(f"PDF generation failed: {str(e)}") from e |
🧰 Tools
🪛 Ruff (0.8.2)
333-333: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
from typing import Optional, Dict, Any | ||
from blacksheep import Router, Request, Response as BSResponse, TextContent, json | ||
from typing import Optional | ||
from blacksheep import Router, Request, Response as BSResponse, TextContent | ||
from blacksheep import Router, Request, Response as BSResponse, TextContent, json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Clean up duplicate and unused imports.
The imports at the beginning of the file contain duplications and unused items that should be cleaned up.
-from typing import Optional, Dict, Any
-from blacksheep import Router, Request, Response as BSResponse, TextContent, json
-from typing import Optional
-from blacksheep import Router, Request, Response as BSResponse, TextContent, json
+from typing import Optional
+from blacksheep import Router, Request, Response as BSResponse, json
from blacksheep.contents import JSONContent
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from typing import Optional, Dict, Any | |
from blacksheep import Router, Request, Response as BSResponse, TextContent, json | |
from typing import Optional | |
from blacksheep import Router, Request, Response as BSResponse, TextContent | |
from blacksheep import Router, Request, Response as BSResponse, TextContent, json | |
from typing import Optional | |
from blacksheep import Router, Request, Response as BSResponse, json | |
from blacksheep.contents import JSONContent |
🧰 Tools
🪛 Ruff (0.8.2)
1-1: typing.Dict
imported but unused
Remove unused import
(F401)
1-1: typing.Any
imported but unused
Remove unused import
(F401)
3-3: Redefinition of unused Optional
from line 1
Remove definition: Optional
(F811)
4-4: Redefinition of unused Router
from line 2
Remove definition: Router
(F811)
4-4: Redefinition of unused Request
from line 2
Remove definition: Request
(F811)
4-4: Redefinition of unused BSResponse
from line 2
Remove definition: BSResponse
(F811)
4-4: Redefinition of unused TextContent
from line 2
Remove definition: TextContent
(F811)
4-4: blacksheep.TextContent
imported but unused
Remove unused import: blacksheep.TextContent
(F401)
4-4: Redefinition of unused json
from line 2
Remove definition: json
(F811)
def execute_main_pyscript(source_code: Text, predefined_objects: Dict = None): | ||
logger.info(source_code) | ||
logger.info(predefined_objects) | ||
|
||
collection_id = cognition_processor.save_collection_data(payload, user, bot) | ||
return { | ||
"message": "Record saved!", | ||
"data": {"_id": collection_id} | ||
} | ||
if not predefined_objects: | ||
predefined_objects = {} | ||
|
||
@staticmethod | ||
def update_data(collection_id: str, user: str, payload: dict, bot: str = None): | ||
if not bot: | ||
raise Exception("Missing bot id") | ||
|
||
collection_id = cognition_processor.update_collection_data(collection_id, payload, user, bot) | ||
return { | ||
"message": "Record updated!", | ||
"data": {"_id": collection_id} | ||
} | ||
script_variables = ActorOrchestrator.run( | ||
ActorType.pyscript_runner.value, source_code=source_code, timeout=60, | ||
predefined_objects=predefined_objects | ||
) | ||
return script_variables | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Consider adding essential predefined objects to execute_main_pyscript.
The execute_main_pyscript
method doesn't add any predefined utility functions, unlike execute_script
. This inconsistency might cause scripts to fail when executed through the main_pyscript_handler.
Are the predefined objects intentionally omitted from execute_main_pyscript
, or should they be present for consistency? If they should be available, consider adding them or documenting the difference.
🏁 Script executed:
#!/bin/bash
# Check whether any script files reference these omitted utility functions that
# would fail when called through main_pyscript_handler
echo "Searching for references to utility functions in pyscript files..."
rg "add_schedule_job|delete_schedule_job|send_email|add_data|get_data|delete_data|update_data|generate_id|datetime_to_utc_timestamp|decrypt_request|encrypt_response|create_callback|save_as_pdf" --type py
Length of output: 31778
🏁 Script executed:
#!/bin/bash
# Find line numbers of the function definitions
main_line=$(grep -n "def execute_main_pyscript" kairon/async_callback/utils.py | cut -d: -f1)
script_line=$(grep -n "def execute_script" kairon/async_callback/utils.py | cut -d: -f1)
echo "=== execute_main_pyscript around line $main_line ==="
sed -n "$((main_line-3)),$((main_line+15))p" kairon/async_callback/utils.py
echo
echo "=== execute_script around line $script_line ==="
sed -n "$((script_line-3)),$((script_line+25))p" kairon/async_callback/utils.py
Length of output: 2706
🏁 Script executed:
#!/bin/bash
# Print lines 115-200 to inspect main_pyscript_handler
sed -n '115,200p' kairon/async_callback/utils.py
Length of output: 2258
🏁 Script executed:
#!/bin/bash
# Show file header to find class name and imports
sed -n '1,60p' kairon/async_callback/utils.py
# Show where CallbackUtility is imported or defined
grep -R "class .*Callback" -n kairon/async_callback/utils.py
Length of output: 2309
Add missing predefined objects in execute_main_pyscript
for consistency
Unlike execute_script
, execute_main_pyscript
never injects the standard pyscript utilities, so any script that calls e.g. add_schedule_job
, get_data
or uses requests
, json
, etc., will fail under the main handler. You should replicate the injection block from execute_script
into execute_main_pyscript
before calling ActorOrchestrator.run
.
Locations to update:
- File:
kairon/async_callback/utils.py
- Method:
execute_main_pyscript
(around line 115)
Example diff:
def execute_main_pyscript(source_code: Text, predefined_objects: Dict = None):
logger.info(source_code)
logger.info(predefined_objects)
if not predefined_objects:
predefined_objects = {}
+ # --- begin pyscript utils injection (same as execute_script) ---
+ bot = predefined_objects.get("bot")
+ predefined_objects['_getattr_'] = safer_getattr
+ predefined_objects['requests'] = requests
+ predefined_objects['json'] = json
+ predefined_objects['datetime'] = datetime
+ predefined_objects['add_schedule_job'] = partial(CallbackScriptUtility.add_schedule_job, bot=bot)
+ predefined_objects['delete_schedule_job'] = partial(PyscriptSharedUtility.delete_schedule_job, bot=bot)
+ predefined_objects['send_email'] = partial(CallbackScriptUtility.send_email, bot=bot)
+ predefined_objects['add_data'] = partial(PyscriptSharedUtility.add_data, bot=bot)
+ predefined_objects['get_data'] = partial(PyscriptSharedUtility.get_data, bot=bot)
+ predefined_objects['delete_data'] = partial(PyscriptSharedUtility.delete_data, bot=bot)
+ predefined_objects['update_data'] = partial(PyscriptSharedUtility.update_data, bot=bot)
+ predefined_objects['generate_id'] = CallbackScriptUtility.generate_id
+ predefined_objects['datetime_to_utc_timestamp'] = CallbackScriptUtility.datetime_to_utc_timestamp
+ predefined_objects['decrypt_request'] = CallbackScriptUtility.decrypt_request
+ predefined_objects['encrypt_response'] = CallbackScriptUtility.encrypt_response
+ predefined_objects['create_callback'] = partial(CallbackScriptUtility.create_callback, bot=bot)
+ predefined_objects['save_as_pdf'] = partial(CallbackScriptUtility.save_as_pdf, bot=bot)
+ # --- end pyscript utils injection ---
script_variables = ActorOrchestrator.run(
ActorType.pyscript_runner.value, source_code=source_code, timeout=60,
predefined_objects=predefined_objects
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def execute_main_pyscript(source_code: Text, predefined_objects: Dict = None): | |
logger.info(source_code) | |
logger.info(predefined_objects) | |
collection_id = cognition_processor.save_collection_data(payload, user, bot) | |
return { | |
"message": "Record saved!", | |
"data": {"_id": collection_id} | |
} | |
if not predefined_objects: | |
predefined_objects = {} | |
@staticmethod | |
def update_data(collection_id: str, user: str, payload: dict, bot: str = None): | |
if not bot: | |
raise Exception("Missing bot id") | |
collection_id = cognition_processor.update_collection_data(collection_id, payload, user, bot) | |
return { | |
"message": "Record updated!", | |
"data": {"_id": collection_id} | |
} | |
script_variables = ActorOrchestrator.run( | |
ActorType.pyscript_runner.value, source_code=source_code, timeout=60, | |
predefined_objects=predefined_objects | |
) | |
return script_variables | |
def execute_main_pyscript(source_code: Text, predefined_objects: Dict = None): | |
logger.info(source_code) | |
logger.info(predefined_objects) | |
if not predefined_objects: | |
predefined_objects = {} | |
# --- begin pyscript utils injection (same as execute_script) --- | |
bot = predefined_objects.get("bot") | |
predefined_objects['_getattr_'] = safer_getattr | |
predefined_objects['requests'] = requests | |
predefined_objects['json'] = json | |
predefined_objects['datetime'] = datetime | |
predefined_objects['add_schedule_job'] = partial(CallbackScriptUtility.add_schedule_job, bot=bot) | |
predefined_objects['delete_schedule_job'] = partial(PyscriptSharedUtility.delete_schedule_job, bot=bot) | |
predefined_objects['send_email'] = partial(CallbackScriptUtility.send_email, bot=bot) | |
predefined_objects['add_data'] = partial(PyscriptSharedUtility.add_data, bot=bot) | |
predefined_objects['get_data'] = partial(PyscriptSharedUtility.get_data, bot=bot) | |
predefined_objects['delete_data'] = partial(PyscriptSharedUtility.delete_data, bot=bot) | |
predefined_objects['update_data'] = partial(PyscriptSharedUtility.update_data, bot=bot) | |
predefined_objects['generate_id'] = CallbackScriptUtility.generate_id | |
predefined_objects['datetime_to_utc_timestamp'] = CallbackScriptUtility.datetime_to_utc_timestamp | |
predefined_objects['decrypt_request'] = CallbackScriptUtility.decrypt_request | |
predefined_objects['encrypt_response'] = CallbackScriptUtility.encrypt_response | |
predefined_objects['create_callback'] = partial(CallbackScriptUtility.create_callback, bot=bot) | |
predefined_objects['save_as_pdf'] = partial(CallbackScriptUtility.save_as_pdf, bot=bot) | |
# --- end pyscript utils injection --- | |
script_variables = ActorOrchestrator.run( | |
ActorType.pyscript_runner.value, | |
source_code=source_code, | |
timeout=60, | |
predefined_objects=predefined_objects | |
) | |
return script_variables |
…est cases for the same
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Tests