Developed an end-to-end catalog ingestion flow from Petpooja(POS) to Meta #1932
…Meta and added corresponding test cases
Walkthrough

This update introduces a comprehensive catalog synchronization and POS integration framework. It adds new API endpoints for catalog sync and POS integration configuration, implements provider-specific sync logic (with Petpooja as an initial provider), and integrates catalog data with both the knowledge vault (AI) and Meta (Facebook) catalog systems. The change includes new data models, logging mechanisms, provider mapping configurations, and email notification templates for sync status. Extensive validation, error handling, and batch processing are implemented for catalog data operations. The update is accompanied by robust unit and integration test coverage for all major workflows.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Provider as Catalog Provider (e.g., Petpooja)
    participant API as Kairon API
    participant Event as CatalogSync Event
    participant Log as CatalogSyncLogProcessor
    participant AI as CognitionDataProcessor
    participant Meta as MetaProcessor
    participant Mail as MailUtility
    Provider->>API: POST /api/bot/integration/{provider}/{sync_type}/{bot}/{token}
    API->>Event: Validate payload and sync status
    Event->>Log: Log sync initiation
    Event->>Event: Preprocess data (provider-specific)
    Event->>AI: Validate and upsert data (if AI enabled)
    Event->>Meta: Sync with Meta catalog (if meta enabled)
    Event->>Log: Log sync completion/failure
    Event->>Mail: Send sync status email
    API-->>Provider: Respond with sync in progress message
```
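The flow above can be sketched as plain orchestration code. Everything below is an illustrative stand-in, not the real kairon API: the function and log messages are hypothetical, and each step merely marks where the real components (CatalogSyncLogProcessor, CognitionDataProcessor, MetaProcessor, MailUtility) would be invoked.

```python
# Minimal sketch of the catalog-sync flow from the sequence diagram.
# All names here are illustrative, not the actual kairon implementations.

def run_catalog_sync(payload, ai_enabled=True, meta_enabled=True):
    logs = []                                   # stands in for CatalogSyncLogProcessor
    logs.append("sync initiated")
    try:
        if not payload:                         # stands in for payload/sync-status validation
            raise ValueError("empty payload")
        data = [{"item": item} for item in payload]   # provider-specific preprocess
        if ai_enabled:
            logs.append(f"upserted {len(data)} rows to knowledge vault")
        if meta_enabled:
            logs.append(f"synced {len(data)} items to Meta catalog")
        logs.append("sync completed")
    except ValueError as exc:
        logs.append(f"sync failed: {exc}")
    logs.append("status email sent")            # stands in for MailUtility
    return logs

print(run_catalog_sync(["dosa", "idli"]))
```

Note the failure path still records a log entry and sends the status email, mirroring the "Log sync completion/failure" and "Send sync status email" steps in the diagram.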
Actionable comments posted: 28
🔭 Outside diff range comments (2)
kairon/shared/data/data_objects.py (1)
1106-1120: 🛠️ Refactor suggestion

`BotSyncConfig` lacks validation ensuring at least one of AI or Meta sync is enabled. An empty config will silently pass and block downstream checks. Consider:

```python
def validate(self, clean=True):
    if clean:
        self.clean()
    if not self.ai_enabled and not self.meta_enabled:
        raise ValidationError("Either ai_enabled or meta_enabled must be True")
```

kairon/shared/cognition/processor.py (1)
523-539: ⚠️ Potential issue

Fix column-validation for `item_toggle` – current logic rejects perfectly valid payloads.

When `sync_type == VaultSyncType.item_toggle.name` we should allow any subset of columns besides the primary key, and we must not fail on column-length mismatch. However, the current code still:

- Performs the `column_length_mismatch` check (lines 524-529).
- Uses `set(row.keys()).issubset(column_dict.keys())`, which does not explicitly guarantee the presence of the primary key.

As a result, legitimate partial-update rows will be reported as invalid.

```diff
@@
-            if "column_length_mismatch" in event_validations:
-                if len(row.keys()) != len(column_dict.keys()):
-                    row_errors.append({
-                        "status": "Column length mismatch",
-                        "expected_columns": list(column_dict.keys()),
-                        "actual_columns": list(row.keys())
-                    })
+            # In item-toggle we expect a sparse update – skip strict length check
+            if "column_length_mismatch" in event_validations and sync_type != VaultSyncType.item_toggle.name:
+                if len(row.keys()) != len(column_dict.keys()):
+                    row_errors.append({
+                        "status": "Column length mismatch",
+                        "expected_columns": list(column_dict.keys()),
+                        "actual_columns": list(row.keys())
+                    })
@@
-            if not set(row.keys()).issubset(set(column_dict.keys())):
+            valid_columns = {primary_key_col} | set(column_dict.keys())
+            if not set(row.keys()).issubset(valid_columns) or primary_key_col not in row:
                 row_errors.append({
                     "status": "Invalid columns in input data",
                     "expected_columns": expected_columns,
                     "actual_columns": list(row.keys())
                 })
```

🧰 Tools
🪛 Ruff (0.8.2)

523-524: Use a single `if` statement instead of nested `if` statements (SIM102)

533-533: Use `key in dict` instead of `key in dict.keys()`; remove `.keys()` (SIM118)
♻️ Duplicate comments (1)
kairon/shared/catalog_sync/catalog_sync_log_processor.py (1)

76-85: 🛠️ Refactor suggestion

Same `MultipleObjectsReturned` risk in `is_sync_in_progress`.

Replicate the defensive pattern suggested for `add_log`; otherwise the "already in progress" guard itself can explode.
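The "defensive pattern" referenced here comes from a sibling comment on `add_log` that is not shown in this excerpt; the gist is to handle zero, one, or many matching log rows instead of assuming exactly one, so duplicates never trigger a `MultipleObjectsReturned`-style error. A library-agnostic sketch (plain dicts stand in for the mongoengine log documents; the statuses and field names are illustrative):

```python
# Illustrative only: a guard that tolerates zero, one, or many
# "in progress" log rows instead of assuming uniqueness.

ACTIVE_STATUSES = ("INITIATED", "VALIDATING", "SAVE")

def is_sync_in_progress(logs):
    """logs: list of dicts with 'status' and 'start_timestamp' keys."""
    in_progress = [entry for entry in logs if entry["status"] in ACTIVE_STATUSES]
    if not in_progress:
        return False
    # With duplicates, act on the most recent row rather than raising.
    latest = max(in_progress, key=lambda entry: entry["start_timestamp"])
    return latest["status"] in ACTIVE_STATUSES

logs = [
    {"status": "COMPLETED", "start_timestamp": 1},
    {"status": "INITIATED", "start_timestamp": 2},
    {"status": "INITIATED", "start_timestamp": 3},  # duplicate row: no error raised
]
```

With mongoengine the equivalent move is querying a queryset (ordered by timestamp, take the first) rather than calling a single-object getter that raises on duplicates.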
🧹 Nitpick comments (31)
kairon/shared/utils.py (1)
2670-2683: Email handler method for catalog sync status notifications correctly implemented.

The `__handle_catalog_sync_status` method follows the same pattern as other email handlers in the `MailUtility` class, replacing template placeholders with actual values from the kwargs. Consider adding type hints to the method parameters for consistency with other methods in the class:

```diff
-    def __handle_catalog_sync_status(**kwargs):
+    def __handle_catalog_sync_status(**kwargs) -> tuple[str, str]:
```

kairon/catalog_sync/definitions/base.py (1)
4-18: Abstract base class provides a clear interface for implementing catalog sync providers.

The class establishes a standard contract with three required methods: validate, preprocess, and execute. This ensures consistency across different catalog sync implementations.

Consider making the error message more specific to indicate which method is not implemented:

```diff
     def validate(self):
-        raise NotImplementedError("Provider not implemented")
+        raise NotImplementedError("validate() method not implemented for this provider")

     def preprocess(self):
-        raise NotImplementedError("Provider not implemented")
+        raise NotImplementedError("preprocess() method not implemented for this provider")

     def execute(self, **kwargs):
-        raise NotImplementedError("Provider not implemented")
+        raise NotImplementedError("execute() method not implemented for this provider")
```

kairon/catalog_sync/definitions/factory.py (1)
6-29: Clean factory implementation with proper error handling.

The factory provides a clean way to instantiate the correct catalog sync implementation based on provider name.

Improve exception chaining for better debugging by using the `from` syntax:

```diff
     try:
         provider_enum = CatalogSyncClass(provider.lower())
     except ValueError as err:
         valid_syncs = [sync.value for sync in CatalogSyncClass]
-        raise AppException(f"'{provider}' is not a valid catalog sync provider. Accepted types: {valid_syncs}")
+        raise AppException(f"'{provider}' is not a valid catalog sync provider. Accepted types: {valid_syncs}") from err
```

🧰 Tools

🪛 Ruff (0.8.2)

23-23: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
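For context on the `raise ... from err` suggestion: chaining stores the original exception on `__cause__`, which is what distinguishes a deliberate re-raise from an error that happened while handling another. A self-contained illustration (the `AppException` and `parse_provider` names here are stand-ins, not the kairon classes):

```python
# Demonstrates why Ruff B904 asks for `raise ... from err`:
# the original lookup error stays reachable via __cause__.

class AppException(Exception):
    pass

def parse_provider(provider):
    providers = {"petpooja": "PetpoojaSync"}   # hypothetical registry
    try:
        return providers[provider.lower()]
    except KeyError as err:
        # Chained re-raise: tracebacks show the KeyError as the direct cause.
        raise AppException(f"'{provider}' is not a valid provider") from err

try:
    parse_provider("unknown")
except AppException as exc:
    assert isinstance(exc.__cause__, KeyError)   # original error preserved
```

Without `from err`, Python still records the KeyError on `__context__`, but the traceback reads "During handling of the above exception, another exception occurred", which misleadingly suggests a bug in the error handler itself.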
kairon/shared/data/constant.py (1)
114-130: Consider relationship with existing EVENT_STATUS enum.

The new `SYNC_STATUS` enum has several values that overlap with the existing `EVENT_STATUS` enum (lines 93-111), such as "INITIATED", "VALIDATING", "SAVE", "COMPLETED", "FAILED", and "ABORTED". Consider whether these could be consolidated or if there's a specific reason to maintain them as separate enumerations.

If there's a strong reason to keep them separate, consider adding a class docstring to both enums explaining their distinct purposes. If they could be consolidated, consider refactoring to use a single shared event status enum with additional catalog-specific statuses as needed.
metadata/catalog_provider_mappings.json (2)
25-31: Consider making default URL and brand configurable per implementation.

The default URL ("https://www.kairon.com/") and brand ("Sattva") values are hardcoded. Consider making these configurable per implementation or bot instance, as they may not be appropriate for all users of the system. This could be implemented by adding bot-specific configuration options in the database or allowing these values to be specified during integration setup.

50-53: Make facebook_product_category configurable based on business type.

The default value for `facebook_product_category` is set to "Food and drink > General", which is appropriate for restaurants but might not suit other business types. Consider making this configurable based on the business category. You could implement a mapping of business types to appropriate Facebook product categories, or allow this to be specified during integration setup.
kairon/shared/catalog_sync/data_objects.py (1)
3-4: Remove unused imports.

The imports `BooleanField` and `ListField` are declared but not used in the file.

```diff
-from mongoengine import StringField, BooleanField, DateTimeField, DynamicDocument, DictField, ListField
+from mongoengine import StringField, DateTimeField, DynamicDocument, DictField
```

🧰 Tools

🪛 Ruff (0.8.2)

3-3: `mongoengine.BooleanField` imported but unused. Remove unused import (F401)

3-3: `mongoengine.ListField` imported but unused. Remove unused import (F401)
kairon/api/app/routers/bot/data.py (3)
2-2: Remove unused import.

The `Text` import from typing is not being used in this file.

```diff
-from typing import List, Text
+from typing import List
```

🧰 Tools

🪛 Ruff (0.8.2)

2-2: `typing.Text` imported but unused. Remove unused import: `typing.Text` (F401)

4-4: Remove unused import.

The `Path` import from fastapi is not being used in this file.

```diff
-from fastapi import UploadFile, File, Security, APIRouter, Query, HTTPException, Path
+from fastapi import UploadFile, File, Security, APIRouter, Query, HTTPException
```

🧰 Tools

🪛 Ruff (0.8.2)

4-4: `fastapi.Path` imported but unused. Remove unused import: `fastapi.Path` (F401)
374-395: Well-implemented POS integration endpoint.

This new endpoint for adding POS integration configurations properly validates the provider against the allowed catalog sync classes, loads provider mappings, and returns the integration endpoint URL.

A few minor improvements to consider:

- Add a more descriptive docstring beyond "Add data integration config"
- Avoid calling Security in argument defaults

```diff
 @router.post("/integrations/add", response_model=Response)
 async def add_pos_integration_config(
         request_data: POSIntegrationRequest,
         sync_type: str,
-        current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS),
+        current_user: User = Depends(lambda: Authentication.get_current_user_and_bot(scopes=DESIGNER_ACCESS)),
 ):
     """
-    Add data integration config
+    Add Point of Sale (POS) integration configuration.
+
+    This endpoint validates the provider, adds bot sync configuration,
+    and saves the POS integration configuration.
+
+    Parameters:
+        request_data: POS integration request data
+        sync_type: Type of synchronization
+        current_user: Authenticated user with bot access
+
+    Returns:
+        Integration endpoint URL
     """
```

🧰 Tools

🪛 Ruff (0.8.2)

379-379: Do not perform function call `Security` in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)
template/emails/catalog_sync_status.html (1)
115-115: Update preheader text for catalog sync emails.

The preheader text (hidden text shown in email previews) is not specific to catalog sync status emails and appears to be a generic welcome message.

```diff
-      We're thrilled to have you on kAIron! Get ready to dive into your first bot.
+      Catalog synchronization status update for your kAIron bot.
```

kairon/api/app/routers/bot/integrations.py (1)

10-15: Remove unused imports & variables to silence Ruff F401 and cut dead code.

`MailUtility` (l. 15) and the `cognition_processor` instance (l. 18) are never referenced in this module. Please delete them (or use them) to avoid lint failures and keep the router lean.

```diff
-from kairon.shared.utils import MailUtility
-
-cognition_processor = CognitionDataProcessor()
```

Also applies to: 17-18
🧰 Tools

🪛 Ruff (0.8.2)

15-15: `kairon.shared.utils.MailUtility` imported but unused. Remove unused import: `kairon.shared.utils.MailUtility` (F401)
tests/unit_test/data_processor/catalog_sync_log_processor_test.py (2)
7-15: Prune unused imports to keep the test suite tidy.

`uni_escape_re`, `AppException`, and `CognitionDataProcessor` are never used and trigger Ruff F401.

```diff
-from babel.messages.jslexer import uni_escape_re
-from kairon.exceptions import AppException
-from kairon.shared.cognition.processor import CognitionDataProcessor
```

🧰 Tools

🪛 Ruff (0.8.2)

7-7: `babel.messages.jslexer.uni_escape_re` imported but unused (F401)

11-11: `kairon.exceptions.AppException` imported but unused (F401)

15-15: `kairon.shared.cognition.processor.CognitionDataProcessor` imported but unused (F401)

24-27: Environment variable should be upper-snake-case.

Ruff SIM112 warns on `system_file`; rename to `SYSTEM_FILE` to follow convention and avoid conflicts.

```diff
-os.environ["system_file"] = "./tests/testing_data/system.yaml"
+os.environ["SYSTEM_FILE"] = "./tests/testing_data/system.yaml"
```

🧰 Tools

🪛 Ruff (0.8.2)

24-24: Use capitalized environment variable `SYSTEM_FILE` instead of `system_file` (SIM112)
kairon/meta/processor.py (3)
4-10: Remove unused imports to keep the module clean and avoid Ruff F401.

`urljoin`, `Utility`, and `AioRestClient` are never referenced.

```diff
-from urllib.parse import urljoin
-from kairon import Utility
-from kairon.shared.rest_client import AioRestClient
```

🧰 Tools

🪛 Ruff (0.8.2)

4-4: `urllib.parse.urljoin` imported but unused (F401)

7-7: `kairon.Utility` imported but unused (F401)

9-9: `kairon.shared.rest_client.AioRestClient` imported but unused (F401)

59-60: Shadowing built-in `id` in list-comprehension.

Not breaking, but avoid overriding built-ins for readability:

```diff
-return [{"retailer_id": id, "method": "DELETE"} for id in remaining_ids]
+return [{"retailer_id": rid, "method": "DELETE"} for rid in remaining_ids]
```

115-119: Replace `print` with the logger.

`loguru` is already imported; use it for consistency and traceability.

```diff
-print("Response JSON:", response.json())
-print("Successfully deleted data from meta.")
+logger.info("Meta delete response: {}", response.json())
```

kairon/shared/data/data_objects.py (1)
931-935: `catalog_sync_limit_per_day` added – update validation or docs to reflect the new quota.

There is no validation ensuring the value is positive/non-zero; consider:

```python
if self.catalog_sync_limit_per_day <= 0:
    raise ValidationError("catalog_sync_limit_per_day must be positive")
```

kairon/events/definitions/petpooja_sync.py (3)
1-14: Prune unused imports (Ruff F401).

`set_key`, `Utility`, `EventClass`, `BotSyncConfig`, and `AppException` aren't used.

```diff
-from dotenv import set_key
-from kairon import Utility
-from kairon.exceptions import AppException
-from kairon.shared.constants import EventClass
-from kairon.shared.data.data_objects import BotSyncConfig
```

🧰 Tools

🪛 Ruff (0.8.2)

3-3: `dotenv.set_key` imported but unused (F401)

5-5: `kairon.Utility` imported but unused (F401)

6-6: `loguru.logger` imported but unused (F401)

9-9: `kairon.exceptions.AppException` imported but unused (F401)

12-12: `kairon.shared.constants.EventClass` imported but unused (F401)

14-14: `kairon.shared.data.data_objects.BotSyncConfig` imported but unused (F401)

166-173: Blocking HTTP calls inside the event loop – wrap `MetaProcessor` methods with `await` but ensure they are non-blocking.

`MetaProcessor.*_meta_catalog` uses `asyncio.to_thread`, which is fine, but if the processor itself grows, consider making these calls through an aiohttp client instead of threads.
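The `asyncio.to_thread` pattern mentioned here keeps a blocking call (such as a synchronous HTTP request) off the event loop by running it in a worker thread. A minimal stand-alone sketch; the function names are hypothetical and `time.sleep` stands in for network latency:

```python
import asyncio
import time

def blocking_meta_call(items):
    # Stands in for a synchronous HTTP request to the Meta catalog API.
    time.sleep(0.05)
    return {"synced": len(items)}

async def sync_to_meta(items):
    # Runs the blocking call in a worker thread so the event loop stays free
    # to serve other requests while the "HTTP call" is in flight.
    return await asyncio.to_thread(blocking_meta_call, items)

result = asyncio.run(sync_to_meta(["dosa", "idli"]))
```

The aiohttp alternative the reviewer mentions removes the thread entirely by making the HTTP call itself awaitable, which scales better than a thread per blocking call as the processor grows.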
175-181: Use logger instead of `print`.

Keeps log aggregation consistent.

```diff
-print(str(e))
+logger.exception(str(e))
```

tests/integration_test/services_test.py (3)

1-4: Remove unused std-lib imports.

`os`, `re`, and `shutil` are imported but never referenced in this module. Unused imports slow down import time and add cognitive overhead.

```diff
-import asyncio
-import os
-import re
-import shutil
+import asyncio
```

3808-3811: `token` assigned but never used – clean up or use.

Multiple tests store `token = sync_url.split(str(pytest.bot) + "/")[1]` but never reference it afterwards, triggering Ruff F841. Either remove the assignment:

```diff
-    token = sync_url.split(str(pytest.bot) + "/")[1]
```

or use `token` in subsequent assertions to validate the rerun URL construction.

Also applies to: 3936-3938, 4570-4573

🧰 Tools

🪛 Ruff (0.8.2)

3810-3810: Local variable `token` is assigned to but never used. Remove assignment to unused variable `token` (F841)
3157-5039: High duplication – consider parameterised fixtures.

Nearly every catalog-sync test block repeats the same setup:

- Saving the same `LLMSecret` objects
- Creating identical Petpooja payloads
- Posting `/integrations/add` and extracting `sync_url`
- Mocking the same methods

This inflates the test file (~5 k lines) and slows execution. Using

```python
@pytest.fixture
def petpooja_setup(...):
    ...
    yield sync_url
```

and `@pytest.mark.parametrize` over `(sync_type, ai_enabled, meta_enabled, delete_data, images_mode, expect_failure, …)` would reduce code duplication dramatically and make maintenance easier.

🧰 Tools

🪛 Ruff (0.8.2)

3810-3810: Local variable `token` is assigned to but never used (F841)

3937-3937: Local variable `token` is assigned to but never used (F841)

4572-4572: Local variable `token` is assigned to but never used (F841)
kairon/shared/catalog_sync/catalog_sync_log_processor.py (2)
1-17: Remove unused imports to satisfy Ruff and avoid dead code.

`json`, `List`, `ColumnMetadata`, `ContentValidationLogs`, and `MongoProcessor` are brought in but never used. Ruff (F401) already flags these; leaving them will break CI or at least keep warnings around.

```diff
-import json
-from typing import List
-...
-from kairon.shared.cognition.data_objects import CognitionSchema, ColumnMetadata, CollectionData
+from kairon.shared.cognition.data_objects import CognitionSchema, CollectionData
-from kairon.shared.content_importer.data_objects import ContentValidationLogs
-...
-from kairon.shared.data.processor import MongoProcessor
```

Cleaning them out keeps the module lean and the import graph accurate.

🧰 Tools

🪛 Ruff (0.8.2)

1-1: `json` imported but unused (F401)

3-3: `typing.List` imported but unused (F401)

9-9: `kairon.shared.cognition.data_objects.ColumnMetadata` imported but unused (F401)

11-11: `kairon.shared.content_importer.data_objects.ContentValidationLogs` imported but unused (F401)

15-15: `kairon.shared.data.processor.MongoProcessor` imported but unused (F401)

309-311: Remove extraneous `f` prefix.

Line 310: `f"Global fallback image URL not found"` has no interpolation.

```diff
-        raise Exception(
-            f"Global fallback image URL not found")
+        raise Exception("Global fallback image URL not found")
```

Silences Ruff F541.

🧰 Tools

🪛 Ruff (0.8.2)

311-311: f-string without any placeholders. Remove extraneous `f` prefix (F541)
tests/unit_test/data_processor/data_processor_test.py (2)
12-12: Remove unused import.

The `urllib.parse.urljoin` import is not used in this file.

```diff
-from urllib.parse import urljoin
```

🧰 Tools

🪛 Ruff (0.8.2)

12-12: `urllib.parse.urljoin` imported but unused (F401)

1287-1413: Consider using pytest fixtures for image resolution tests.

The image resolution tests have duplicated setup and teardown code. Consider using pytest fixtures to reduce code duplication and ensure cleanup even if tests fail.

```python
@pytest.fixture
def setup_bot_sync_and_images(request):
    bot = "test_bot"
    user = "test_user"
    bot_sync_config = BotSyncConfig(
        parent_bot="parent_bot",
        restaurant_name="TestRestaurant",
        provider="some_provider",
        branch_name="TestBranch",
        branch_bot=bot,
        ai_enabled=True,
        meta_enabled=True,
        user=user,
        timestamp=datetime.utcnow()
    )
    bot_sync_config.save()

    restaurant_name, branch_name = CognitionDataProcessor.get_restaurant_and_branch_name(bot)
    catalog_images_collection = f"{restaurant_name}_{branch_name}_catalog_images"

    def teardown():
        BotSyncConfig.objects.delete()
        CollectionData.objects(collection_name=catalog_images_collection).delete()

    request.addfinalizer(teardown)
    return bot, user, catalog_images_collection
```

kairon/shared/cognition/processor.py (3)
880-895: Bulk write defaults to ordered=True – switch to unordered for large batches.

With `ordered=True` a single bad document aborts the whole batch and hurts throughput. Unless ordering is required, use `ordered=False`.

```diff
-        CognitionData._get_collection().bulk_write(update_operations)
+        CognitionData._get_collection().bulk_write(update_operations, ordered=False)
```
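The behavioural difference can be illustrated without MongoDB. Below is a toy model (not pymongo) of the `ordered` flag's semantics: an ordered batch stops at the first failing operation, while an unordered batch attempts every operation and reports all failures at the end.

```python
def run_batch(operations, ordered=True):
    """Toy model of bulk_write semantics. Each operation is a callable
    that either succeeds or raises. Returns (applied_count, errors)."""
    applied, errors = 0, []
    for i, op in enumerate(operations):
        try:
            op()
            applied += 1
        except Exception as exc:
            errors.append((i, str(exc)))
            if ordered:          # ordered batches abort on the first failure
                break
    return applied, errors

def ok():
    pass

def bad():
    raise ValueError("bad document")

ops = [ok, bad, ok, ok]
print(run_batch(ops, ordered=True))    # (1, [(1, 'bad document')])
print(run_batch(ops, ordered=False))   # (3, [(1, 'bad document')])
```

The unordered run applies three of four operations despite the failure, which is why `ordered=False` helps throughput when one malformed document should not block the rest of the batch.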
919-921: Typo in error message – "fro" ➜ "from".

```diff
-        await llm_processor.__delete_collection_points__(qdrant_collection, vector_ids, "Cannot delete stale points fro Qdrant!")
+        await llm_processor.__delete_collection_points__(qdrant_collection, vector_ids, "Cannot delete stale points from Qdrant!")
```
977-1013: Relative path for metadata file is fragile in deployments.

Hard-coding `"./metadata/catalog_provider_mappings.json"` fails when the working directory differs (e.g., gunicorn, cron). Prefer `importlib.resources` to load package data relative to the module path, for example:

```python
from importlib import resources

resource = resources.files("kairon.shared") / "metadata" / "catalog_provider_mappings.json"
mapping_data = json.loads(resource.read_text())
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (35)
- email.yaml (1 hunks)
- kairon/api/app/main.py (2 hunks)
- kairon/api/app/routers/bot/data.py (5 hunks)
- kairon/api/app/routers/bot/integrations.py (1 hunks)
- kairon/catalog_sync/definitions/base.py (1 hunks)
- kairon/catalog_sync/definitions/factory.py (1 hunks)
- kairon/events/definitions/catalog_sync.py (1 hunks)
- kairon/events/definitions/factory.py (2 hunks)
- kairon/events/definitions/petpooja_sync.py (1 hunks)
- kairon/meta/processor.py (1 hunks)
- kairon/shared/account/data_objects.py (1 hunks)
- kairon/shared/account/processor.py (2 hunks)
- kairon/shared/catalog_sync/catalog_sync_log_processor.py (1 hunks)
- kairon/shared/catalog_sync/data_objects.py (1 hunks)
- kairon/shared/cognition/processor.py (5 hunks)
- kairon/shared/constants.py (2 hunks)
- kairon/shared/data/constant.py (3 hunks)
- kairon/shared/data/data_models.py (1 hunks)
- kairon/shared/data/data_objects.py (2 hunks)
- kairon/shared/data/utils.py (1 hunks)
- kairon/shared/llm/processor.py (1 hunks)
- kairon/shared/models.py (1 hunks)
- kairon/shared/utils.py (2 hunks)
- metadata/catalog_provider_mappings.json (1 hunks)
- template/emails/catalog_sync_status.html (1 hunks)
- tests/integration_test/services_test.py (23 hunks)
- tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload.json (1 hunks)
- tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload_invalid_missing_instock.json (1 hunks)
- tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload_invalid_missing_itemid.json (1 hunks)
- tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload_invalid_nonboolean_instock.json (1 hunks)
- tests/testing_data/catalog_sync/catalog_sync_push_menu_payload.json (1 hunks)
- tests/testing_data/catalog_sync/catalog_sync_push_menu_payload_invalid.json (1 hunks)
- tests/testing_data/catalog_sync/catalog_sync_push_menu_payload_with_delete_data.json (1 hunks)
- tests/unit_test/data_processor/catalog_sync_log_processor_test.py (1 hunks)
- tests/unit_test/data_processor/data_processor_test.py (6 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (6)
kairon/shared/data/utils.py (3)
- kairon/shared/auth.py (2): `Authentication` (30-465), `generate_integration_token` (282-315)
- kairon/shared/data/constant.py (2): `ACCESS_ROLES` (177-184), `TOKEN_TYPE` (200-206)
- kairon/shared/utils.py (1): `Utility` (92-2277)

kairon/shared/account/processor.py (1)
- kairon/shared/utils.py (1): `Utility` (92-2277)

kairon/shared/llm/processor.py (1)
- kairon/shared/rest_client.py (3): `AioRestClient` (26-141), `request` (18-20), `request` (51-69)

kairon/shared/catalog_sync/data_objects.py (2)
- kairon/shared/utils.py (1): `push_notification` (1514-1534)
- kairon/shared/data/signals.py (2): `auditlogger` (91-94), `apply` (16-18)

kairon/meta/processor.py (2)
- kairon/shared/catalog_sync/data_objects.py (1): `CatalogProviderMapping` (29-35)
- kairon/shared/rest_client.py (1): `AioRestClient` (26-141)

tests/integration_test/services_test.py (6)
- kairon/meta/processor.py (2): `MetaProcessor` (13-122), `update_meta_catalog` (82-101)
- kairon/shared/catalog_sync/data_objects.py (2): `CatalogProviderMapping` (29-35), `CatalogSyncLogs` (9-24)
- kairon/shared/cognition/processor.py (2): `CognitionDataProcessor` (26-1041), `get_restaurant_and_branch_name` (1035-1041)
- kairon/shared/data/processor.py (1): `MongoProcessor` (179-9012)
- kairon/events/definitions/catalog_sync.py (1): `CatalogSync` (13-73)
- kairon/api/app/routers/bot/integrations.py (1): `sync_data` (21-49)
🪛 Ruff (0.8.2)
kairon/shared/account/processor.py
962-962: Use a context manager for opening files
(SIM115)
kairon/catalog_sync/definitions/factory.py
23-23: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
tests/unit_test/data_processor/catalog_sync_log_processor_test.py
7-7: babel.messages.jslexer.uni_escape_re
imported but unused
Remove unused import: babel.messages.jslexer.uni_escape_re
(F401)
11-11: kairon.exceptions.AppException
imported but unused
Remove unused import: kairon.exceptions.AppException
(F401)
15-15: kairon.shared.cognition.processor.CognitionDataProcessor
imported but unused
Remove unused import: kairon.shared.cognition.processor.CognitionDataProcessor
(F401)
24-24: Use capitalized environment variable SYSTEM_FILE
instead of system_file
Replace system_file
with SYSTEM_FILE
(SIM112)
116-116: pytest.raises(Exception)
should be considered evil
(B017)
129-129: Do not use bare except
(E722)
132-132: pytest.raises(Exception)
should be considered evil
(B017)
235-235: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
246-246: pytest.raises(Exception)
should be considered evil
(B017)
kairon/shared/llm/processor.py
344-345: Use a single if
statement instead of nested if
statements
Combine if
statements using and
(SIM102)
kairon/shared/catalog_sync/data_objects.py
3-3: mongoengine.BooleanField
imported but unused
Remove unused import
(F401)
3-3: mongoengine.ListField
imported but unused
Remove unused import
(F401)
kairon/api/app/routers/bot/data.py
2-2: typing.Text
imported but unused
Remove unused import: typing.Text
(F401)
4-4: fastapi.Path
imported but unused
Remove unused import: fastapi.Path
(F401)
379-379: Do not perform function call Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
kairon/meta/processor.py
4-4: urllib.parse.urljoin
imported but unused
Remove unused import: urllib.parse.urljoin
(F401)
7-7: kairon.Utility
imported but unused
Remove unused import: kairon.Utility
(F401)
9-9: kairon.shared.rest_client.AioRestClient
imported but unused
Remove unused import: kairon.shared.rest_client.AioRestClient
(F401)
kairon/api/app/routers/bot/integrations.py
15-15: kairon.shared.utils.MailUtility
imported but unused
Remove unused import: kairon.shared.utils.MailUtility
(F401)
23-24: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
25-25: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
26-26: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
27-27: Do not perform function call Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
55-56: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
57-57: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
58-58: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
59-59: Do not perform function call Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable (B008)

kairon/shared/catalog_sync/catalog_sync_log_processor.py
1-1: json imported but unused. Remove unused import: json (F401)
3-3: typing.List imported but unused. Remove unused import: typing.List (F401)
9-9: kairon.shared.cognition.data_objects.ColumnMetadata imported but unused. Remove unused import: kairon.shared.cognition.data_objects.ColumnMetadata (F401)
11-11: kairon.shared.content_importer.data_objects.ContentValidationLogs imported but unused. Remove unused import: kairon.shared.content_importer.data_objects.ContentValidationLogs (F401)
15-15: kairon.shared.data.processor.MongoProcessor imported but unused. Remove unused import: kairon.shared.data.processor.MongoProcessor (F401)
311-311: f-string without any placeholders. Remove extraneous f prefix (F541)

kairon/events/definitions/petpooja_sync.py
3-3: dotenv.set_key imported but unused. Remove unused import: dotenv.set_key (F401)
5-5: kairon.Utility imported but unused. Remove unused import: kairon.Utility (F401)
6-6: loguru.logger imported but unused. Remove unused import: loguru.logger (F401)
9-9: kairon.exceptions.AppException imported but unused. Remove unused import: kairon.exceptions.AppException (F401)
12-12: kairon.shared.constants.EventClass imported but unused. Remove unused import: kairon.shared.constants.EventClass (F401)
14-14: kairon.shared.data.data_objects.BotSyncConfig imported but unused. Remove unused import: kairon.shared.data.data_objects.BotSyncConfig (F401)

tests/integration_test/services_test.py
82-82: kairon.shared.data.utils.DataUtility imported but unused. Remove unused import: kairon.shared.data.utils.DataUtility (F401)
84-84: kairon.shared.models.StoryEventType imported but unused. Remove unused import: kairon.shared.models.StoryEventType (F401)
3810-3810: Local variable token is assigned to but never used. Remove assignment to unused variable token (F841)
3937-3937: Local variable token is assigned to but never used. Remove assignment to unused variable token (F841)
4572-4572: Local variable token is assigned to but never used. Remove assignment to unused variable token (F841)

kairon/shared/cognition/processor.py
668-668: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
671-671: Use key not in dict instead of key not in dict.keys(). Remove .keys() (SIM118)

tests/unit_test/data_processor/data_processor_test.py
12-12: urllib.parse.urljoin imported but unused. Remove unused import: urllib.parse.urljoin (F401)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Analyze (python)
- GitHub Check: Python CI
🔇 Additional comments (39)
kairon/shared/account/data_objects.py (1)
158-158: New mail template field added correctly
The addition of the catalog_sync_status field to the MailTemplates embedded document class is appropriate and follows the existing pattern for other email templates. This field will support the new catalog synchronization status email notifications.

tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload_invalid_missing_instock.json (1)
1-13: Well-structured test data for validation testing
This test payload correctly models an invalid catalog sync item toggle request missing the inStock field. The data structure includes all other required fields (restID, itemID array, type, etc.) needed to thoroughly test validation failures in the catalog sync API.

email.yaml (1)
28-28: Email template configuration added correctly
The addition of the catalog_sync_status_subject configuration with a proper default value ("Catalog Sync Update") follows the existing pattern for other email templates. This configuration supports the catalog synchronization notification system.
kairon/api/app/main.py (2)
26-26: New integration module import looks good
The import of the integrations router from kairon.api.app.routers.bot is correctly added to support the new catalog synchronization API endpoints.
272-272: API router properly registered
The integration router is correctly included with the appropriate prefix /api/bot/integration and tagged as "Data Integrations", following the established pattern for other API endpoints.
kairon/shared/models.py (1)
120-122: Enum renamed to better reflect its purpose
The renaming from VaultSyncEventType to VaultSyncType and field_update to item_toggle improves semantic clarity, making the code more maintainable and better aligned with the catalog synchronization feature. This change enhances readability while preserving the existing functionality.
kairon/shared/data/data_models.py (1)
1394-1397: Well-structured Meta catalog configuration model.
The MetaConfig model provides a clean interface for Meta catalog API authentication and targeting. The model appropriately defines both the access token and catalog ID as string fields, which aligns with Meta's API requirements.

tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload.json (1)
1-14: Valid catalog item toggle test payload.
The test payload provides a complete example for testing the catalog sync item toggle functionality with all required fields:
- Restaurant ID
- Item availability status
- Item ID array
- Item type
- Custom auto turn-on time configuration
This will properly test the item toggle feature of the catalog synchronization.
tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload_invalid_missing_itemid.json (1)
1-11
: Appropriate negative test case for validation.This test payload correctly omits the required
itemID
field to test validation logic. It's a good practice to include negative test cases to ensure proper error handling.kairon/events/definitions/factory.py (2)
2-2
: New import for catalog sync functionality.Adding the import for
CatalogSync
is appropriate for the catalog integration feature.
28-29
: Factory pattern extension for catalog sync.The factory pattern is correctly extended to support the new catalog integration event type. The mapping between
EventClass.catalog_integration
andCatalogSync
enables the event system to instantiate the appropriate handler for catalog synchronization events.tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload_invalid_nonboolean_instock.json (1)
tests/testing_data/catalog_sync/catalog_sync_item_toggle_payload_invalid_nonboolean_instock.json (1)
1-14: Test payload correctly structured for negative validation testing.
This test payload properly defines an invalid value for the inStock field (string "invalid" instead of a boolean), which will help validate the system's ability to properly handle and reject incorrectly formatted catalog sync requests.

kairon/shared/constants.py (3)
86-86: LGTM! Appropriate catalog integration event type added.
The new catalog_integration value is correctly added to the EventClass enum to support the new catalog synchronization feature.
88-89: LGTM! Provider-specific catalog sync class added.
The CatalogSyncClass enum is appropriately defined with the Petpooja provider as the initial implementation, maintaining consistency with other enum definitions in the codebase.
127-128: LGTM! Catalog provider enum added consistently.
The CatalogProvider enum is correctly defined with the PETPOOJA member, following the existing pattern of provider-specific enumerations.

kairon/shared/account/processor.py (1)
1017-1018: LGTM! Email template correctly registered.
The catalog sync status email template is properly assigned to the email configuration dictionary following the established pattern.
kairon/shared/data/utils.py (1)
400-415
: LGTM: Clear implementation for generating integration endpoints.The implementation follows a similar pattern to the existing
get_channel_endpoint
method, adapting it for integration endpoints with appropriate access control.kairon/shared/data/constant.py (2)
206-206
: LGTM: DATA_INTEGRATION token type additionThe addition of
DATA_INTEGRATION
to theTOKEN_TYPE
enum is appropriate for the new catalog synchronization feature.
283-286
: LGTM: SyncType enum for catalog eventsThe new
SyncType
enum with values forpush_menu
anditem_toggle
properly defines the types of catalog synchronization events.tests/testing_data/catalog_sync/catalog_sync_push_menu_payload_invalid.json (2)
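Several later comments in this review flag string-vs-enum comparison bugs around this enum. A str-mixin enum sidesteps that class of bug because members compare equal to their raw string values; a minimal sketch (the member names mirror the PR's SyncType, but the str mixin is an assumption, not necessarily what the PR ships):

```python
from enum import Enum

class SyncType(str, Enum):
    """Catalog synchronization event types (names mirror the PR's SyncType)."""
    push_menu = "push_menu"
    item_toggle = "item_toggle"

# With the str mixin, enum members compare equal to their raw strings,
# so code receiving either representation behaves consistently.
assert SyncType.push_menu == "push_menu"
assert SyncType.item_toggle.value == "item_toggle"

# Membership checks against member names work without .keys() (Ruff SIM118):
assert "push_menu" in SyncType.__members__
```

This would also make defaults like `kwargs.get("sync_type", SyncType.item_toggle)` safe to compare against plain strings.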
tests/testing_data/catalog_sync/catalog_sync_push_menu_payload_invalid.json (2)
159-185: Confirm intentional missing "itemid" field
Unlike other item objects in the JSON (such as the one at line 131), this item is missing an "itemid" field. If this is intentional to test invalid data handling, consider adding a comment to make this explicit. If not, add the missing field.
This appears to be specifically crafted for testing validation logic that catches missing required fields. If this is the case, consider documenting this intent to help other developers understand the purpose of this test data.
1-269: LGTM: Comprehensive test data for validation scenarios
The test data provides a thorough representation of a restaurant menu structure with multiple entities (restaurants, categories, items, etc.), making it well-suited for testing complex validation scenarios.

kairon/shared/catalog_sync/data_objects.py (1)
29-36: LGTM: CatalogProviderMapping implementation
The CatalogProviderMapping class is well-structured with appropriate fields and a clear docstring explaining its purpose.
kairon/api/app/routers/bot/data.py (1)
346-346: Parameter renamed for consistency.
The parameter has been renamed from event_type to sync_type, which aligns with the broader refactoring of event types to sync types across the codebase. Also applies to: 355-355, 365-365.

template/emails/catalog_sync_status.html (2)
1-2: Good use of documentation comments.
The template clearly documents the required parameters and subject line, which helps with maintainability.
159-163: Well-structured email content section.
The email message body clearly presents the key information (Bot ID, Execution ID, Sync Status, Message) with proper formatting and styling.

tests/testing_data/catalog_sync/catalog_sync_push_menu_payload.json (1)
1-270: Well-structured test data for catalog sync.
This test data provides a complete representation of a restaurant's menu structure with three menu items, compared to the two items in the "with_delete_data" version. The data includes all necessary fields for testing catalog synchronization.
One observation: the prices in this test data are quite high (8700.00, 3426.00, 3159.00), which seems unusual for restaurant items. While this doesn't affect functionality, you might want to use more realistic values in test data.
kairon/api/app/routers/bot/integrations.py (1)
36-50
: Return yourResponse
model, not a plain dictBoth handlers declare
response_model=Response
but return a dict.
IfResponse
expects additional fields (success
,data
, …) FastAPI will coerce or fail silently.
Consider returningResponse(message=..., success=True)
(or whatever schema applies) to make the contract explicit.Also applies to: 72-84
kairon/meta/processor.py (1)
38-44
: Hard-coding"currency": "INR"
in CREATE but omitting it in UPDATE can break Meta validation.Pass currency consistently or rely on Meta defaults. Verify provider mapping instead of embedding a constant.
kairon/events/definitions/petpooja_sync.py (2)
25-34
: Enum default may be wrong type – use.value
or the enum itself consistently.-self.sync_type = kwargs.get("sync_type", SyncType.item_toggle) +# Accept both string and enum; default to enum member +self.sync_type = kwargs.get("sync_type", SyncType.item_toggle.value)Ensure subsequent comparisons (
self.sync_type == SyncType.push_menu
) use the same representation.
154-160
:⚠️ Potential issueIncorrect membership test on
Document
– will raiseTypeError
.
'meta_config' in integrations_doc
checks attribute names, not fields. Use attribute access.-if integrations_doc and 'meta_config' in integrations_doc: +if integrations_doc and integrations_doc.meta_config:Likely an incorrect or invalid review comment.
tests/integration_test/services_test.py (1)
31-41
: Prune unused internal imports flagged by Ruff
DataUtility
andStoryEventType
(imported a bit lower, L84) aren’t used anywhere in this file. Please delete them or reference them to silence F401.-from kairon.shared.data.utils import DataUtility -from kairon.shared.models import StoryEventTypeThe rest of the added imports look relevant to the new tests.
tests/unit_test/data_processor/data_processor_test.py (5)
1096-1223
: Good test implementation for menu preprocessing.The test comprehensively verifies the catalog preprocessing functionality with proper setup and teardown.
1224-1233
: LGTM: Good error handling test.This test properly validates that the system handles missing provider mappings correctly.
1446-1487
: Excellent preservation of existing configuration.This test correctly verifies that when adding a configuration for a bot that already has one, the existing configuration values are preserved rather than overwritten.
1488-1517
: Good normalization of restaurant and branch names.The test verifies that restaurant and branch names are properly normalized (lowercase with underscores) for consistent collection naming.
2684-2691
: Update of validation method name matches refactored API.The test has been properly updated to use
_validate_sync_type
instead of_validate_event_type
and to referenceVaultSyncType
instead ofVaultSyncEventType
.kairon/shared/cognition/processor.py (3)
kairon/shared/cognition/processor.py (3)
658-668: Preserve original traceback when re-raising exceptions
Raising a new AppException without from e loses the original stack trace, making debugging harder and triggering Ruff B904.
- except Exception as e:
-     raise AppException(f"Failed to sync document with Qdrant: {str(e)}")
+ except Exception as e:
+     # keep the causal chain for easier debugging
+     raise AppException(f"Failed to sync document with Qdrant: {str(e)}") from e
🧰 Tools
🪛 Ruff (0.8.2)
668-668: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling (B904)
670-673: Remove unnecessary .keys() lookup
__members__ already behaves like a mapping; dropping .keys() is both clearer and satisfies Ruff SIM118.
- if sync_type not in VaultSyncType.__members__.keys():
+ if sync_type not in VaultSyncType.__members__:
      raise AppException("Sync type does not exist")
🧰 Tools
🪛 Ruff (0.8.2)
671-671: Use key not in dict instead of key not in dict.keys(). Remove .keys() (SIM118)
777-803: Use domain-specific AppException instead of bare Exception
resolve_image_link currently throws a generic exception (lines 788-803). Uniform exception types simplify upstream handling and API responses.
- else:
-     raise Exception(f"Image URL not found for {item_id} in {catalog_images_collection}")
+ else:
+     raise AppException(f"Image URL not found for {item_id} in {catalog_images_collection}")
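The B904 fix above can be seen in isolation with a small sketch; `AppException` here is a stand-in subclass, and the simulated Qdrant failure is hypothetical:

```python
class AppException(Exception):
    """Stand-in for kairon's AppException."""

def sync_document(payload):
    try:
        # Simulated Qdrant failure (a real client call would go here)
        raise ValueError("connection refused")
    except Exception as e:
        # `from e` attaches the original error as __cause__, preserving the traceback
        raise AppException(f"Failed to sync document with Qdrant: {e}") from e

try:
    sync_document({})
except AppException as err:
    # The original ValueError survives for debugging instead of being swallowed
    assert isinstance(err.__cause__, ValueError)
```

Without `from e`, `err.__cause__` would be `None` and the root cause would only appear as implicit context, which Ruff B904 flags.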
provider: str
config: dict
meta_config: Optional[MetaConfig]
💡 Verification agent
🧩 Analysis chain
Verify configuration validation for POS providers.
The POSIntegrationRequest model looks well-structured with appropriate fields for provider integration. However, there's no validation on the provider field to ensure it's a supported provider type.
Consider adding validation to ensure the provider is supported:
+@validator("provider")
+def validate_provider(cls, v, values, **kwargs):
+ from kairon.shared.utils import Utility
+
+ # Assuming you have a list of supported providers
+ supported_providers = ["petpooja", "other_provider"] # Update with actual providers
+ if v not in supported_providers:
+ raise ValueError(f"Unsupported provider. Supported providers: {supported_providers}")
+ return v
Let's check if there are supported providers defined:
🏁 Script executed:
#!/bin/bash
# Search for supported POS providers in the codebase
rg -t py "supported_providers|SUPPORTED_PROVIDERS|provider.*validation" --no-filename
Length of output: 201
I couldn't find an existing list of supported POS providers in the repo, so let's check if Utility already exposes anything for this before adding a new constant.
🏁 Script executed:
#!/bin/bash
# Locate the Utility class source
UTILITY_FILE=$(rg -l 'class Utility' -t py)
echo "Utility file: $UTILITY_FILE"
# Search for any provider- or supported-related methods/attributes
rg -n 'supported|provider|pos' "$UTILITY_FILE"
Length of output: 1024
Let's inspect the business_providers block in Utility to see if a list of supported POS providers is already defined:
🏁 Script executed:
#!/bin/bash
# Show lines around where business_providers is referenced
sed -n '1570,1620p' kairon/shared/utils.py
Length of output: 2057
Let's locate where system_metadata is initialized in Utility to see if POS providers are defined there:
🏁 Script executed:
#!/bin/bash
# Show the top of the Utility class (including system_metadata)
sed -n '1,200p' kairon/shared/utils.py
Length of output: 6642
Let's locate where Utility.system_metadata gets populated and look for any POS provider lists or metadata files:
#!/bin/bash
# Find where system_metadata is assigned or loaded
rg -n 'Utility\.system_metadata' -t py
# Search for ConfigLoader usage to load any metadata
rg -n 'ConfigLoader' -t py
# Look for any YAML/JSON files that mention POS or providers
rg -l '\.ya?ml$|\.json$' -t yml -t json | xargs rg -n 'pos|provider'
To locate any existing provider lists or metadata definitions, let's search where system_metadata is populated and inspect YAML/JSON configs:
🏁 Script executed:
#!/bin/bash
# 1. Find where system_metadata is assigned or loaded
rg -n 'system_metadata\s*=' -t py
# 2. See where ConfigLoader is used (likely to load metadata files)
rg -n 'ConfigLoader' -t py
# 3. Scan all YAML/JSON files for “pos” or “provider” entries
fd -e yml -e yaml -e json . | xargs rg -n 'pos|provider'
Length of output: 62328
Let’s inspect the POS provider definitions in the integrations metadata:
🏁 Script executed:
#!/bin/bash
# Show lines around the business_providers section in integrations metadata
rg -n 'business_providers' -C5 metadata/integrations.yml
Length of output: 332
Add provider validation to POSIntegrationRequest
The POSIntegrationRequest model currently accepts any string for provider, which may allow unsupported integrations. Since there's no existing list of POS providers in the codebase, please:
- Define a constant list of supported POS providers (for example, in a new kairon/shared/constants.py or at the top of data_models.py):

      # kairon/shared/constants.py
      SUPPORTED_POS_PROVIDERS = ["petpooja", "other_provider"]  # update with real values
- Add a Pydantic validator to enforce this in kairon/shared/data/data_models.py (around lines 1398–1401):

      class POSIntegrationRequest(BaseModel):
          provider: str
          config: dict
          meta_config: Optional[MetaConfig]

    +     @validator("provider")
    +     def validate_provider(cls, v):
    +         from kairon.shared.constants import SUPPORTED_POS_PROVIDERS
    +
    +         if v not in SUPPORTED_POS_PROVIDERS:
    +             raise ValueError(
    +                 f"Unsupported provider '{v}'. Supported providers: {SUPPORTED_POS_PROVIDERS}"
    +             )
    +         return v
This ensures only known POS integrations are accepted at the model level.
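Outside of Pydantic, the same guard reduces to a plain allow-list check; a minimal sketch, where the provider list and helper name are hypothetical placeholders rather than the project's real constants:

```python
# Hypothetical allow-list; the real project would define its own
SUPPORTED_POS_PROVIDERS = ["petpooja"]

def validate_provider(provider: str) -> str:
    """Reject any POS provider not on the supported list."""
    if provider not in SUPPORTED_POS_PROVIDERS:
        raise ValueError(
            f"Unsupported provider '{provider}'. "
            f"Supported providers: {SUPPORTED_POS_PROVIDERS}"
        )
    return provider

# A known provider passes through unchanged
assert validate_provider("petpooja") == "petpooja"

# An unknown provider is rejected with a descriptive error
try:
    validate_provider("unknown_pos")
except ValueError as e:
    assert "Unsupported provider" in str(e)
```

This is the same logic a Pydantic `@validator("provider")` would run; keeping it as a standalone function also makes it easy to unit-test.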
@@ -959,6 +959,7 @@ def load_system_properties():
        ).read(),
        button_template=open("template/emails/button.html", "r").read(),
        leave_bot_owner_notification=open("template/emails/leaveBotOwnerNotification.html", "r").read(),
+       catalog_sync_status=open("template/emails/catalog_sync_status.html", "r").read(),
🛠️ Refactor suggestion
Use context manager for file operations.
File operations should use context managers (with statement) to ensure proper resource cleanup even if exceptions occur. Apply this pattern to all file operations for consistency:
with open("template/emails/catalog_sync_status.html", "r") as f:
    catalog_sync_status = f.read()
🧰 Tools
🪛 Ruff (0.8.2)
962-962: Use a context manager for opening files (SIM115)
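The context-manager pattern guarantees the handle is closed even if read() raises. A self-contained sketch using a temporary file as a stand-in for the real template path (template/emails/catalog_sync_status.html):

```python
import os
import tempfile

# Create a stand-in template file for this sketch
with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False) as tmp:
    tmp.write("<html>Catalog Sync Update</html>")
    template_path = tmp.name

# Context manager releases the file handle promptly (fixes Ruff SIM115);
# a bare open(...).read() leaves closing to the garbage collector.
with open(template_path, "r") as f:
    catalog_sync_status = f.read()

assert "Catalog Sync Update" in catalog_sync_status

os.unlink(template_path)  # clean up the stand-in file
```

In load_system_properties this would replace each `open(path).read()` expression with a small `with` block per template.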
async def __delete_collection_points__(self, collection_name: Text, point_ids: List, err_msg: Text,
                                       raise_err=True):
    client = AioRestClient()
    response = await client.request(http_url=urljoin(self.db_url, f"/collections/{collection_name}/points/delete"),
                                    request_method="POST",
                                    headers=self.headers,
                                    request_body={"points": point_ids},
                                    return_json=True,
                                    timeout=5)
    print(response)
    if not response.get('result'):
        if "status" in response:
            logging.exception(response['status'].get('error'))
        if raise_err:
            raise AppException(err_msg)
🛠️ Refactor suggestion
Remove debug print statement.
The implementation for deleting collection points follows the error handling pattern used elsewhere, but includes a debug print statement.
Apply this diff to remove the debug print and simplify conditional logic:
async def __delete_collection_points__(self, collection_name: Text, point_ids: List, err_msg: Text,
raise_err=True):
client = AioRestClient()
response = await client.request(http_url=urljoin(self.db_url, f"/collections/{collection_name}/points/delete"),
request_method="POST",
headers=self.headers,
request_body={"points": point_ids},
return_json=True,
timeout=5)
- print(response)
- if not response.get('result'):
- if "status" in response:
+ if not response.get('result') and "status" in response:
logging.exception(response['status'].get('error'))
if raise_err:
raise AppException(err_msg)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def __delete_collection_points__(self, collection_name: Text, point_ids: List, err_msg: Text,
                                       raise_err=True):
    client = AioRestClient()
    response = await client.request(
        http_url=urljoin(self.db_url, f"/collections/{collection_name}/points/delete"),
        request_method="POST",
        headers=self.headers,
        request_body={"points": point_ids},
        return_json=True,
        timeout=5,
    )
    if not response.get("result") and "status" in response:
        logging.exception(response["status"].get("error"))
        if raise_err:
            raise AppException(err_msg)
🧰 Tools
🪛 Ruff (0.8.2)
344-345: Use a single if statement instead of nested if statements. Combine if statements using and (SIM102)
"default": "https://www.kairon.com/default-image.jpg" | ||
}, |
💡 Verification agent
🧩 Analysis chain
Verify default image URL exists and is accessible
The default image URL "https://www.kairon.com/default-image.jpg" may not exist or be publicly accessible. This could cause issues in production if the source is missing and the default is used.
🏁 Script executed:
#!/bin/bash
# Check if the default image URL returns a valid response
curl -s -I "https://www.kairon.com/default-image.jpg" | head -n 1
Length of output: 88
Default image URL is returning 404 – please update or host a valid default image
The configured default image URL (https://www.kairon.com/default-image.jpg) currently returns HTTP 404 and will result in broken images wherever it's used. To resolve this:
• Confirm the correct URL (or path) for your intended default image.
• Host the image at a publicly accessible location (e.g., your CDN or asset server).
• Update metadata/catalog_provider_mappings.json (lines 22–23) with the working URL.
Once replaced, verify again:
curl -s -I "https://www.kairon.com/default-image.jpg" | head -n1
# Expect: HTTP/1.1 200 OK
class CatalogSyncLogs(DynamicDocument):
    execution_id = StringField(required=True, unique=True)
    raw_payload = DictField(required=True)
    processed_payload = DictField(default=None)
    validation_errors = DictField(default={})
    exception = StringField(default="")
    bot = StringField(required=True)
    user = StringField(required=True)
    provider = StringField(required=True)
    sync_type = StringField(required=True)
    start_timestamp = DateTimeField(default=datetime.utcnow)
    end_timestamp = DateTimeField(default=None)
    status = StringField(default=None)
    sync_status = StringField(default="COMPLETED")
🛠️ Refactor suggestion
Add docstring and clarify status fields in CatalogSyncLogs
The CatalogSyncLogs class lacks a docstring explaining its purpose, and it has two potentially redundant fields: status and sync_status.
- Add a class docstring similar to the one in CatalogProviderMapping
- Clarify the distinction between status and sync_status, either through field comments or by renaming one of them to better indicate its purpose
Example docstring:
"""
Stores logs and status information for catalog synchronization operations.
"""
try:
    doc = CatalogSyncLogs.objects(bot=bot).filter(
        Q(sync_status__ne=SYNC_STATUS.COMPLETED.value) &
        Q(sync_status__ne=SYNC_STATUS.FAILED.value)).get()
except DoesNotExist:
🛠️ Refactor suggestion
add_log may raise MultipleObjectsReturned and ignores ABORTED status
get() expects exactly one match; if two "in-progress" logs exist, it will raise MultipleObjectsReturned and crash. Also, a sync in ABORTED state is considered "open" here but not in is_sync_in_progress, creating inconsistent behaviour.
- doc = CatalogSyncLogs.objects(bot=bot).filter(
- Q(sync_status__ne=SYNC_STATUS.COMPLETED.value) &
- Q(sync_status__ne=SYNC_STATUS.FAILED.value)).get()
+ doc = (
+ CatalogSyncLogs.objects(bot=bot)
+ .filter(
+ Q(sync_status__ne=SYNC_STATUS.COMPLETED.value)
+ & Q(sync_status__ne=SYNC_STATUS.FAILED.value)
+ & Q(sync_status__ne=SYNC_STATUS.ABORTED.value)
+ )
+ .order_by("-start_timestamp")
+ .first()
+ )
first() makes the call tolerant to duplicates while still picking the latest.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
try:
    doc = (
        CatalogSyncLogs.objects(bot=bot)
        .filter(
            Q(sync_status__ne=SYNC_STATUS.COMPLETED.value)
            & Q(sync_status__ne=SYNC_STATUS.FAILED.value)
            & Q(sync_status__ne=SYNC_STATUS.ABORTED.value)
        )
        .order_by("-start_timestamp")
        .first()
    )
except DoesNotExist:
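The get() → first() change above can be illustrated on plain dicts: exclude terminal states, order by start time descending, and take the head, which tolerates duplicate in-progress rows. Field names mirror CatalogSyncLogs; the helper itself is a sketch, not project code:

```python
from datetime import datetime

# Terminal states that should not count as an "open" sync
TERMINAL_STATES = {"COMPLETED", "FAILED", "ABORTED"}

def latest_open_log(logs):
    """Return the most recent non-terminal sync log, or None if all are closed."""
    open_logs = [log for log in logs if log["sync_status"] not in TERMINAL_STATES]
    open_logs.sort(key=lambda log: log["start_timestamp"], reverse=True)
    return open_logs[0] if open_logs else None

logs = [
    {"sync_status": "COMPLETED", "start_timestamp": datetime(2024, 1, 1)},
    {"sync_status": "ENQUEUED", "start_timestamp": datetime(2024, 1, 2)},
    {"sync_status": "ENQUEUED", "start_timestamp": datetime(2024, 1, 3)},  # duplicate open row
]

# Unlike .get(), this never raises on duplicates; it simply picks the newest open row.
assert latest_open_log(logs)["start_timestamp"] == datetime(2024, 1, 3)
assert latest_open_log([]) is None
```

MongoEngine's `.order_by("-start_timestamp").first()` performs the same selection server-side.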
kairon/shared/cognition/processor.py
Outdated
@staticmethod
def save_ai_data(processed_data: dict, bot: str, user: str, sync_type: str):
    """
    Syncs a document with Qdrant vector database by generating embeddings and upserting them.
    Save each item in `kv` of the processed payload into CollectionData individually.
    """
    restaurant_name, branch_name = CognitionDataProcessor.get_restaurant_and_branch_name(bot)
    catalog_data_collection = f"{restaurant_name}_{branch_name}_catalog_data"

    kv_items = processed_data.get("kv", [])
    incoming_data_map = {item["id"]: item for item in kv_items}
    incoming_ids = set(incoming_data_map.keys())

    existing_docs = CollectionData.objects(
        collection_name=catalog_data_collection,
        bot=bot,
        status=True
    )
    existing_data_map = {doc.data.get("id"): doc for doc in existing_docs}
    existing_ids = set(existing_data_map.keys())

    for item_id, item in incoming_data_map.items():
        if item_id in existing_data_map:
            doc = existing_data_map[item_id]
            if sync_type == SyncType.item_toggle:
                for key, value in item.items():
                    doc.data[key] = value
            else:
                doc.data = item
            doc.timestamp = datetime.utcnow()
            doc.user = user
            doc.save()
        else:
            CollectionData(
                collection_name=catalog_data_collection,
                data=item,
                user=user,
                bot=bot,
                timestamp=datetime.utcnow(),
                status=True
            ).save()
    stale_ids = []
    if sync_type == SyncType.push_menu:
        stale_ids = list(existing_ids - incoming_ids)
        if stale_ids:
            CollectionData.objects(
                collection_name=catalog_data_collection,
                bot=bot,
                status=True,
                data__id__in=stale_ids
            ).delete()

    return stale_ids
Enum misuse – save_ai_data compares sync_type to enum values incorrectly
The method expects sync_type: str but compares it against SyncType.item_toggle and SyncType.push_menu (enum members). This always fails, so item-toggle updates and stale-ID deletion logic won't run.
Either:
- if sync_type == SyncType.item_toggle:
+ if sync_type == SyncType.item_toggle.name:
and similarly for push_menu, or accept an actual SyncType enum instance.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
@staticmethod
def save_ai_data(processed_data: dict, bot: str, user: str, sync_type: str):
    """
    Save each item in `kv` of the processed payload into CollectionData individually.
    """
    restaurant_name, branch_name = CognitionDataProcessor.get_restaurant_and_branch_name(bot)
    catalog_data_collection = f"{restaurant_name}_{branch_name}_catalog_data"
    kv_items = processed_data.get("kv", [])
    incoming_data_map = {item["id"]: item for item in kv_items}
    incoming_ids = set(incoming_data_map.keys())
    existing_docs = CollectionData.objects(
        collection_name=catalog_data_collection,
        bot=bot,
        status=True
    )
    existing_data_map = {doc.data.get("id"): doc for doc in existing_docs}
    existing_ids = set(existing_data_map.keys())
    for item_id, item in incoming_data_map.items():
        if item_id in existing_data_map:
            doc = existing_data_map[item_id]
            # use the enum member's name for comparison
            if sync_type == SyncType.item_toggle.name:
                for key, value in item.items():
                    doc.data[key] = value
            else:
                doc.data = item
            doc.timestamp = datetime.utcnow()
            doc.user = user
            doc.save()
        else:
            CollectionData(
                collection_name=catalog_data_collection,
                data=item,
                user=user,
                bot=bot,
                timestamp=datetime.utcnow(),
                status=True
            ).save()
    stale_ids = []
    if sync_type == SyncType.push_menu.name:
        stale_ids = list(existing_ids - incoming_ids)
        if stale_ids:
            CollectionData.objects(
                collection_name=catalog_data_collection,
                bot=bot,
                status=True,
                data__id__in=stale_ids
            ).delete()
    return stale_ids
kairon/shared/cognition/processor.py
Outdated
        vector_id = str(uuid.uuid4()) if not existing_document else existing_document.get("vector_id")

        merged_data = row
        if existing_document:
            existing_data = existing_document.get("data", {})
            merged_data = {**existing_data, **row}
        update_operations.append(UpdateOne(
Handle documents missing vector_id to avoid Qdrant upsert failures

Legacy records may not contain vector_id; appending None (line 850) results in {'id': None, ...} and an HTTP 400 from Qdrant.
- vector_id = str(uuid.uuid4()) if not existing_document else existing_document.get("vector_id")
+ if existing_document and existing_document.get("vector_id"):
+ vector_id = existing_document["vector_id"]
+ else:
+ vector_id = str(uuid.uuid4())
+ # back-fill vector_id on the existing Mongo document if previously absent
+ if existing_document:
+ update_operations.append(
+ UpdateOne({"_id": existing_document["_id"]},
+ {"$set": {"vector_id": vector_id}})
+ )
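The suggested fallback can be exercised in isolation. The sketch below assumes nothing from kairon and uses plain dicts in place of pymongo's UpdateOne, purely to illustrate the minting/back-fill behaviour:

```python
import uuid


def resolve_vector_id(existing_document, update_operations):
    """Prefer the stored vector_id; otherwise mint one and, for legacy
    documents, queue a back-fill update so Mongo is repaired in the same batch."""
    if existing_document and existing_document.get("vector_id"):
        return existing_document["vector_id"]
    vector_id = str(uuid.uuid4())
    if existing_document:
        # Plain dict in place of pymongo.UpdateOne, for illustration only
        update_operations.append(
            {"filter": {"_id": existing_document["_id"]},
             "update": {"$set": {"vector_id": vector_id}}}
        )
    return vector_id


ops = []
legacy_doc = {"_id": "abc123", "data": {"id": 1}}  # legacy record, no vector_id
vid = resolve_vector_id(legacy_doc, ops)
assert vid and len(ops) == 1               # new id minted, back-fill queued
assert resolve_vector_id(None, ops)        # brand-new row also gets an id
assert len(ops) == 1                       # ...but no back-fill for it
```

Either way every point sent to Qdrant carries a non-None id, and legacy Mongo documents converge to having a stored vector_id.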
        self._validate_sync_type(sync_type)
        try:
            integration = POSIntegrations.objects(bot= bot, provider = configuration['provider'], sync_type = sync_type).get()
            integration.config = configuration['config']
            integration.meta_config = configuration['meta_config']
        except DoesNotExist:
            integration = POSIntegrations(**configuration)
            integration.bot = bot
            integration.user = user
            integration.sync_type = sync_type
            integration.timestamp = datetime.utcnow()

        if 'meta_config' in configuration:
            integration.meta_config = configuration['meta_config']

        integration.save()
        integration_endpoint = DataUtility.get_integration_endpoint(integration)
        return integration_endpoint
sync_type is optional but validated as mandatory – leads to false failures

save_pos_integration_config declares sync_type: Text = None, yet immediately calls _validate_sync_type(sync_type) (line 688), which raises when sync_type is None. Either make the argument mandatory or guard the validation.
- self._validate_sync_type(sync_type)
+ if sync_type:
+ self._validate_sync_type(sync_type)
+ else:
+ raise AppException("sync_type must be provided for POS integrations")
    @staticmethod
    def preprocess_push_menu_data(bot, json_data, provider):
        """
        Preprocess the JSON data received from Petpooja to extract relevant fields for knowledge base or meta synchronization.
        Handles different event types ("push_menu" vs others) and uses metadata to drive the field extraction and defaulting.
        """
        doc = CatalogProviderMapping.objects(provider=provider).first()
        if not doc:
            raise Exception(f"Metadata mappings not found for provider={provider}")

        category_map = {
            cat["categoryid"]: cat["categoryname"]
            for cat in json_data.get("categories", [])
        }

        provider_mappings = {
            "meta": doc.meta_mappings,
            "kv": doc.kv_mappings
        }

        data = {sync_target: [] for sync_target in provider_mappings}
        for item in json_data.get("items", []):
            for sync_target, fields in provider_mappings.items():
                transformed_item = {"id": item["itemid"]}

                for target_field, field_config in fields.items():
                    source_key = field_config.get("source")
                    default_value = field_config.get("default")
                    value = item.get(source_key) if source_key else None

                    if target_field == "availability":
                        value = "in stock" if int(value or 0) > 0 else default_value
                    elif target_field == "facebook_product_category":
                        category_id = value or ""
                        value = f"Food and drink > {category_map.get(category_id, 'General')}"
                    elif target_field == "image_url":
                        value = CognitionDataProcessor.resolve_image_link(bot, item["itemid"])
                    elif target_field == "price":
                        value = float(value)
                    if not value:
                        value = default_value

                    transformed_item[target_field] = value

                data[sync_target].append(transformed_item)

        return data

    @staticmethod
    def preprocess_item_toggle_data(bot, json_data, provider):
🛠️ Refactor suggestion

Harden JSON preprocessing – unguarded casts may crash on real-world data

- float(value) (line 746) will raise TypeError/ValueError when value is None, empty, or non-numeric.
- int(value or 0) (line 739) still raises if value is a non-numeric string such as 'N/A'.
- A generic Exception on missing mappings (line 716) breaks uniform error-handling.

Consider:
- if target_field == "availability":
- value = "in stock" if int(value or 0) > 0 else default_value
+ if target_field == "availability":
+ try:
+ value = "in stock" if int(value or 0) > 0 else default_value
+ except (TypeError, ValueError):
+ value = default_value
@@
- elif target_field == "price":
- value = float(value)
+ elif target_field == "price":
+ try:
+ value = float(value)
+ except (TypeError, ValueError):
+ value = default_value
Also replace the bare Exception (line 716) with AppException for consistency.
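As a standalone illustration of the guarded casts (hypothetical helper names, not the actual kairon code), the fallback behaviour looks like this:

```python
def safe_availability(value, default="out of stock"):
    # Fall back to the mapping default when the stock count is missing or malformed
    try:
        return "in stock" if int(value or 0) > 0 else default
    except (TypeError, ValueError):
        return default


def safe_price(value, default=0.0):
    # Petpooja may send empty or non-numeric price strings; never let them abort the batch
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


print(safe_availability("3"))    # -> in stock
print(safe_availability("N/A"))  # -> out of stock (int("N/A") would raise)
print(safe_price("12.50"))       # -> 12.5
print(safe_price(None))          # -> 0.0
```

A single malformed item then degrades to its default value instead of failing the whole sync.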
Removed unused imports
Actionable comments posted: 8
♻️ Duplicate comments (16)
kairon/api/app/routers/bot/data.py (1)

384-385: 🛠️ Refactor suggestion / ⚠️ Potential issue

Add defensive error-handling around provider-mapping load

load_catalog_provider_mappings() performs I/O and DB writes; any failure will bubble up as an unhandled 500. Wrap the call so the client receives a clear 4xx/5xx with context and the task can be retried gracefully.

- CognitionDataProcessor.load_catalog_provider_mappings()
+ try:
+     CognitionDataProcessor.load_catalog_provider_mappings()
+ except Exception as e:
+     logger.exception("Failed to load catalog provider mappings")
+     raise AppException(f"Unable to load catalog provider mappings: {e}") from e

kairon/api/app/routers/bot/integrations.py (2)
22-27
:examples
→example
and B008 (Path
/Security
in defaults) persist
The same issue was raised earlier but remains unresolved. FastAPI ignoresexamples
, and evaluatingPath
/Security
at import time is discouraged (Ruff B008).🧰 Tools
🪛 Ruff (0.8.2)
22-23: Do not perform function call
Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable(B008)
24-24: Do not perform function call
Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable(B008)
25-25: Do not perform function call
Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable(B008)
26-26: Do not perform function call
Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable(B008)
33-34: Harden JSON parsing

await request.json() will raise JSONDecodeError and return a 500 on malformed payloads. Catch the error and return a 400 with a helpful message.

kairon/meta/processor.py (3)
18-45: Two long-standing issues still present

1. self.processed_data keeps growing – calling preprocess_data twice on the same instance duplicates previous items and inflates memory usage. Reset it at the start of each call.
2. Prices are coerced to int, silently discarding paise/cent values and violating Meta's catalog spec.

- for item in data:
+ # Start fresh each invocation
+ self.processed_data = []
+ for item in data:
@@
- value = int(item[field]) if field == "price" else item[field]
+ if field == "price":
+     value = str(item[field])  # keep decimals
+ else:
+     value = item[field]
105-108: Hard-coded catalog ID breaks multi-tenant support

delete_meta_catalog ignores self.catalog_id and points to a fixed ID, causing deletions in the wrong account.

-base_url = "https://graph.facebook.com/v21.0/1880697869060042/batch"
+base_url = f"https://graph.facebook.com/v21.0/{self.catalog_id}/batch"
63-75: 🛠️ Refactor suggestion

Batch payload sent via URL – risk of URI-too-long & logging PII

The entire JSON batch is URL-encoded and appended to the query string. Large menus quickly exceed the 2 KB/8 KB limits on many CDNs and leak potentially sensitive catalog data into access logs. POST the payload in the body instead:

-req = quote(json.dumps(self.processed_data))
-base_url = f"https://graph.facebook.com/v21.0/{self.catalog_id}/batch"
-url = f"{base_url}?item_type=PRODUCT_ITEM&requests={req}"
-
-data = {"access_token": self.access_token}
-response = await asyncio.to_thread(requests.post, url, headers={}, data=data)
+payload = {
+    "item_type": "PRODUCT_ITEM",
+    "requests": json.dumps(self.processed_data),
+    "access_token": self.access_token,
+}
+base_url = f"https://graph.facebook.com/v21.0/{self.catalog_id}/batch"
+response = await asyncio.to_thread(
+    requests.post, base_url, data=payload, timeout=30
+)

Also add a sensible timeout and reuse a common headers dict if needed.

kairon/events/definitions/petpooja_sync.py (2)
81-82: CognitionDataProcessor.save_ai_data return value can be None; guard before iteration

You later pass stale_primary_keys to Meta deletion; a None will raise TypeError.

 stale_primary_keys = CognitionDataProcessor.save_ai_data(self.data, self.bot, self.user, self.sync_type)
+stale_primary_keys = stale_primary_keys or []
159-162: ⚠️ Potential issue

Ensure stale_primary_keys is not None before checking

The code doesn't check if stale_primary_keys is None before using it in a conditional. This could lead to a TypeError if stale_primary_keys is None. Apply this fix:

- if stale_primary_keys:
+ if stale_primary_keys and len(stale_primary_keys) > 0:
      delete_payload = meta_processor.preprocess_delete_data(stale_primary_keys)
      await meta_processor.delete_meta_catalog(delete_payload)

kairon/shared/catalog_sync/catalog_sync_log_processor.py (4)
37-41: add_log may raise MultipleObjectsReturned and ignores ABORTED status

get() expects exactly one match; if two "in-progress" logs exist, it will raise MultipleObjectsReturned and crash. Also, a sync in ABORTED state is considered "open" here but not in is_sync_in_progress, creating inconsistent behaviour.

- doc = CatalogSyncLogs.objects(bot=bot).filter(
-     Q(sync_status__ne=SYNC_STATUS.COMPLETED.value) &
-     Q(sync_status__ne=SYNC_STATUS.FAILED.value)).get()
+ doc = (
+     CatalogSyncLogs.objects(bot=bot)
+     .filter(
+         Q(sync_status__ne=SYNC_STATUS.COMPLETED.value)
+         & Q(sync_status__ne=SYNC_STATUS.FAILED.value)
+         & Q(sync_status__ne=SYNC_STATUS.ABORTED.value)
+     )
+     .order_by("-start_timestamp")
+     .first()
+ )
94-106: is_limit_exceeded crashes when BotSettings is missing

BotSettings.objects(bot=bot).get() raises DoesNotExist; the except block is absent, so the caller gets an unhandled error unrelated to "limit exceeded". Suggestion:

- if doc_count >= BotSettings.objects(bot=bot).get().catalog_sync_limit_per_day:
+ try:
+     limit = BotSettings.objects(bot=bot).get().catalog_sync_limit_per_day
+ except DoesNotExist:
+     limit = 0  # or a sensible default
+
+ if doc_count >= limit:
186-196: validate_item_ids promises a return value but returns nothing

Docstring: "Returns a set of valid category IDs." Implementation: performs validation only. Either return the set or fix the docstring. Returning is usually more useful:

- for item in json_data.get("items", []):
+ category_ids = set()
+ for item in json_data.get("items", []):
      if "itemid" not in item:
          raise Exception(f"Missing 'itemid' in item: {item}")
+     if "item_categoryid" in item:
+         category_ids.add(item["item_categoryid"])
+ return category_ids
276-293: ⚠️ Potential issue

Newly-created fallback image document immediately fails validation

The method inserts a document with an empty image_url, then validates and inevitably raises "Global fallback image URL not found". Apply this fix to prevent validation failure on newly created documents:

  if not CollectionData.objects(bot=bot, collection_name=catalog_images_collection).first():
      global_fallback_data = {
          "image_type": "global",
-         "image_url": "",
+         "image_url": "https://example.com/placeholder.jpg",  # Use a placeholder or default URL
          "image_base64": ""
      }
      CollectionData(
          collection_name=catalog_images_collection,
          data=global_fallback_data,
          user=user,
          bot=bot,
          status=True,
          timestamp=datetime.utcnow()
      ).save()
+     return  # First-time setup complete; let the caller fill the actual URL later

kairon/shared/cognition/processor.py (4)
822-833: vector_id can still be None, breaking Qdrant upserts

Previous reviews pointed this out; the logic here hasn't changed. existing_document.get("vector_id") returns None for legacy docs, producing invalid points ({'id': None, …}) that Qdrant rejects.

See earlier suggestion for generating a UUID and back-filling MongoDB when a stored vector_id is absent.
920-940: ⚠️ Potential issue

String-vs-enum comparison prevents save_ai_data branching

sync_type is annotated as str, but comparisons are made against SyncType enum members. These checks always evaluate to False, so item-toggle merging and push-menu clean-up never execute.

- if sync_type == SyncType.item_toggle:
+ if sync_type == SyncType.item_toggle.name:
  ...
- if sync_type == SyncType.push_menu:
+ if sync_type == SyncType.push_menu.name:

This bug silently skips critical logic and causes stale data to linger.
661-672: ⚠️ Potential issue

Optional sync_type parameter crashes on None

save_pos_integration_config treats sync_type as optional (None default) yet immediately validates it. Calling _validate_sync_type(None) raises AppException, so every call that omits the argument fails.

- self._validate_sync_type(sync_type)
+ if not sync_type:
+     raise AppException("`sync_type` must be provided for POS integration configuration")
+ self._validate_sync_type(sync_type)

Alternatively, drop the default and make the argument mandatory in the signature. Failure to address this will break the ingestion flow at runtime.
711-720: 🛠️ Refactor suggestion

Unprotected numeric casts may raise on real-world data

float(value) and int(value or 0) assume valid numeric strings. Empty strings, None, or malformed numbers from Petpooja will raise TypeError/ValueError, aborting the entire batch.

- elif target_field == "availability":
-     value = "in stock" if int(value or 0) > 0 else default_value
+ elif target_field == "availability":
+     try:
+         value = "in stock" if int(value or 0) > 0 else default_value
+     except (TypeError, ValueError):
+         value = default_value
@@
- elif target_field == "price":
-     value = float(value)
+ elif target_field == "price":
+     try:
+         value = float(value)
+     except (TypeError, ValueError):
+         value = default_value

Guarding these casts prevents unexpected 500s during ingestion.
🧹 Nitpick comments (6)
kairon/api/app/routers/bot/integrations.py (1)
17-18
: Remove unused variable
cognition_processor
is instantiated but never used in this module – delete it to avoid dead code and reader confusion.kairon/meta/processor.py (1)
15-17
: Unusedheaders
attribute
self.headers
is never referenced; either populate and reuse it for the
requests.post
calls or remove it to keep the class lean.tests/unit_test/data_processor/data_processor_test.py (2)
12-12
: Remove unused import.The
urljoin
function fromurllib.parse
is imported but not used anywhere in the file.-from urllib.parse import urljoin
🧰 Tools
🪛 Ruff (0.8.2)
12-12:
urllib.parse.urljoin
imported but unusedRemove unused import:
urllib.parse.urljoin
(F401)
1961-1961
: Remove extraneous f-prefix from string without placeholders.The f-string doesn't contain any placeholders, so the 'f' prefix is unnecessary.
-with pytest.raises(AppException, match=f"Sync type does not exist"): +with pytest.raises(AppException, match="Sync type does not exist"):🧰 Tools
🪛 Ruff (0.8.2)
1961-1961: f-string without any placeholders
Remove extraneous
f
prefix(F541)
kairon/shared/cognition/processor.py (2)
644-646
: Remove redundant.keys()
when checking enum membership
dict.__members__
already behaves like aset
-like view. Using.keys()
creates an unnecessary intermediate list and triggers the Ruff SIM118 warning.- if sync_type not in VaultSyncType.__members__.keys(): + if sync_type not in VaultSyncType.__members__: raise AppException("Sync type does not exist")🧰 Tools
🪛 Ruff (0.8.2)
644-644: Use
key not in dict
instead ofkey not in dict.keys()
Remove
.keys()
(SIM118)
892-893
: Typo in error messageMinor but worth fixing for clarity.
- await llm_processor.__delete_collection_points__(qdrant_collection, vector_ids, "Cannot delete stale points fro Qdrant!") + await llm_processor.__delete_collection_points__(qdrant_collection, vector_ids, "Cannot delete stale points from Qdrant!")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
kairon/api/app/routers/bot/data.py
(4 hunks)kairon/api/app/routers/bot/integrations.py
(1 hunks)kairon/events/definitions/petpooja_sync.py
(1 hunks)kairon/meta/processor.py
(1 hunks)kairon/shared/catalog_sync/catalog_sync_log_processor.py
(1 hunks)kairon/shared/catalog_sync/data_objects.py
(1 hunks)kairon/shared/cognition/processor.py
(5 hunks)tests/unit_test/action/action_test.py
(2 hunks)tests/unit_test/data_processor/data_processor_test.py
(25 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/unit_test/action/action_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- kairon/shared/catalog_sync/data_objects.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/unit_test/data_processor/data_processor_test.py (4)
kairon/shared/catalog_sync/data_objects.py (1)
CatalogProviderMapping
(29-35)kairon/shared/data/data_objects.py (1)
BotSyncConfig
(1108-1119)kairon/shared/models.py (1)
VaultSyncType
(120-122)kairon/shared/cognition/processor.py (8)
CognitionDataProcessor
(26-1014)get_restaurant_and_branch_name
(1008-1014)preprocess_push_menu_data
(682-727)preprocess_item_toggle_data
(730-747)resolve_image_link
(751-775)add_bot_sync_config
(988-1005)validate_data
(487-576)_validate_sync_type
(643-645)
kairon/api/app/routers/bot/integrations.py (6)
kairon/events/definitions/catalog_sync.py (3)
CatalogSync
(13-73)validate
(33-42)enqueue
(44-61)kairon/shared/auth.py (2)
Authentication
(30-465)get_current_user_and_bot
(83-109)kairon/shared/catalog_sync/data_objects.py (1)
CatalogSyncLogs
(9-24)kairon/shared/cognition/processor.py (1)
CognitionDataProcessor
(26-1014)kairon/shared/constants.py (1)
CatalogProvider
(127-128)kairon/shared/models.py (2)
User
(68-92)get_user
(84-87)
🪛 Ruff (0.8.2)
tests/unit_test/data_processor/data_processor_test.py
12-12: urllib.parse.urljoin
imported but unused
Remove unused import: urllib.parse.urljoin
(F401)
1961-1961: f-string without any placeholders
Remove extraneous f
prefix
(F541)
kairon/shared/cognition/processor.py
644-644: Use key not in dict
instead of key not in dict.keys()
Remove .keys()
(SIM118)
kairon/api/app/routers/bot/data.py
379-379: Do not perform function call Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
kairon/api/app/routers/bot/integrations.py
22-23: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
24-24: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
25-25: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
26-26: Do not perform function call Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
54-55: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
56-56: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
57-57: Do not perform function call Path
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
58-58: Do not perform function call Security
in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Python CI
- GitHub Check: Analyze (python)
- GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (6)
tests/unit_test/data_processor/data_processor_test.py (6)
1096-1223
: Test implementation looks robust with proper data setup and cleanup.The
test_preprocess_push_menu_data_success
test successfully:
- Sets up the required provider mappings and configuration
- Loads test data from a JSON file
- Creates fallback image data
- Verifies the preprocessing transforms data correctly
- Cleans up all test data afterward
This pattern is followed throughout the tests in this file, showing good testing practices with proper setup, verification, and cleanup.
1234-1286
: Test cases properly handle error scenarios.Both
test_preprocess_item_toggle_data_success
andtest_preprocess_item_toggle_data_no_provider_mapping
appropriately test both the happy path and the error case when provider mappings are missing, ensuring the code handles exceptions correctly.
1287-1412
: Complete image resolution test coverage.The image resolution tests cover all three scenarios:
- Resolving from a global fallback image
- Preferentially using a local image when available
- Handling the error case when no image is found
This ensures robust handling of image resolution for catalog items.
1414-1516
: Good configuration management testing.The bot sync configuration tests verify:
- Adding a new configuration works correctly
- Existing configurations aren't overwritten
- Restaurant and branch name retrieval functions properly
- Errors are raised when configurations don't exist
This comprehensive testing ensures the configuration management functionality is reliable.
2682-2691
: Good testing of validation functions.The tests for
_validate_sync_type
verify both success and failure cases:
- Valid sync types are accepted
- Invalid sync types raise the expected exception
These tests ensure the validation logic works correctly before it's used in more complex operations.
1800-1800
: Parameter renaming consistently applied throughout the tests.The parameter renaming from
event_type
tosync_type
has been consistently applied across all test methods, aligning with the model changes mentioned in the PR summary.Also applies to: 1849-1849, 1862-1862, 1943-1943, 1957-1957, 1975-1975, 1993-1993, 2023-2023, 2033-2033, 2060-2060, 2075-2075, 2102-2102, 2117-2117, 2160-2160, 2182-2182, 2244-2244, 2283-2283, 2361-2361, 2399-2399, 2457-2457
raise AppException("Invalid Provider")
Incorrect Enum membership check – always evaluates to False

CatalogSyncClass.__members__.values() returns the Enum members, not their .value strings, while request_data.provider is a plain string. Use the _value2member_map_ helper or compare against a comprehension.
-if request_data.provider not in CatalogSyncClass.__members__.values():
+if request_data.provider not in CatalogSyncClass._value2member_map_:
raise AppException("Invalid Provider")
Without this fix every valid provider string will be rejected.
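A minimal check with a hypothetical stand-in enum shows why the original membership test always fails for plain strings:

```python
from enum import Enum


class CatalogSyncClass(Enum):  # hypothetical stand-in for the real enum
    petpooja = "petpooja"


provider = "petpooja"  # plain string taken from the request payload

# __members__.values() yields Enum members, so a plain string never matches
assert provider not in list(CatalogSyncClass.__members__.values())

# _value2member_map_ maps raw values to members, so the string check works
assert provider in CatalogSyncClass._value2member_map_
```

An equivalent alternative is comparing against a comprehension of values, e.g. {m.value for m in CatalogSyncClass}.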
        except Exception as e:
            print(str(e))
            execution_id = CatalogSyncLogProcessor.get_execution_id_for_bot(self.bot)
🛠️ Refactor suggestion
Replace print with proper logging
Using print() for error reporting doesn't integrate with the application's logging system. Use a logger instead.
- print(str(e))
+ logger.error(f"Error during catalog sync execution: {str(e)}")
    def __init__(self, bot: Text, user: Text, provider: Text, **kwargs):
        """
        Initialise event.
        """
        self.bot = bot
        self.user = user
        self.provider = provider
        self.token = kwargs.get("token", "")
        self.sync_type = kwargs.get("sync_type", SyncType.item_toggle)
        self.data = []
🛠️ Refactor suggestion
Initialize self.data as a dictionary instead of a list
The self.data attribute is initialized as an empty list, but it's accessed as a dictionary in the rest of the code (e.g., self.data.get("kv", []) on line 76).
- self.data = []
+ self.data = {}
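A quick sketch of why the initial type matters for the later self.data.get("kv", []) access:

```python
# A list has no .get(), so initialising self.data = [] makes the later
# self.data.get("kv", []) call raise AttributeError.
data = []
try:
    data.get("kv", [])
except AttributeError:
    pass  # 'list' object has no attribute 'get'

# A dict supports .get() with a default, which is what the sync code relies on.
data = {}
assert data.get("kv", []) == []           # missing key falls back to the default
data["kv"] = [{"id": 1}]
assert data.get("kv", []) == [{"id": 1}]  # present key returns the stored list
```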
stale_primary_keys = kwargs.get("stale_primary_keys")
status = "Failure"
🛠️ Refactor suggestion
Add null check for stale_primary_keys
stale_primary_keys could be None if coming from kwargs, which would cause issues when used in conditionals later.
stale_primary_keys = kwargs.get("stale_primary_keys")
+ stale_primary_keys = stale_primary_keys or []
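The suggested guard normalises both a missing key and an explicit None; a minimal sketch (the helper name here is hypothetical):

```python
# kwargs.get() returns None when the key is absent (or was passed as None);
# "or []" collapses both cases to an empty list so later iteration and
# membership checks are safe.
def read_stale_keys(**kwargs):
    stale_primary_keys = kwargs.get("stale_primary_keys")
    stale_primary_keys = stale_primary_keys or []
    return stale_primary_keys

assert read_stale_keys() == []                           # key absent
assert read_stale_keys(stale_primary_keys=None) == []    # explicit None
assert read_stale_keys(stale_primary_keys=["k1"]) == ["k1"]
```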
mail_type="catalog_sync_status", email="himanshu.gupta@nimblework.com", bot = self.bot, executionID = execution_id,
sync_status=SYNC_STATUS.VALIDATING_FAILED, message = str(e), first_name = "HG"
)
🛠️ Refactor suggestion
Replace hardcoded email address
The email address is hardcoded. This should be configurable or retrieved from user/bot settings.
- mail_type="catalog_sync_status", email="himanshu.gupta@nimblework.com", bot = self.bot, executionID = execution_id,
+ mail_type="catalog_sync_status", email=self.user, bot=self.bot, executionID=execution_id,
sync_status=SYNC_STATUS.VALIDATING_FAILED, message = str(e), first_name = "HG"
mail_type="catalog_sync_status", email="himanshu.gupta@nimblework.com", bot=self.bot,
executionID=execution_id,
sync_status=sync_status, message=str(e), first_name="HG"
)
🛠️ Refactor suggestion
Replace hardcoded email address
The email address is hardcoded. This should be configurable or retrieved from user/bot settings.
- mail_type="catalog_sync_status", email="himanshu.gupta@nimblework.com", bot=self.bot,
+ mail_type="catalog_sync_status", email=self.user, bot=self.bot,
executionID=execution_id,
sync_status=sync_status, message=str(e), first_name="HG"
await MailUtility.format_and_send_mail(
    mail_type="catalog_sync_status", email="himanshu.gupta@nimblework.com", bot=self.bot,
    executionID=execution_id,
    sync_status=sync_status, message=str(e), first_name="HG"
)
🛠️ Refactor suggestion
Replace hardcoded email address
The email address is hardcoded. This should be configurable or retrieved from user/bot settings.
- mail_type="catalog_sync_status", email="himanshu.gupta@nimblework.com", bot=self.bot,
+ mail_type="catalog_sync_status", email=self.user, bot=self.bot,
executionID=execution_id,
sync_status=sync_status, message=str(e), first_name="HG"
mail_type="catalog_sync_status", email="himanshu.gupta@nimblework.com", bot=self.bot,
executionID=execution_id,
sync_status=sync_status, message="Catalog has been synced successfully", first_name="HG"
)
🛠️ Refactor suggestion
Replace hardcoded email address
The email address is hardcoded. This should be configurable or retrieved from user/bot settings.
- mail_type="catalog_sync_status", email="himanshu.gupta@nimblework.com", bot=self.bot,
+ mail_type="catalog_sync_status", email=self.user, bot=self.bot,
executionID=execution_id,
sync_status=sync_status, message="Catalog has been synced successfully", first_name="HG"
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/unit_test/data_processor/data_processor_test.py (2)
12-12: Remove unused import. The urllib.parse.urljoin import is declared but never used in the code.

-from urllib.parse import urljoin

🧰 Tools
🪛 Ruff (0.8.2)
12-12: urllib.parse.urljoin imported but unused. Remove unused import: urllib.parse.urljoin (F401)

1961-1961: Remove unnecessary f-string. This string doesn't contain any placeholders, so the f prefix is unnecessary.

-with pytest.raises(AppException, match=f"Sync type does not exist"):
+with pytest.raises(AppException, match="Sync type does not exist"):

🧰 Tools
🪛 Ruff (0.8.2)
1961-1961: f-string without any placeholders. Remove extraneous f prefix (F541)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/unit_test/data_processor/data_processor_test.py
(28 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/unit_test/data_processor/data_processor_test.py (4)
kairon/shared/catalog_sync/data_objects.py (1)
  CatalogProviderMapping (29-35)
kairon/shared/data/data_objects.py (1)
  BotSyncConfig (1108-1119)
kairon/shared/models.py (1)
  VaultSyncType (120-122)
kairon/shared/cognition/processor.py (8)
  CognitionDataProcessor (26-1014)
  get_restaurant_and_branch_name (1008-1014)
  preprocess_push_menu_data (682-727)
  preprocess_item_toggle_data (730-747)
  resolve_image_link (751-775)
  add_bot_sync_config (988-1005)
  validate_data (487-576)
  _validate_sync_type (643-645)
🪛 Ruff (0.8.2)
tests/unit_test/data_processor/data_processor_test.py
12-12: urllib.parse.urljoin
imported but unused
Remove unused import: urllib.parse.urljoin
(F401)
1961-1961: f-string without any placeholders
Remove extraneous f
prefix
(F541)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Python CI
🔇 Additional comments (6)
tests/unit_test/data_processor/data_processor_test.py (6)
1096-1219: Good test coverage for the push menu preprocessing functionality. The test thoroughly validates the Petpooja data preprocessing logic, including provider mapping lookups, field transformations, and image resolution. The setup and cleanup are properly handled.

1224-1232: Good error case handling. The test appropriately verifies that attempting to preprocess data with a non-existent provider mapping raises an exception.

1287-1327: Comprehensive testing of image link resolution. The test suite covers all scenarios for image link resolution:
- Global fallback image
- Local item-specific image (takes precedence over global)
- Error handling when no images are found
This ensures the catalog sync feature will correctly handle various image availability scenarios.
Also applies to: 1329-1384, 1386-1412

1414-1444: Well-structured bot sync configuration tests. These tests correctly validate both the initial configuration creation and the handling of existing configurations, ensuring data integrity.
Also applies to: 1446-1486

2267-2273: Good mock assertion for embedding calls. The test properly verifies that the embedding function is called with the correct parameters, including the input data and metadata.

2503-2512: Good unit tests for sync type validation. These tests properly check both valid and invalid sync type scenarios, ensuring the validation logic works correctly.
Actionable comments posted: 2
♻️ Duplicate comments (4)
tests/unit_test/data_processor/catalog_sync_log_processor_test.py (4)
116-118: Use specific exception type instead of generic Exception. Using generic Exception with pytest.raises() can mask real issues by capturing unexpected exceptions. This was flagged in a previous review.

-with pytest.raises(Exception):
+with pytest.raises(AppException):  # Use the appropriate exception type
     CatalogSyncLogProcessor.is_sync_in_progress(bot)

🧰 Tools
🪛 Ruff (0.8.2)
116-116: pytest.raises(Exception) should be considered evil (B017)

124-134: Use specific exception types and avoid bare except blocks. Both the bare except block and the generic Exception in pytest.raises() should be replaced with specific exception types. This was flagged in a previous review.

 try:
     bot_settings = BotSettings.objects(bot=bot).get()
     bot_settings.catalog_sync_limit_per_day = 0
-except:
+except BotSettings.DoesNotExist:  # Be specific about what exception you're catching
     bot_settings = BotSettings(bot=bot, catalog_sync_limit_per_day=0, user="test")
     bot_settings.save()

-with pytest.raises(Exception):
+with pytest.raises(AppException):  # Use the appropriate exception type
     assert CatalogSyncLogProcessor.is_limit_exceeded(bot)

🧰 Tools
🪛 Ruff (0.8.2)
129-129: Do not use bare except (E722)
132-132: pytest.raises(Exception) should be considered evil (B017)

246-247: Use specific exception type instead of generic Exception. Using generic Exception with pytest.raises() can mask unexpected errors. This was flagged in a previous review.

-with pytest.raises(Exception):
+with pytest.raises(AppException):  # Or the specific exception type
     CatalogSyncLogProcessor.validate_item_ids(push_menu_payload)

🧰 Tools
🪛 Ruff (0.8.2)
246-246: pytest.raises(Exception) should be considered evil (B017)

330-331: Use specific exception type instead of generic Exception. Using generic Exception with pytest.raises() can mask unexpected errors. This was flagged in a previous review.

-with pytest.raises(Exception, match="Push menu processing is disabled for this bot"):
+with pytest.raises(AppException, match="Push menu processing is disabled for this bot"):  # Use the appropriate exception type
     CatalogSyncLogProcessor.is_sync_type_allowed(bot, SyncType.push_menu)
🧹 Nitpick comments (7)
tests/unit_test/data_processor/catalog_sync_log_processor_test.py (6)
7-7: Remove unused import. The uni_escape_re from babel.messages.jslexer is imported but not used anywhere in the code.

-from babel.messages.jslexer import uni_escape_re

🧰 Tools
🪛 Ruff (0.8.2)
7-7: babel.messages.jslexer.uni_escape_re imported but unused. Remove unused import: babel.messages.jslexer.uni_escape_re (F401)
11-11: Remove unused import. AppException is imported but never used in this file.

-from kairon.exceptions import AppException

🧰 Tools
🪛 Ruff (0.8.2)
11-11: kairon.exceptions.AppException imported but unused. Remove unused import: kairon.exceptions.AppException (F401)
15-15: Remove unused import. CognitionDataProcessor is imported but never used in this file.

-from kairon.shared.cognition.processor import CognitionDataProcessor

🧰 Tools
🪛 Ruff (0.8.2)
15-15: kairon.shared.cognition.processor.CognitionDataProcessor imported but unused. Remove unused import: kairon.shared.cognition.processor.CognitionDataProcessor (F401)
24-24: Use uppercase for environment variable names. Environment variables should follow the convention of using uppercase names.

-os.environ["system_file"] = "./tests/testing_data/system.yaml"
+os.environ["SYSTEM_FILE"] = "./tests/testing_data/system.yaml"

Make sure to update any code that references this environment variable elsewhere.

🧰 Tools
🪛 Ruff (0.8.2)
24-24: Use capitalized environment variable SYSTEM_FILE instead of system_file. Replace system_file with SYSTEM_FILE (SIM112)
232-237: Handle exceptions properly and avoid unused variables. The variable e is assigned but never used. Consider either logging the exception or using the exception message in your assertion.

 try:
     CatalogSyncLogProcessor.validate_item_ids(push_menu_payload)
     itemid_missing = False
-except Exception as e:
+except Exception:  # If you're not using the exception, don't capture it
     itemid_missing = True

Better yet, use pytest's assertion mechanism for more explicit testing:

-try:
-    CatalogSyncLogProcessor.validate_item_ids(push_menu_payload)
-    itemid_missing = False
-except Exception as e:
-    itemid_missing = True
-
-assert itemid_missing is False
+# This doesn't raise an exception when valid
+CatalogSyncLogProcessor.validate_item_ids(push_menu_payload)

🧰 Tools
🪛 Ruff (0.8.2)
235-235: Local variable e is assigned to but never used. Remove assignment to unused variable e (F841)
241-247: Add a test for the exact exception message when validating item IDs. The test verifies that an exception is raised, but doesn't check the exception message. Adding a match parameter would make the test more precise.

 with push_menu_payload_path.open("r", encoding="utf-8") as f:
     push_menu_payload = json.load(f)
-with pytest.raises(Exception):
+with pytest.raises(AppException, match="itemID is missing for one or more items"):  # Match the expected error message
     CatalogSyncLogProcessor.validate_item_ids(push_menu_payload)

🧰 Tools
🪛 Ruff (0.8.2)
246-246: pytest.raises(Exception) should be considered evil (B017)
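The pattern recommended throughout this review — a specific exception type plus a match pattern — can be sketched in isolation (the exception class, validator, and message below are stand-ins, not kairon's real code):

```python
import pytest

class AppException(Exception):
    """Stand-in for kairon.exceptions.AppException."""

def validate_sync_type(sync_type):
    # Hypothetical validator mirroring the code under test.
    if sync_type not in ("push_menu", "item_toggle"):
        raise AppException("Sync type does not exist")

def test_invalid_sync_type_raises():
    # Pinning both the type AND the message means an unrelated KeyError or a
    # changed error string fails the test instead of passing silently.
    with pytest.raises(AppException, match="Sync type does not exist"):
        validate_sync_type("unknown")

test_invalid_sync_type_raises()
```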
tests/unit_test/data_processor/data_processor_test.py (1)

1155-1220: Make equality assertion order-agnostic. assert result == expected_result will fail when the processor returns the same items in a different order (e.g., when a DB query or dict iteration order changes). Compare the lists after sorting by id instead, keeping the test robust to non-deterministic ordering.

-assert result == expected_result
+# Order-independent comparison
+assert sorted(result["meta"], key=lambda x: x["id"]) == \
+       sorted(expected_result["meta"], key=lambda x: x["id"])
+assert sorted(result["kv"], key=lambda x: x["id"]) == \
+       sorted(expected_result["kv"], key=lambda x: x["id"])
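The order-independent comparison can be sketched with illustrative payloads (the field names and values are assumptions, not the real test fixtures):

```python
# Same items, different order: a plain == on the dicts would fail.
result = {
    "meta": [{"id": 2, "title": "Paneer"}, {"id": 1, "title": "Dosa"}],
    "kv":   [{"id": 1, "item": "Dosa"}, {"id": 2, "item": "Paneer"}],
}
expected_result = {
    "meta": [{"id": 1, "title": "Dosa"}, {"id": 2, "title": "Paneer"}],
    "kv":   [{"id": 1, "item": "Dosa"}, {"id": 2, "item": "Paneer"}],
}

# Sorting each list by "id" before comparing makes the assertion robust to
# non-deterministic ordering from the database or dict iteration.
assert sorted(result["meta"], key=lambda x: x["id"]) == \
       sorted(expected_result["meta"], key=lambda x: x["id"])
assert sorted(result["kv"], key=lambda x: x["id"]) == \
       sorted(expected_result["kv"], key=lambda x: x["id"])
```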
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
tests/unit_test/data_processor/catalog_sync_log_processor_test.py
(1 hunks)tests/unit_test/data_processor/data_processor_test.py
(29 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/unit_test/data_processor/catalog_sync_log_processor_test.py
7-7: babel.messages.jslexer.uni_escape_re
imported but unused
Remove unused import: babel.messages.jslexer.uni_escape_re
(F401)
11-11: kairon.exceptions.AppException
imported but unused
Remove unused import: kairon.exceptions.AppException
(F401)
15-15: kairon.shared.cognition.processor.CognitionDataProcessor
imported but unused
Remove unused import: kairon.shared.cognition.processor.CognitionDataProcessor
(F401)
24-24: Use capitalized environment variable SYSTEM_FILE
instead of system_file
Replace system_file
with SYSTEM_FILE
(SIM112)
116-116: pytest.raises(Exception)
should be considered evil
(B017)
129-129: Do not use bare except
(E722)
132-132: pytest.raises(Exception)
should be considered evil
(B017)
235-235: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
246-246: pytest.raises(Exception)
should be considered evil
(B017)
tests/unit_test/data_processor/data_processor_test.py
12-12: urllib.parse.urljoin
imported but unused
Remove unused import: urllib.parse.urljoin
(F401)
1961-1961: f-string without any placeholders
Remove extraneous f
prefix
(F541)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Analyze (python)
- GitHub Check: Python CI
🔇 Additional comments (1)
tests/unit_test/data_processor/catalog_sync_log_processor_test.py (1)
1-572: Comprehensive test suite for catalog sync functionality. This is a well-structured and thorough test suite that covers all the critical functionality of the CatalogSyncLogProcessor. The tests verify log creation, status tracking, validation logic, configuration checking, and error handling - all essential for a robust catalog synchronization system.
I especially appreciate:
- Setting up and tearing down test data for each test case
- Testing both successful and error conditions
- Validating specific error messages
- Testing integration with dependent components like BotSettings and CognitionSchema
Good test coverage lays a solid foundation for the catalog integration between Petpooja (POS) and Meta as outlined in the PR objectives.
from pathlib import Path
from typing import List
from urllib.parse import urljoin
Remove unused import to satisfy Ruff F401
urljoin
is imported but never used anywhere in the test module. Ruff flags this as F401
, which will fail lint-only CI stages.
-from urllib.parse import urljoin
🧰 Tools
🪛 Ruff (0.8.2)
12-12: urllib.parse.urljoin
imported but unused
Remove unused import: urllib.parse.urljoin
(F401)
    processor = CognitionDataProcessor()

-    with pytest.raises(AppException, match=f"Event type does not exist"):
+    with pytest.raises(AppException, match=f"Sync type does not exist"):
Remove extraneous f-string prefix
There are no interpolations in the string, so the f
prefix is unnecessary and triggers Ruff F541
.
-with pytest.raises(AppException, match=f"Sync type does not exist"):
+with pytest.raises(AppException, match="Sync type does not exist"):
🧰 Tools
🪛 Ruff (0.8.2)
1961-1961: f-string without any placeholders
Remove extraneous f
prefix
(F541)
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation
Chores