Skip to content

Conversation

Ravishekhar7870
Copy link
Contributor

@Ravishekhar7870 Ravishekhar7870 commented Sep 19, 2025

Description

unlink file from disk after transaction of postgress and vespa ingestion of the file fails

Testing

locally tested

Additional Notes

Summary by CodeRabbit

  • New Features

    • Batch uploads now report precise results: “Successfully added X out of Y files” for both new-collection and add-file flows.
    • Uploads now persist additional storage metadata to improve traceability.
  • Bug Fixes

    • Replaced unconditional success message with failure notice when no files upload.
    • Removes leftover files from failed uploads to avoid clutter.
    • Safer handling when collection metadata (like name) is missing.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 19, 2025

Caution

Review failed

The pull request is closed.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds frontend accumulation and conditional toasts for partial-success batch uploads; server upload handler now uses mutable storageKey/storagePath, persists storagePath in Vespa docs, and attempts per-file on-disk cleanup on failures while adding richer file metadata.

Changes

Cohort / File(s) Summary of changes
Frontend: batch upload feedback
frontend/src/routes/_authenticated/knowledgeManagement.tsx
Introduces successfullyUploadedFiles accumulator; captures per-batch uploadedResult.summary.successful; conditionally shows success toast ("Successfully added X out of Y files") or failure toast; applies to both new-collection and add-files flows; minor formatting tweaks.
Server: upload storage & metadata
server/api/knowledgeBase.ts
Makes storageKey/storagePath mutable and reused for writes and Vespa docs; computes storagePath via resolver; includes storagePath and richer metadata (originalFileName, uploadedBy, chunk counts, processingMethod, etc.) in Vespa docs; attempts unlink on per-file failures and logs cleanup errors; uses collection?.name when composing fileName; minor quoting/formatting edits.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant FE as Frontend (KnowledgeManagement)
  participant S as Server API (Upload)
  participant FS as Filesystem
  participant V as Vespa
  participant LOG as Logger

  U->>FE: Select files & start upload
  loop per batch
    FE->>S: POST batch
    S->>FS: write file -> storagePath
    alt write success
      S->>V: upsert doc (includes storagePath, metadata)
      S-->>FE: UploadedResult { summary.successful }
    else write/error
      S->>FS: unlink(storagePath) for failed file
      S->>LOG: warn if unlink fails
      S-->>FE: UploadedResult { summary.successful }
    end
    FE->>FE: accumulate summary.successful
  end
  alt any successful
    FE-->>U: Toast "Successfully added X out of Y files"
  else all failed
    FE-->>U: Toast "Add Files Failed"
  end
Loading
sequenceDiagram
  autonumber
  participant S as Server Upload Handler
  participant PR as Path Resolver
  participant FS as Filesystem
  participant V as Vespa
  participant LOG as Logger

  S->>PR: resolve storagePath(file, collection?)
  PR-->>S: storagePath (mutable)
  S->>FS: stream write -> storagePath
  alt success
    S->>V: upsert doc { fileName: collection?.name?, storagePath, metadata }
    V-->>S: ack
    S-->>Caller: success summary
  else failure
    S->>FS: unlink(storagePath)
    alt unlink error
      S->>LOG: warn cleanup failed
    end
    S-->>Caller: partial/failed summary
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • zereraz
  • shivamashtikar
  • kalpadhwaryu
  • junaid-shirur
  • devesh-juspay

Poem

I counted bytes with whiskered care,
Bits tucked in burrows, storage to spare.
Some slipped away — I swept the floor,
Then toasted “X of Y” and hopped for more. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title correctly identifies the primary fix — unlinking/deleting files from disk when the PostgreSQL transaction or Vespa ingestion fails — and aligns with the PR description and server-side changes, so it is related to the main changes; however the title is somewhat long and contains a misspelling ("postgress") and awkward phrasing that could be tightened for clarity.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 48dfbe9 and 940a30a.

📒 Files selected for processing (1)
  • frontend/src/routes/_authenticated/knowledgeManagement.tsx (6 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @Ravishekhar7870, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where files could be left on the disk if their processing and ingestion into the database or search index failed. It introduces a cleanup mechanism to ensure that such orphaned files are removed, thereby improving data consistency and resource management. Concurrently, the user experience for file uploads has been refined to offer more precise feedback on the success or failure of batch uploads.

Highlights

  • Robust File Cleanup: Implemented logic to automatically delete files from disk if their subsequent ingestion into PostgreSQL or Vespa fails, preventing orphaned files and ensuring data consistency.
  • Improved User Feedback for File Uploads: Enhanced frontend toast notifications to provide more granular feedback, showing the exact number of successfully uploaded files in a batch and displaying a specific failure message when no files are added.
  • Frontend Progress Tracking: Introduced a mechanism to track and display the count of successfully uploaded files during the batch processing on the frontend.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly implements a cleanup mechanism to unlink files from disk if the ingestion transaction fails, preventing orphaned files. The frontend changes also improve user feedback by showing the number of successfully uploaded files. My review focuses on improving code quality and maintainability. I've suggested using the structured logger instead of console.log for better error tracking on the server, removing an unused import, and several minor improvements to code style and readability for consistency with common practices.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
frontend/src/routes/_authenticated/knowledgeManagement.tsx (4)

770-781: Propagate partial‑success handling to “create new collection” flow.

Right now you ignore the per‑batch result and always toast as if all files succeeded. Mirror the add‑to‑collection logic to avoid misleading UX.

Apply this diff:

       for (let i = 0; i < batches.length; i++) {
         setBatchProgress((prev: typeof batchProgress) => ({
           ...prev,
           batch: i + 1,
         }))
         const batchFiles = batches[i].map((f) => f.file)
-       await uploadFileBatch(batchFiles, cl.id)
+       const uploadedResult = await uploadFileBatch(batchFiles, cl.id)
+       const successfulInBatch = Number(uploadedResult?.summary?.successful ?? 0)
+       // Optionally accumulate if you want to reflect partial success in the final toast
+       // keep a running counter declared before the loop: let successfullyUploadedFiles = 0
+       successfullyUploadedFiles += successfulInBatch
         setBatchProgress((prev: typeof batchProgress) => ({
           ...prev,
           current: prev.current + batchFiles.length,
         }))
       }

And later adjust the toast (see next comment) to reflect partial success.


835-840: Toast should reflect actual upload outcome (partial successes).

Use the accumulated successful count or updatedCl.totalCount rather than selectedFiles.length.

Apply this diff:

-      showToast(
-        "Knowledge Base Created",
-        `Successfully created knowledge base "${collectionName.trim()}" with ${selectedFiles.length} files.`,
-      )
+      const totalRequested = selectedFiles.length
+      const totalSucceeded = typeof successfullyUploadedFiles === 'number'
+        ? successfullyUploadedFiles
+        : Number(updatedCl?.totalCount ?? 0)
+      showToast(
+        "Knowledge Base Created",
+        `Added ${totalSucceeded} of ${totalRequested} files to "${collectionName.trim()}".`
+      )

904-930: Defensive read of API response to avoid runtime errors on non‑JSON bodies.

uploadFileBatch can return a non‑JSON fallback object; directly accessing uploadedResult.summary.successful can throw. Guard it.

Apply this diff:

-      let successfullyUploadedFiles=0;
+      let successfullyUploadedFiles = 0;
       const batches = createBatches(selectedFiles, addingToCollection.name)
@@
-        const batchFiles = batches[i].map((f) => f.file)
-      const uploadedResult=  await uploadFileBatch(
+        const batchFiles = batches[i].map((f) => f.file)
+        const uploadedResult = await uploadFileBatch(
           batchFiles,
           addingToCollection.id,
           targetFolder?.id,
         )
-        successfullyUploadedFiles=successfullyUploadedFiles+ uploadedResult.summary.successful;
+        const successfulInBatch = Number(uploadedResult?.summary?.successful ?? 0)
+        successfullyUploadedFiles += successfulInBatch

976-984: Simplify conditional toast; avoid ternary used for side‑effects.

Readability nit: prefer a simple if/else.

Apply this diff:

-      successfullyUploadedFiles? showToast(
-        "Files Added",
-        `Successfully added ${successfullyUploadedFiles} out of ${selectedFiles.length} files to collection "${addingToCollection.name}".`,
-      ):
-      showToast(
-        "Add Files Failed",
-        "Failed to add files to collection. Please try again.",
-        true,
-      )
+      if (successfullyUploadedFiles > 0) {
+        showToast(
+          "Files Added",
+          `Successfully added ${successfullyUploadedFiles} out of ${selectedFiles.length} files to collection "${addingToCollection.name}".`
+        )
+      } else {
+        showToast(
+          "Add Files Failed",
+          "Failed to add files to collection. Please try again.",
+          true
+        )
+      }
server/api/knowledgeBase.ts (3)

52-53: Remove unused import.

storage from googleapis is unused and may fail lint/noUnusedLocals.

Apply this diff:

-import { storage } from "googleapis/build/src/apis/storage"

1243-1246: Unnecessary optional chaining; also normalize composed path.

collection is already validated; optional chaining is noise. Consider precomputing path to avoid duplicate logic.

Apply this diff:

-            fileName:
-              targetPath === "/"
-                ? collection?.name + targetPath + filePath
-                : collection?.name + targetPath + fileName,
+            fileName: collection.name + targetPath + (targetPath === "/" ? filePath : fileName),

1298-1306: Use structured logger for cleanup errors; avoid console.log.

Aligns with existing logging strategy and preserves context.

Apply this diff:

-        if(storagePath){
-          try{
-            await unlink(storagePath);
-          }
-          catch(err){
-            console.log(`Failed to clean up storage file at ${storagePath}: ${getErrorMessage(err)}`);
-          }
-          
-        }
+        if (storagePath) {
+          try {
+            await unlink(storagePath)
+          } catch (err) {
+            loggerWithChild({ email: userEmail }).warn(
+              `Failed to clean up storage file at ${storagePath}: ${getErrorMessage(err)}`
+            )
+          }
+        }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8ddd939 and b6c8c39.

📒 Files selected for processing (2)
  • frontend/src/routes/_authenticated/knowledgeManagement.tsx (5 hunks)
  • server/api/knowledgeBase.ts (8 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-16T11:56:22.752Z
Learnt from: oindrila-b
PR: xynehq/xyne#545
File: server/integrations/google/index.ts:137-145
Timestamp: 2025-06-16T11:56:22.752Z
Learning: In server/integrations/google/index.ts, both Logger (base logger) and loggerWithChild (factory for email-scoped child loggers) are intentionally maintained to handle different logging scenarios - one for when email context is not available and one for when it is available.

Applied to files:

  • server/api/knowledgeBase.ts
🧬 Code graph analysis (2)
frontend/src/routes/_authenticated/knowledgeManagement.tsx (1)
frontend/src/utils/fileUtils.ts (1)
  • uploadFileBatch (222-269)
server/api/knowledgeBase.ts (2)
server/db/knowledgeBase.ts (2)
  • generateStorageKey (763-765)
  • generateFileVespaDocId (768-770)
server/utils.ts (1)
  • getErrorMessage (89-92)
🔇 Additional comments (2)
server/api/knowledgeBase.ts (2)

1019-1205: Good: on‑disk rollback on failure now in place.

Declaring storagePath before write and unlinking on error covers DB/Vespa/processing failures; this meets the PR objective.

Please confirm we also cover failures after DB commit but before response write (rare). If needed, consider a best‑effort compensating task/queue to unlink orphaned files discovered later.


1184-1189: Incorrect — workspaceExternalId is non-null on SelectUser; no change needed.

selectUserSchema defines workspaceExternalId as notNull (server/db/schema/users.ts) and user objects returned by getUserByEmail include it; the codebase already relies on user.workspaceExternalId (e.g., server/api/knowledgeBase.ts:1184–1189), so passing user.workspaceExternalId to getStoragePath is safe.

Likely an incorrect or invalid review comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
server/api/knowledgeBase.ts (2)

52-52: Remove unused import.

The storage import from googleapis is not used anywhere in this file and should be removed to keep the code clean.

-import { storage } from "googleapis/build/src/apis/storage"

1298-1308: Fix error logging and use structured logger.

The error handling for file cleanup has syntax issues and uses incomplete logging. The loggerWithChild call is malformed and missing proper parameters.

         if (storagePath) {
           try {
             await unlink(storagePath)
           } catch (err) {
-           loggerWithChild({ email: userEmail,  }).error(
-            error,
-              `Failed to clean up storage file`
-            );
+            loggerWithChild({ 
+              email: userEmail, 
+              storagePath, 
+              error: getErrorMessage(err) 
+            }).warn(
+              `Failed to clean up storage file`
+            )
           }
         }
🧹 Nitpick comments (4)
server/api/knowledgeBase.ts (4)

1022-1022: Initialize variable with proper spacing.

For consistency and readability, add spaces around the assignment operator.

-      let storagePath = ""
+      let storagePath = ""

1181-1181: Consider using const for immutable variable.

Since storageKey doesn't appear to be reassigned after generation, consider using const instead of let.

-        let storageKey = generateStorageKey()
+        const storageKey = generateStorageKey()

1619-1637: Verify Vespa response structure handling.

The code directly accesses resp.fields properties without proper null checks on intermediate objects. While there are some checks, they could be more comprehensive.

-    const index = resp.fields.chunks_pos.findIndex(
-      (pos: number) => pos === chunkIndex,
-    )
+    const chunksPos = resp.fields.chunks_pos
+    const chunks = resp.fields.chunks
+    
+    if (!Array.isArray(chunksPos) || !Array.isArray(chunks)) {
+      throw new HTTPException(404, { message: "Invalid chunk data format" })
+    }
+    
+    const index = chunksPos.findIndex((pos: number) => pos === chunkIndex)

1772-1787: Review filename sanitization logic.

The filename sanitization functions have potential issues:

  1. Line 1779: The regex replacement might be too aggressive
  2. Line 1785-1786: The chaining of escape and unescape calls seems redundant
     function sanitizeFilename(name: string): string {
-      // Replace non-ASCII and problematic characters with '_'
-      return name.replace(/[^\x20-\x7E]|["\\]/g, "_")
+      // Replace control characters and problematic characters with '_'
+      return name.replace(/[\x00-\x1f\x7f-\x9f"\\<>:|?*]/g, "_")
     }
     
     // RFC 5987 encoding for filename*
     function encodeRFC5987ValueChars(str: string): string {
-      return encodeURIComponent(str)
-        .replace(/['()]/g, escape)
-        .replace(/%(7C|60|5E)/g, unescape)
+      return encodeURIComponent(str)
+        .replace(/['()]/g, (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`)
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6c8c39 and 3d98484.

📒 Files selected for processing (1)
  • server/api/knowledgeBase.ts (11 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-16T11:56:22.752Z
Learnt from: oindrila-b
PR: xynehq/xyne#545
File: server/integrations/google/index.ts:137-145
Timestamp: 2025-06-16T11:56:22.752Z
Learning: In server/integrations/google/index.ts, both Logger (base logger) and loggerWithChild (factory for email-scoped child loggers) are intentionally maintained to handle different logging scenarios - one for when email context is not available and one for when it is available.

Applied to files:

  • server/api/knowledgeBase.ts
🧬 Code graph analysis (1)
server/api/knowledgeBase.ts (1)
server/db/knowledgeBase.ts (3)
  • getCollectionItemsByParent (142-164)
  • generateStorageKey (763-765)
  • generateFileVespaDocId (768-770)
🔇 Additional comments (2)
server/api/knowledgeBase.ts (2)

1686-1686: Handle potential undefined originalName gracefully.

The fallback to "file" when originalName is missing is good defensive programming.


1244-1245: Replace optional chaining with direct access — collection is validated earlier.

This handler calls getCollectionById and immediately throws if not found (server/api/knowledgeBase.ts:903–906), so collection is non-null when the Vespa doc is created (server/api/knowledgeBase.ts ≈1160–1260 / original lines 1244–1245). Use collection.name (not collection?.name) to avoid undefined prefixes in Vespa fileName.

SahilKumar000
SahilKumar000 previously approved these changes Sep 19, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
server/api/knowledgeBase.ts (1)

1297-1307: Cleanup logging uses wrong error variable and misses context; treat ENOENT as non-issue.

You're logging error from the outer catch instead of err, and severity should be warn. Include storagePath and the unlink error message.

Apply this diff:

-        if (storagePath) {
-          try {
-            await unlink(storagePath)
-          } catch (err) {
-           loggerWithChild({ email: userEmail,  }).error(
-            error,
-              `Failed to clean up storage file`
-            );
-
-          }
-        }
+        if (storagePath) {
+          try {
+            await unlink(storagePath)
+          } catch (err) {
+            // Ignore missing file; warn on other errors
+            if ((err as NodeJS.ErrnoException)?.code !== 'ENOENT') {
+              loggerWithChild({ email: userEmail, storagePath, error: getErrorMessage(err) })
+                .warn(`Failed to clean up storage file`)
+            }
+          }
+        }
🧹 Nitpick comments (5)
server/api/knowledgeBase.ts (5)

1180-1180: Prefer const for immutable storageKey.

storageKey is never reassigned; use const.

-        let storageKey = generateStorageKey()
+        const storageKey = generateStorageKey()

1021-1021: Use null sentinel for unset path.

"" can be mistaken for a valid path. Prefer null and check truthiness accordingly.

-      let storagePath = ""
+      let storagePath: string | null = null

1242-1245: Inconsistent Vespa fileName composition; avoid mixing filePath.

Using filePath in the root case can double-embed folder segments if logic changes. Build uniformly from collection.name, targetPath, and fileName.

Apply this diff:

-            fileName:
-              targetPath === "/"
-                ? collection?.name + targetPath + filePath
-                : collection?.name + targetPath + fileName,
+            fileName: `${collection!.name}${targetPath}${fileName}`,

249-283: Potential N+1 when includeItems=true.

Fetching items per collection can be heavy for large lists. Consider a batched query by collection IDs and grouping in memory, or gate behind a max collections threshold.

If interested, I can sketch a batched query using a single SELECT with collectionId IN (...) and grouping.


1777-1786: Avoid deprecated escape/unescape in RFC5987 encoding helper.

escape/unescape are deprecated. Use a safe replacer.

-    function encodeRFC5987ValueChars(str: string): string {
-      return encodeURIComponent(str)
-        .replace(/['()]/g, escape)
-        .replace(/%(7C|60|5E)/g, unescape)
-    }
+    function encodeRFC5987ValueChars(str: string): string {
+      // Encode as per RFC5987 without deprecated APIs
+      return encodeURIComponent(str)
+        .replace(/[!'()*]/g, (c) => `%${c.charCodeAt(0).toString(16).toUpperCase()}`)
+        .replace(/%20/g, '%20')
+    }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc3ff4c and 48dfbe9.

📒 Files selected for processing (2)
  • frontend/src/routes/_authenticated/knowledgeManagement.tsx (6 hunks)
  • server/api/knowledgeBase.ts (10 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • frontend/src/routes/_authenticated/knowledgeManagement.tsx
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-16T08:57:58.762Z
Learnt from: rahul1841
PR: xynehq/xyne#843
File: server/server.ts:996-996
Timestamp: 2025-09-16T08:57:58.762Z
Learning: The DownloadFileApi handler in server/api/knowledgeBase.ts properly uses the clId route parameter to scope and validate file access within the correct collection.

Applied to files:

  • server/api/knowledgeBase.ts
📚 Learning: 2025-09-16T08:57:58.762Z
Learnt from: rahul1841
PR: xynehq/xyne#843
File: server/server.ts:996-996
Timestamp: 2025-09-16T08:57:58.762Z
Learning: The DownloadFileApi handler in server/api/knowledgeBase.ts properly uses the clId route parameter to scope and validate file access within the correct collection, including fetching the collection, checking permissions, and verifying the file belongs to that collection.

Applied to files:

  • server/api/knowledgeBase.ts
📚 Learning: 2025-06-16T11:56:22.752Z
Learnt from: oindrila-b
PR: xynehq/xyne#545
File: server/integrations/google/index.ts:137-145
Timestamp: 2025-06-16T11:56:22.752Z
Learning: In server/integrations/google/index.ts, both Logger (base logger) and loggerWithChild (factory for email-scoped child loggers) are intentionally maintained to handle different logging scenarios - one for when email context is not available and one for when it is available.

Applied to files:

  • server/api/knowledgeBase.ts
🧬 Code graph analysis (1)
server/api/knowledgeBase.ts (1)
server/db/knowledgeBase.ts (3)
  • getCollectionItemsByParent (142-164)
  • generateStorageKey (763-765)
  • generateFileVespaDocId (768-770)
🔇 Additional comments (3)
server/api/knowledgeBase.ts (3)

1619-1636: Good defensive checks for Vespa response shape.

The added null/type guards and index lookup reduce 404s and undefined access.


1685-1685: Content-Disposition filename=UTF-8 encoding LGTM.*

Safer for non-ASCII filenames in previews.


1184-1189: Use user.workspaceId (not workspaceExternalId) for storage path

getStoragePath expects a workspace ID; passing user.workspaceExternalId (unused elsewhere) will likely be undefined at runtime.

File: server/api/knowledgeBase.ts Lines: 1184-1189

-        storagePath = getStoragePath(
-          user.workspaceExternalId,
+        storagePath = getStoragePath(
+          user.workspaceId,
           collectionId,
           storageKey,
           fileName,
         )

@junaid-shirur junaid-shirur merged commit 10585d6 into main Sep 19, 2025
3 of 4 checks passed
@junaid-shirur junaid-shirur deleted the fix/unlink-file-from-disk branch September 19, 2025 12:58
Ravishekhar7870 added a commit that referenced this pull request Sep 19, 2025
… vespa ingestion of the file fails (#905)

* fix: XYN-288 unlink file from disk after vespa ijection fails

* chore: XYN-288 resolved comments

* Fix spacing in variable declaration for clarity

* Fix assignment operator for successfullyUploadedFiles

* Remove unused Google Cloud Storage import

* Fix formatting of uploadedResult assignment

---------

Co-authored-by: Ravishekhar Yadav <ravishekhar.yadav@Ravishekhar-Yadav-GK4HXKX6DQ.local>
Co-authored-by: Sahil Kumar <119723019+SahilKumar000@users.noreply.github.com>
junaid-shirur pushed a commit that referenced this pull request Sep 19, 2025
* fix: XYN-288 unlink file from disk after transaction of postgress and vespa ingestion of the file fails (#905)

* fix: XYN-288 unlink file from disk after vespa ijection fails

* chore: XYN-288 resolved comments

* Fix spacing in variable declaration for clarity

* Fix assignment operator for successfullyUploadedFiles

* Remove unused Google Cloud Storage import

* Fix formatting of uploadedResult assignment

---------

Co-authored-by: Ravishekhar Yadav <ravishekhar.yadav@Ravishekhar-Yadav-GK4HXKX6DQ.local>
Co-authored-by: Sahil Kumar <119723019+SahilKumar000@users.noreply.github.com>

* fix: XYN-285 pdf view in citation (#881)

resolved issue of viewing pdf file in citation

Co-authored-by: Ravishekhar Yadav <ravishekhar.yadav@Ravishekhar-Yadav-GK4HXKX6DQ.local>

---------

Co-authored-by: Ravishekhar Yadav <ravishekhar.yadav@Ravishekhar-Yadav-GK4HXKX6DQ.local>
Co-authored-by: Sahil Kumar <119723019+SahilKumar000@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants