fix(agentPathSupport): path referred folder considered as integration #1021

naSim087 · 2025-10-01T07:30:22Z

Description

if the path is passed to AgentMessageApi , while doing the vespa call we replace the kb integration with the path referenced folder or file

Testing

Additional Notes

Summary by CodeRabbit

New Features
- Answers now use collection/folder/file context derived from the current path, improving relevance.
- Path-based context applied consistently for both streamed and non-streamed responses.
- Model selection now receives explicit model info for more accurate results.
Bug Fixes
- More resilient handling of missing or invalid path context, gracefully falling back to previous behavior.
- Reduced edge-case inconsistencies when selecting knowledge-base items from navigation paths.

…rce to that agent

coderabbitai · 2025-10-01T07:30:31Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Threaded path-derived collection identifiers through the chat/agent pipeline: added extractItemIdsFromPath, introduced PathExtractedInfo, updated multiple function signatures to accept actualModelId and pathExtractedInfo, and replaced prior folderIds/ids usage with path-derived collectionFileIds/collectionFolderIds/collectionIds where applicable.

Changes

Cohort / File(s)	Summary
Agent flow updates `server/api/chat/agents.ts`	Import `extractItemIdsFromPath`; compute `pathExtractedInfo` when a valid pathRefId is present; derive extractedInfo from message context only; call `extractFileIdsFromMessage(message, email)` without `ids`; pass `actualModelId` and `pathExtractedInfo` into `UnderstandMessageAndAnswer` / `UnderstandMessageAndAnswerForGivenContext` across stream and non-stream flows.
Chat core signatures & sourcing `server/api/chat/chat.ts`	Add exported `PathExtractedInfo` type; add optional `pathExtractedInfo` parameter to generators and helpers (`generateIterativeTimeFilterAndQueryRewrite`, `generatePointQueryTimeExpansion`, `generateMetadataQueryAnswer`, `UnderstandMessageAndAnswer`, `UnderstandMessageAndAnswerForGivenContext`); introduce `getCollectionSource(pathExtractedInfo, selectedItems)` to prefer folder → file → collection IDs and thread source selection through existing flows while preserving defaults.
Utils: path ID extraction `server/api/chat/utils.ts`	New `extractItemIdsFromPath(pathRefId)` that queries `collectionItems` and `collections`, returns `{ collectionFileIds, collectionFolderIds, collectionIds }`, prefixes IDs (`clf-`, `clfd-`, `cl-`), logs errors without throwing, and returns empty arrays for null/failed lookups.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant A as agents.ts
  participant UT as utils.extractItemIdsFromPath
  participant C as chat.ts (Understand...)
  participant KB as Knowledge Base

  U->>A: Send message (+ optional pathRefId)
  A->>UT: extractItemIdsFromPath(pathRefId)
  UT-->>A: pathExtractedInfo {collectionFileIds, collectionFolderIds, collectionIds}
  A->>C: UnderstandMessageAndAnswer(message, email, actualModelId, pathExtractedInfo)
  activate C
  Note over C: source = pathExtractedInfo ? (folder→file→collection) : selectedItems[KB]
  C->>KB: Fetch/process docs from chosen source
  KB-->>C: Context chunks / metadata
  C-->>A: Answer (stream or non-stream)
  A-->>U: Final/streamed response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

fix(agentApiPathSupport): integrate RAG for path-based source passing #956 — Modifies chat agents to extract and propagate path-derived collection item IDs into UnderstandMessageAndAnswer/UnderstandMessageAndAnswerForGivenContext.
Add Path support in agent chat api via api-key with JSON body #771 — Implements path-based knowledge retrieval and adjusts message/file-ID extraction and agents entry points.

Suggested reviewers

zereraz
shivamashtikar
devesh-juspay
kalpadhwaryu

Poem

I’m a rabbit in the code-lined glade,
I sniff the path where IDs are laid.
I hand the map to models bright,
They fetch the context, shed some light.
Hooray—answers hop into sight! 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Title Check	⚠️ Warning	The title, while using a conventional commit prefix, is awkwardly phrased and does not clearly communicate the primary change, which is that when a path is provided to the AgentMessageApi the corresponding folder or file is used in place of the knowledge-base integration reference. Its current wording is confusing and fails to convey the intent or impact of the change to reviewers scanning the history.	Please rephrase the title to clearly and concisely state that the PR makes the AgentMessageApi use the folder or file identified by the provided path as the KB integration reference, for example: “fix(agentPathSupport): use folder or file from path as KB integration reference.”
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch fix/agentPathSupport

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2025-10-01T07:30:42Z

Summary of Changes

Hello @naSim087, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where the AgentMessageApi did not correctly interpret and utilize knowledge base items (files, folders, or collections) when they were referenced via a path. The changes introduce a dedicated mechanism to extract and categorize these path-based references, ensuring they are properly integrated into the RAG (Retrieval Augmented Generation) process. This enhances the accuracy and relevance of AI agent responses when specific knowledge base locations are provided.

Highlights

Path-based ID Extraction: Introduced a new utility function, extractItemIdsFromPath, in server/api/chat/utils.ts. This function is responsible for parsing a given pathRefId and querying the database (collectionItems and collections tables) to determine if it corresponds to a knowledge base file, folder, or an entire collection, returning their respective IDs with appropriate prefixes.
AgentMessageApi Refactoring: The AgentMessageApi in server/api/chat/agents.ts has been refactored to leverage the new extractItemIdsFromPath utility. This change separates the logic for extracting IDs from message content versus those derived from a provided path, ensuring that path-referred items are correctly identified and processed.
Knowledge Base Integration Logic Update: Core chat functions, including generateIterativeTimeFilterAndQueryRewrite, generatePointQueryTimeExpansion, and generateMetadataQueryAnswer in server/api/chat/chat.ts, now accept a pathExtractedInfo parameter. This allows these functions to prioritize and utilize the specific knowledge base items (files, folders, or collections) identified by a path, overriding general message-based item selection when a path is provided.
Type Definition for Path Information: A new type, pathExtractedInfo, has been defined in server/api/chat/chat.ts to standardize the structure for passing extracted file, folder, and collection IDs related to a path throughout the chat processing pipeline.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request refactors the AgentMessageApi to support knowledge base paths by introducing a new extractItemIdsFromPath utility and modifying how context is passed to the RAG pipeline. The changes correctly separate path-based context extraction from message-based extraction. My review includes suggestions to improve code quality and maintainability. I've pointed out areas with code duplication that could be refactored into a helper function, inconsistencies in naming conventions and property access that should be standardized, and an opportunity to improve type safety by avoiding the any type. Addressing these points will make the code more robust and easier to maintain.

server/api/chat/chat.ts

server/api/chat/utils.ts

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

server/api/chat/chat.ts (1)

1212-1226: Rename collectionfileIds to collectionFileIds throughout
Update the pathExtractedInfo type to use collectionFileIds (PascalCase) and replace all occurrences of collectionfileIds with collectionFileIds in both server/api/chat/chat.ts and server/api/chat/agents.ts.

🧹 Nitpick comments (2)

server/api/chat/utils.ts (1)

718-752: Add .limit(1) to DB lookups to avoid unnecessary scans

Both selects fetch a single row; limit the query explicitly for performance and clarity.

     const [item] = await db
       .select({ id: collectionItems.id, type: collectionItems.type })
       .from(collectionItems)
       .where(
         and(
           eq(collectionItems.vespaDocId, vespaId),
           isNull(collectionItems.deletedAt),
         ),
       )
+      .limit(1)

     if (item) {
       // ...
     } else {
       // If not found in collectionItems, check collections table
       const [collection] = await db
         .select({ id: collections.id })
         .from(collections)
         .where(
           and(
             eq(collections.vespaDocId, vespaId),
             isNull(collections.deletedAt),
           ),
         )
+        .limit(1)

server/api/chat/agents.ts (1)

3536-3546: Add explicit type annotation to pathExtractedInfo

Import the exported type from utils and annotate to lock in the shape:
-    const pathExtractedInfo = isValidPath
+    // import type { PathExtractedInfo } from "@/api/chat/utils"
+    const pathExtractedInfo: PathExtractedInfo = isValidPath
       ? await extractItemIdsFromPath(ids)
       : { collectionFileIds: [], collectionFolderIds: [], collectionIds: [] }
getRecordBypath returns the Vespa doc ID (string) as expected by extractItemIdsFromPath—no further verification needed.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 096480e and e80edce.

📒 Files selected for processing (3)

server/api/chat/agents.ts (3 hunks)
server/api/chat/chat.ts (11 hunks)
server/api/chat/utils.ts (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (3)

server/api/chat/chat.ts (1)

server/shared/types.ts (1)

Apps (38-38)

server/api/chat/utils.ts (2)

server/db/schema/knowledgeBase.ts (2)

collectionItems (69-140)

collections (20-66)

server/logger/index.ts (2)

getLoggerWithChild (192-200)

Subsystem (15-15)

server/api/chat/agents.ts (2)

server/api/chat/utils.ts (2)

extractFileIdsFromMessage (513-695)

extractItemIdsFromPath (696-767)

server/api/chat/chat.ts (1)

pathExtractedInfo (3832-3836)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

🔇 Additional comments (1)

server/api/chat/agents.ts (1)

124-125: Import addition looks good

extractItemIdsFromPath is correctly imported from ./utils and used below.

server/api/chat/agents.ts

server/api/chat/chat.ts

server/api/chat/utils.ts

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e80edce and 301d246.

📒 Files selected for processing (2)

server/api/chat/agents.ts (3 hunks)
server/api/chat/chat.ts (11 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

server/api/chat/chat.ts (1)

server/shared/types.ts (1)

Apps (38-38)

server/api/chat/agents.ts (2)

server/api/chat/utils.ts (2)

extractFileIdsFromMessage (513-695)

extractItemIdsFromPath (696-767)

server/api/chat/chat.ts (1)

pathExtractedInfo (3832-3836)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

🔇 Additional comments (3)

server/api/chat/agents.ts (3)

124-124: LGTM!

The import of extractItemIdsFromPath is correctly added and aligns with its usage later in the file.

3536-3545: LGTM!

The refactored extraction logic correctly separates concerns:

extractFileIdsFromMessage handles message-embedded context (pills, links, folders) without path info

extractItemIdsFromPath handles path-derived collection context when a valid path is provided

The object literals use the correct field name collectionFileIds (capital F) matching the type definition.

4306-4307: LGTM!

The parameters passed to UnderstandMessageAndAnswer are correct:

actualModelId provides the model configuration

pathExtractedInfo (constructed earlier with correct field names) enables path-based collection filtering

server/api/chat/chat.ts

server/api/chat/utils.ts

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

server/api/chat/utils.ts (1)

761-764: [Duplicate] Logger argument order still inverted.

This issue was flagged in a previous review but not yet fixed. The logger call passes the message before the error, which breaks the project's convention and can drop stack/metadata.

Apply this diff:
     getLoggerWithChild(Subsystem.Chat)().error(
+      error,
       `Error extracting item IDs from pathRefId: ${vespaId}`,
-      error,
     )

server/api/chat/chat.ts (1)

1349-1376: Path-derived KB IDs are still dropped

When pathExtractedInfo is present we now return its raw IDs from getCollectionSource, but we still immediately run them through the legacy cl-/clfd-/clf- prefix parsing. IDs coming from extractItemIdsFromPath do not carry those prefixes, so every branch falls through and we end up pushing zero IDs into agentSpecificCollectionSelections, meaning path-based folder/file selections remain broken. The same regression exists in the duplicated blocks at Line 2607-2634 and Line 3171-3198.

We need to short-circuit the prefix parsing whenever pathExtractedInfo is populated and push its arrays straight into collectionIds / collectionFolderIds / collectionFileIds instead.

         const collectionIds: string[] = []
         const collectionFolderIds: string[] = []
         const collectionFileIds: string[] = []
-        const source = getCollectionSource(pathExtractedInfo, selectedItems)
-
-        for (const itemId of source) {
-          if (itemId.startsWith("cl-")) {
-            // Entire collection - remove cl- prefix
-            collectionIds.push(itemId.replace(/^cl[-_]/, ""))
-          } else if (itemId.startsWith("clfd-")) {
-            // Collection folder - remove clfd- prefix
-            collectionFolderIds.push(itemId.replace(/^clfd[-_]/, ""))
-          } else if (itemId.startsWith("clf-")) {
-            // Collection file - remove clf- prefix
-            collectionFileIds.push(itemId.replace(/^clf[-_]/, ""))
-          }
-        }
+        const hasPathCollections =
+          pathExtractedInfo &&
+          (pathExtractedInfo.collectionFolderIds.length ||
+            pathExtractedInfo.collectionFileIds.length ||
+            pathExtractedInfo.collectionIds.length)
+
+        if (hasPathCollections) {
+          collectionIds.push(...pathExtractedInfo!.collectionIds)
+          collectionFolderIds.push(...pathExtractedInfo!.collectionFolderIds)
+          collectionFileIds.push(...pathExtractedInfo!.collectionFileIds)
+        } else {
+          const source = selectedItems[Apps.KnowledgeBase] || []
+          for (const itemId of source) {
+            if (itemId.startsWith("cl-")) {
+              collectionIds.push(itemId.replace(/^cl[-_]/, ""))
+            } else if (itemId.startsWith("clfd-")) {
+              collectionFolderIds.push(itemId.replace(/^clfd[-_]/, ""))
+            } else if (itemId.startsWith("clf-")) {
+              collectionFileIds.push(itemId.replace(/^clf[-_]/, ""))
+            }
+          }
+        }

Apply the same fix in the corresponding blocks inside generatePointQueryTimeExpansion and generateMetadataQueryAnswer so path-based selections flow all the way through.

🧹 Nitpick comments (1)

server/api/chat/utils.ts (1)
758-758: Empty else block—clarify intent or remove.

Line 758 contains an empty else block. If no action is needed for unrecognized prefixes, consider adding a comment explaining why (e.g., "silently ignore unrecognized prefixes") or remove the block entirely for clarity.

Apply this diff if no action is intended:
       if (collection) {
         collectionIds.push(`cl-${collection.id}`) // Keep the original prefixed ID
       }
-    } else {
     }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 301d246 and fba2af2.

📒 Files selected for processing (3)

server/api/chat/agents.ts (3 hunks)
server/api/chat/chat.ts (11 hunks)
server/api/chat/utils.ts (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (3)

server/api/chat/utils.ts (2)

server/db/schema/knowledgeBase.ts (2)

collectionItems (69-140)

collections (20-66)

server/logger/index.ts (2)

getLoggerWithChild (192-200)

Subsystem (15-15)

server/api/chat/agents.ts (1)

server/api/chat/utils.ts (2)

extractFileIdsFromMessage (513-695)

extractItemIdsFromPath (696-773)

server/api/chat/chat.ts (1)

server/shared/types.ts (1)

Apps (38-38)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

🔇 Additional comments (3)

server/api/chat/agents.ts (2)

3536-3545: Path extraction logic looks correct.

The extraction flow properly separates message-based context (extractedInfo) from path-based context (pathExtractedInfo). The conditional extraction based on isMsgWithContext and isValidPath ensures that each type of context is only extracted when applicable, with appropriate empty defaults when not.

4295-4308: UnderstandMessageAndAnswer signature and PathExtractedInfo casing are correct
The function in server/api/chat/chat.ts accepts modelId? followed by pathExtractedInfo?: PathExtractedInfo, and the PathExtractedInfo type defines collectionFileIds, collectionFolderIds, and collectionIds with proper casing.

server/api/chat/utils.ts (1)

720-757: Verify ID prefixing in utils.ts matches downstream search filters
Path‐extracted IDs are built as clf-<item.id>, clfd-<item.id>, cl-<item.id>, then later stripped back to raw UUIDs before calling the Vespa query layer. Confirm whether Vespa expects:

Vespa document IDs (original vespaDocId, e.g. clf-12345…), in which case use item.vespaDocId when prefixing in utils.ts, or

Raw Postgres UUIDs, in which case drop the clf-/clfd-/cl- prefixes in utils.ts to avoid passing invalid identifiers.

fix(agentPathSupport): path referenced folder are now consider as sou…

e80edce

…rce to that agent

naSim087 requested review from devesh-juspay, junaid-shirur, kalpadhwaryu, shivamashtikar and zereraz as code owners October 1, 2025 07:30

gemini-code-assist bot reviewed Oct 1, 2025

View reviewed changes

server/api/chat/chat.ts Outdated Show resolved Hide resolved

server/api/chat/chat.ts Outdated Show resolved Hide resolved

server/api/chat/chat.ts Outdated Show resolved Hide resolved

server/api/chat/utils.ts Outdated Show resolved Hide resolved

coderabbitai bot reviewed Oct 1, 2025

View reviewed changes

fix(agentPathSupport): resolved comments

301d246

coderabbitai bot reviewed Oct 1, 2025

View reviewed changes

server/api/chat/chat.ts Outdated Show resolved Hide resolved

junaid-shirur reviewed Oct 1, 2025

View reviewed changes

server/api/chat/chat.ts Outdated Show resolved Hide resolved

server/api/chat/chat.ts Outdated Show resolved Hide resolved

server/api/chat/utils.ts Outdated Show resolved Hide resolved

server/api/chat/utils.ts Show resolved Hide resolved

naSim087 added 2 commits October 1, 2025 15:03

fix(agentPathSupport): resolved comments

29cb6e1

fix(agentPathSupport): resolved comments

fba2af2

coderabbitai bot reviewed Oct 1, 2025

View reviewed changes

junaid-shirur approved these changes Oct 1, 2025

View reviewed changes

junaid-shirur merged commit cc74086 into main Oct 1, 2025
3 of 4 checks passed

junaid-shirur deleted the fix/agentPathSupport branch October 1, 2025 10:00

coderabbitai bot mentioned this pull request Oct 16, 2025

Files extract2 #1075

Open

fix(agentPathSupport): path referred folder considered as integration #1021

fix(agentPathSupport): path referred folder considered as integration #1021

Uh oh!

Conversation

naSim087 commented Oct 1, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Testing

Additional Notes

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

gemini-code-assist bot commented Oct 1, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

naSim087 commented Oct 1, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 1, 2025 •

edited

Loading