
Conversation


@junaid-shirur junaid-shirur commented Sep 1, 2025

Description

  • Integrated web search support using the Vertex Gemini model
  • Added a globe icon toggle for enabling/disabling web search

Testing

Additional Notes

Summary by CodeRabbit

  • New Features

    • Optional Web Search in chat: a toggle surfaces live web sources and bracketed citations during and after streaming.
  • Improvements

    • Streaming now progressively shows sources and grounding data with citation updates.
    • Chat rendering and virtualization improved for smoother streaming, better auto-scroll, and follow-up placement.
    • New icons/labels for Web Search, PDFs, and Events.
  • Refactor

    • Provider layer gains multi-backend support (Anthropic & Google).
  • Chores

    • Added Vertex AI client dependency.


coderabbitai bot commented Sep 1, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds optional Web Search across the frontend and server: an enableWebSearch flag flows through routes, hooks, and APIs; the AI providers (Gemini/Vertex) and their types gain web-search tooling, sources, and grounding; chat streaming and citation SSEs surface web sources; search types, icons, and UI are updated; and Vespa file-ID queries are extended.
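
The flag propagation the walkthrough describes can be sketched end to end. A minimal sketch, assuming the flag travels as a query-string value; the function names here are illustrative, not the PR's actual code.

```typescript
// Client side: serialize the boolean flag onto the message-create URL.
function buildMessageCreateUrl(base: string, enableWebSearch: boolean): string {
  const url = new URL("/api/v1/message/create", base)
  url.searchParams.append("enableWebSearch", String(enableWebSearch))
  return url.toString()
}

// Server side: query params arrive as strings, so the flag must be
// parsed back into a boolean before it reaches the provider layer.
function parseWebSearchFlag(raw: string | undefined): boolean {
  if (!raw) return false
  return raw.toLowerCase() === "true"
}
```

The string round trip is why the review comments below discuss boolean parsing in the zod schemas: a raw `"false"` is truthy as a string and must be explicitly compared.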

Changes

  • Frontend: streaming flag propagation — frontend/src/hooks/useChatStream.ts
    Adds an enableWebSearch parameter to useChatStream and startStream, appends enableWebSearch to the message-create URL, and forwards the flag into streaming.
  • Frontend: chat routes and virtualization — frontend/src/routes/_authenticated/chat.tsx, frontend/src/routes/_authenticated/index.tsx
    Adds enableWebSearch and agentId route/query params; robust toolsList parsing; forwards enableWebSearch into chat start and startStream; refactors VirtualizedMessages virtualization, scroll, and rendering.
  • Frontend: UI labels & icons — frontend/src/components/GroupFilter.tsx, frontend/src/lib/common.tsx
    Adds a Web Search name mapping and globe icon; introduces PDF/Event icons, Drive icon tweaks, and imports/wires WebSearchEntity.
  • Server: AI types and providers (core) — server/ai/types.ts, server/ai/provider/gemini.ts, server/ai/provider/vertex_ai.ts, server/ai/provider/index.ts
    Adds webSearch?: boolean to ModelParams; extends ConverseResponse with sources and groundingSupports; Gemini and Vertex support web-search tooling and grounding; VertexAiProvider gains dual-provider support (VertexProvider) and a webSearchQuestion API.
  • Server: chat/search APIs & streaming — server/api/chat/chat.ts, server/api/search.ts, server/api/chat/utils.ts
    Adds enableWebSearch to request schemas and message handling; introduces a web-search streaming branch that accumulates sources/grounding, emits SSE citation updates, and materializes citations into final answers; adds a citation insertion helper.
  • Search domain types & exports — server/search/types.ts, server/shared/types.ts
    Adds Apps.WebSearch and the WebSearchEntity enum and schema; exports WebSearchEntity via shared types; extends entity unions and validation.
  • Search engine integration — server/search/vespa.ts
    Extends HybridDefaultProfileForAgent to accept collectionFileIds, resolves file-level Vespa document IDs, and includes them in YQL where clauses.
  • Package deps — server/package.json
    Adds the @google-cloud/vertexai dependency.
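
The type extensions listed for server/ai/types.ts can be illustrated with a sketch. The exact field names and shapes below are inferred from this summary and may differ from the real definitions.

```typescript
// Illustrative shapes for the new web-search fields; the actual
// definitions live in server/ai/types.ts and may differ in detail.
interface WebSearchSource {
  uri: string
  title?: string
}

interface GroundingSupport {
  // Character offsets into the answer text that the sources support.
  segment: { startIndex: number; endIndex: number }
  // Indices into the sources array backing this segment.
  sourceIndices: number[]
}

interface ModelParams {
  modelId: string
  webSearch?: boolean // opt-in web-search tooling
}

interface ConverseResponse {
  text?: string
  cost?: number
  sources?: WebSearchSource[]
  groundingSupports?: GroundingSupport[]
}

// A grounded response pairing one source with the first six characters.
const sample: ConverseResponse = {
  text: "answer",
  sources: [{ uri: "https://example.com", title: "Example" }],
  groundingSupports: [
    { segment: { startIndex: 0, endIndex: 6 }, sourceIndices: [0] },
  ],
}
```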

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Frontend as Frontend Chat UI
  participant Router as Chat Route
  participant Hook as useChatStream
  participant API as /api/v1/message/create
  participant Server as server/api/chat
  participant Provider as AI Provider (Gemini/Vertex)

  User->>Frontend: Submit question (enableWebSearch?)
  Frontend->>Router: Navigate with query params (enableWebSearch)
  Router->>Hook: startStream(..., enableWebSearch)
  Hook->>API: SSE request ?enableWebSearch=true|false
  API->>Server: route handling (enableWebSearch)
  Server->>Provider: converseStream({ webSearch: flag })
  alt Web Search enabled
    Provider->>Provider: enable web tools + grounding
    Provider-->>Server: chunks + sources + groundingSupports
    Server->>Frontend: SSE chunks + CitationsUpdate events
  else Disabled
    Provider-->>Server: chunks (text only)
    Server-->>Frontend: SSE chunks
  end
  Frontend-->>User: Render response (+ citations)
sequenceDiagram
  autonumber
  participant Index as server/ai/provider/index.ts
  participant Vertex as VertexAiProvider
  note over Index,Vertex: Dynamic Vertex backend selection
  Index->>Vertex: new VertexAiProvider({projectId,region,provider})
  alt provider = GOOGLE
    Vertex->>Vertex: instantiate Google VertexAI client
    Vertex->>Vertex: use Google-specific converse/converseStream (webSearch tooling)
  else provider = ANTHROPIC
    Vertex->>Vertex: instantiate AnthropicVertex client
    Vertex->>Vertex: use Anthropic-specific converse/converseStream
  end
  Index->>Vertex: webSearchQuestion(..., webSearch=true)
  Vertex-->>Index: stream text + sources + groundingSupports
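
The backend selection in the second diagram can be sketched as follows. The enum values mirror the diagram; the lookup-with-fallback behavior is an assumption based on the review notes further down, which mention a silent fallback to ANTHROPIC for unknown values.

```typescript
// Sketch of dual-backend selection for VertexAiProvider.
enum VertexProvider {
  ANTHROPIC = "ANTHROPIC",
  GOOGLE = "GOOGLE",
}

function resolveVertexProvider(raw: string | undefined): VertexProvider {
  // hasOwnProperty avoids matching inherited keys like "toString".
  if (raw && Object.prototype.hasOwnProperty.call(VertexProvider, raw)) {
    return VertexProvider[raw as keyof typeof VertexProvider]
  }
  // Unknown or unset values fall back to the Anthropic backend.
  return VertexProvider.ANTHROPIC
}
```

A later review comment suggests logging a warning on this fallback path so a misconfigured VERTEX_PROVIDER env var is visible to operators.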

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • zereraz
  • shivamashtikar
  • kalpadhwaryu

Poem

I twitch my whiskers, comb the net, 🌐
I hop through links and footnote set.
I fetch the sources, tuck each cite,
I stream the facts into the night.
Thump — a rabbit brought the truth to light. 🐇✨


@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @junaid-shirur, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the chat functionality by integrating real-time web search capabilities. The primary goal is to improve the accuracy and recency of AI responses by allowing the model to access up-to-date information from the web. The changes involve updates to how chat messages are processed, how AI models are configured, and how search results are presented to the user.

Highlights

  • Web Search Integration: Introduced web search capabilities leveraging the Vertex Gemini model, allowing the AI to fetch real-time information to answer user queries. This includes adding an enableWebSearch parameter across various frontend and backend functions and API calls.
  • Frontend UI Enhancements: Updated the chat interface to support web search. This involves adding a new WebSearchEntity and Apps.WebSearch for proper display of web search sources and a dedicated globe icon to represent web search results.
  • Backend AI Provider Updates: Modified GeminiAIProvider and VertexAiProvider to incorporate googleSearch tools within their model configurations when web search is enabled. The VertexAiProvider now supports both Anthropic and Google Vertex backends, with specific implementations for each.
  • Advanced Citation Handling: Implemented new logic in the backend to process and integrate web search citations and grounding supports from Gemini/Vertex AI responses directly into the chat UI, providing users with verifiable sources for AI-generated answers.
  • Type Definition Extensions: Extended core type definitions (ModelParams, ConverseResponse) to include new properties like webSearch, sources, and groundingSupports, ensuring robust data handling for the new web search feature.
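
The "Advanced Citation Handling" highlight can be made concrete with a sketch: given grounding supports with character offsets, insert bracketed citation markers at the end of each supported segment. The helper name and the exact insertion strategy are illustrative assumptions, not the PR's actual implementation.

```typescript
interface GroundingSupport {
  // Assumed: startIndex inclusive, endIndex exclusive, into the answer text.
  segment: { startIndex: number; endIndex: number }
  sourceIndices: number[] // 0-based indices into the sources list
}

// Insert "[n]" markers working backwards from the end of the text,
// so earlier offsets remain valid as the string grows.
function insertCitations(answer: string, supports: GroundingSupport[]): string {
  const sorted = [...supports].sort(
    (a, b) => b.segment.endIndex - a.segment.endIndex,
  )
  let result = answer
  for (const s of sorted) {
    const marks = s.sourceIndices.map((i) => `[${i + 1}]`).join("")
    result =
      result.slice(0, s.segment.endIndex) +
      marks +
      result.slice(s.segment.endIndex)
  }
  return result
}
```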

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a web search feature powered by the Vertex Gemini model. The changes span both the frontend and backend, adding a toggle for web search in the UI and routing requests to the appropriate model on the backend. The VertexAiProvider has been significantly refactored to support multiple underlying providers (Anthropic and Google), which is a great improvement for flexibility. My feedback includes a critical fix for a logic issue in the frontend routing parameters, along with several suggestions to improve type safety and performance on the backend.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
server/search/vespa.ts (1)

846-899: Escape YQL inputs and dedupe IDs to prevent injection and reduce bloat

  • collectionIds/collectionFolderIds/collectionFileIds are interpolated directly into YQL without escaping; this allows quote-breaking and potential YQL injection if any ID contains '. Use the existing escapeYqlValue helper.
  • Also dedupe/trim IDs to avoid redundant OR clauses.

Apply:

-  const buildCollectionFileYQL = async () => {
+  const buildCollectionFileYQL = async () => {
     // Extract all IDs from the key-value pairs
-    const collectionIds: string[] = []
-    const collectionFolderIds: string[] = []
-    const collectionFileIds: string[] = []
+    const collectionIds: string[] = []
+    const collectionFolderIds: string[] = []
+    const collectionFileIds: string[] = []

     for (const selection of collectionSelections) {
       if (selection.collectionIds) {
         collectionIds.push(...selection.collectionIds)
       }
       if (selection.collectionFolderIds) {
         collectionFolderIds.push(...selection.collectionFolderIds)
       }
       if (selection.collectionFileIds) {
         collectionFileIds.push(...selection.collectionFileIds)
       }
     }
-    let conditions: string[] = []
+    // Normalize and dedupe
+    const uniq = (arr: string[]) =>
+      Array.from(new Set(arr.filter(Boolean).map((s) => s.trim())))
+    const normCollectionIds = uniq(collectionIds)
+    const normFolderIds = uniq(collectionFolderIds)
+    const normFileIds = uniq(collectionFileIds)
+    const conditions: string[] = []

     // Handle entire collections - use clId filter (efficient)
-    if (collectionIds.length > 0) {
-      const collectionCondition = `(${collectionIds.map((id: string) => `clId contains '${id.trim()}'`).join(" or ")})`
+    if (normCollectionIds.length > 0) {
+      const collectionCondition = `(${normCollectionIds
+        .map((id: string) => `clId contains '${escapeYqlValue(id)}'`)
+        .join(" or ")})`
       conditions.push(collectionCondition)
     }

     // Handle specific folders - need to get file IDs (less efficient but necessary)
-    if (collectionFolderIds.length > 0) {
-      const clFileIds = await getAllFolderItems(collectionFolderIds, db)
+    if (normFolderIds.length > 0) {
+      const clFileIds = await getAllFolderItems(normFolderIds, db)
       if (clFileIds.length > 0) {
         const ids = await getCollectionFilesVespaIds(clFileIds, db)
-        const clVespaIds = ids
-          .filter((item: any) => item.vespaDocId !== null)
-          .map((item: any) => item.vespaDocId!)
+        const clVespaIds = Array.from(
+          new Set(
+            ids
+              .filter((item: any) => item.vespaDocId != null)
+              .map((item: any) => String(item.vespaDocId))
+          )
+        )

         if (clVespaIds.length > 0) {
-          const folderCondition = `(${clVespaIds.map((id: string) => `docId contains '${id.trim()}'`).join(" or ")})`
+          const folderCondition = `(${clVespaIds
+            .map((id: string) => `docId contains '${escapeYqlValue(id)}'`)
+            .join(" or ")})`
           conditions.push(folderCondition)
         }
       }
     }

     // Handle specific files - use file IDs directly (most efficient for individual files)
-    if (collectionFileIds.length > 0) {
-      const ids = await getCollectionFilesVespaIds(collectionFileIds, db)
-      const clVespaIds = ids
-        .filter((item: any) => item.vespaDocId !== null)
-        .map((item: any) => item.vespaDocId!)
+    if (normFileIds.length > 0) {
+      const ids = await getCollectionFilesVespaIds(normFileIds, db)
+      const clVespaIds = Array.from(
+        new Set(
+          ids
+            .filter((item: any) => item.vespaDocId != null)
+            .map((item: any) => String(item.vespaDocId))
+        )
+      )

       if (clVespaIds.length > 0) {
-        const fileCondition = `(${clVespaIds.map((id: string) => `docId contains '${id.trim()}'`).join(" or ")})`
+        const fileCondition = `(${clVespaIds
+          .map((id: string) => `docId contains '${escapeYqlValue(id)}'`)
+          .join(" or ")})`
         conditions.push(fileCondition)
       }
     }

Optional follow-ups:

  • If selections can be very large, chunk OR clauses (e.g., 500 IDs per group) to avoid query-size limits.
  • Consider try/catch around DB lookups to degrade gracefully (log and skip this subquery) instead of failing the whole agent search.
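
The escaping and dedupe the comment relies on can be sketched as below. The repository's actual escapeYqlValue may handle more cases; this is a plausible minimal version, assuming backslash-escaping of quotes inside single-quoted YQL literals.

```typescript
// Escape backslashes first, then single quotes, so a malicious ID
// cannot break out of the quoted YQL literal.
function escapeYqlValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/'/g, "\\'")
}

// Trim, drop empties, and dedupe to avoid redundant OR clauses.
function uniq(arr: string[]): string[] {
  return Array.from(new Set(arr.filter(Boolean).map((s) => s.trim())))
}

// Build one parenthesized OR group for a field over a list of IDs.
function buildOrClause(field: string, ids: string[]): string {
  return `(${uniq(ids)
    .map((id) => `${field} contains '${escapeYqlValue(id)}'`)
    .join(" or ")})`
}
```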
frontend/src/routes/_authenticated/index.tsx (1)

223-241: Fix enableWebSearch propagation, schema, and ChatBox support

  • In frontend/src/routes/_authenticated/index.tsx (around lines 223–241, 247–253, 267–270), change

    - if (enableWebSearch) {
    -   searchParams.enableWebSearch = enableWebSearch
    - }
    + if (enableWebSearch !== undefined) {
    +   searchParams.enableWebSearch = enableWebSearch
    + }

    so false values are included.

  • In frontend/src/routes/_authenticated/chat.tsx (lines 2793–2801), replace the string‐based schema

    enableWebSearch: z
      .string()
      .transform(val => val === "false")
      .optional()
      .default("false")

    with a boolean schema, e.g.:

    enableWebSearch: z.boolean().optional().default(false)
  • In frontend/src/components/ChatBox.tsx (props at ~lines 132–138 and call at ~1729–1732), extend handleSend to include the enableWebSearch parameter and forward it in handleSendMessage().

frontend/src/routes/_authenticated/chat.tsx (1)

2518-2526: Sanitize after JSON-to-HTML conversion to avoid breaking JSON.

Currently you sanitize the JSON string before JSON.parse, which can corrupt JSON payloads. Build the HTML first, then sanitize that HTML.

-              __html: jsonToHtmlMessage(DOMPurify.sanitize(message)),
+              __html: DOMPurify.sanitize(jsonToHtmlMessage(message)),
server/ai/provider/vertex_ai.ts (2)

29-66: Image part shape is Anthropic-style; Google Vertex AI expects inlineData parts.

buildVertexAIImageParts() returns { type: "image", source: { ... } }, which fits Anthropic but not Vertex AI. In the Google paths you pass these parts to sendMessage/sendMessageStream, which expect Vertex AI “Part” objects, e.g. { inlineData: { mimeType, data } }. This will break image-grounded prompts in the Google provider.

  • Split builders: one for Anthropic (current shape), one for Google Vertex AI.
  • Use the correct builder in each path.
-const buildVertexAIImageParts = async (imagePaths: string[]) => {
+const buildAnthropicImageParts = async (imagePaths: string[]) => {
   // ... current implementation unchanged ...
 }
+
+const buildGoogleVertexImageParts = async (imagePaths: string[]) => {
+  const baseDir = path.resolve(process.env.IMAGE_DIR || "downloads/xyne_images_db")
+  const imagePromises = imagePaths.map(async (imgPath) => {
+    const match = imgPath.match(/^(.+)_([0-9]+)$/)
+    if (!match) throw new Error(`Invalid image path: ${imgPath}`)
+    const docId = match[1]
+    const imageDir = path.join(baseDir, docId)
+    const absolutePath = findImageByName(imageDir, match[2])
+    const ext = path.extname(absolutePath).toLowerCase()
+    const mimeMap: Record<string, string> = { ".png": "image/png", ".jpg": "image/jpeg", ".jpeg": "image/jpeg", ".webp": "image/webp" }
+    const mimeType = mimeMap[ext]
+    if (!mimeType) return null
+    try {
+      await fs.promises.access(absolutePath, fs.constants.F_OK)
+      const imgBuffer = await fs.promises.readFile(absolutePath)
+      if (imgBuffer.length > MAX_IMAGE_SIZE_BYTES) return null
+      const base64 = imgBuffer.toString("base64")
+      return { inlineData: { mimeType, data: base64 } }
+    } catch (err) {
+      Logger.error(`Failed to read image: ${absolutePath}`)
+      return null
+    }
+  })
+  const results = await Promise.all(imagePromises)
+  return results.filter(Boolean)
+}

Use the appropriate builder:

-// Anthropic paths
-const imageParts = params.imageFileNames?.length ? await buildVertexAIImageParts(params.imageFileNames) : []
+const imageParts = params.imageFileNames?.length ? await buildAnthropicImageParts(params.imageFileNames) : []

-// Google paths
-const imageParts = params.imageFileNames?.length ? await buildVertexAIImageParts(params.imageFileNames) : []
+const imageParts = params.imageFileNames?.length ? await buildGoogleVertexImageParts(params.imageFileNames) : []

Also ensure createLabeledImageContent can accept Vertex-style parts (it currently pushes parts verbatim).

Also applies to: 209-312, 313-455


121-124: Align imageFileNames format between producers and consumers.

extractImageFileNames and createLabeledImageContent generate/expect docIndex_docId_imageNumber, but buildVertexAIImageParts (and other build*ImageParts) only parse docId_imageNumber via /^(.+)_([0-9]+)$/. Update those loaders to use /^(\d+)_(.+)_(\d+)$/, then set docId = match[2] and imageNumber = match[3] (or strip the leading docIndex_ before parsing) so images load and label correctly.

🧹 Nitpick comments (16)
server/api/search.ts (1)

169-175: Unify boolean parsing and verify retry parity

Parsing looks fine. To avoid drift with other flags (e.g., isReasoningEnabled), factor a shared boolean parser and reuse it here. Also, if retries should preserve web search behavior, mirror this flag in messageRetrySchema.

Apply locally within this file:

+const booleanString = z
+  .string()
+  .optional()
+  .transform((val) => !!val && val.toLowerCase() === "true")

 export const messageSchema = z.object({
   ...
-  enableWebSearch: z
-    .string()
-    .optional()
-    .transform((val) => {
-      if (!val) return false
-      return val.toLowerCase() === "true"
-    }),
+  enableWebSearch: booleanString,
   ...
 })

If retries must honor the same switch (optional—confirm server behavior):

 export const messageRetrySchema = z.object({
   messageId: z.string().min(1),
   agentId: z.string().optional(),
   agentic: z.string().optional().default("false"),
   isReasoningEnabled: z
     .string()
     .optional()
     .transform((val) => {
       if (!val) return false
       return val.toLowerCase() === "true"
     }),
+  enableWebSearch: booleanString.optional(),
 })
frontend/src/hooks/useChatStream.ts (2)

181-181: Reasoning text growth nit

Appending “\n” per step improves readability but can balloon memory for long runs. Consider bounding length or storing steps as an array and joining on render.


283-284: Only send enableWebSearch when true

Saves bytes and keeps defaults server-side.

-url.searchParams.append("enableWebSearch", enableWebSearch.toString())
+if (enableWebSearch) {
+  url.searchParams.append("enableWebSearch", "true")
+}
server/ai/types.ts (2)

83-84: Name consistency: webSearch vs enableWebSearch

Request/URL uses enableWebSearch; ModelParams uses webSearch. Align names or document the mapping at the API boundary to reduce confusion.


95-109: WebSearchSource/GroundingSupport: clarify index semantics

Consider documenting whether segment indices are UTF-16 code units and whether endIndex is exclusive. If client Citation uses url, either map uri → url server-side or rename for consistency.
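
The index-semantics question matters in practice: JavaScript string indices count UTF-16 code units, so an astral character like an emoji occupies two units. A small sketch, assuming indices are UTF-16 code units and endIndex is exclusive:

```typescript
// Slice a grounded segment out of the answer text.
// String.prototype.slice already operates on UTF-16 code units
// with an exclusive end index, matching the assumed semantics.
function segmentText(text: string, startIndex: number, endIndex: number): string {
  return text.slice(startIndex, endIndex)
}

// The globe emoji is a single code point but two UTF-16 code units,
// so "web" starts at index 3, not 2.
const text = "🌐 web"
```

If the provider instead reported code-point or byte offsets, the same indices would land mid-character, which is why the review asks for the semantics to be documented.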

server/ai/provider/index.ts (1)

281-296: Env-driven Vertex backend init: log enum label and guard invalid env value

  • Logging ${provider} prints the numeric enum; log the label for readability.
  • If VERTEX_PROVIDER is set to an unknown value, you silently fall back to ANTHROPIC. Log a warning to aid ops.
-    const provider =
-      vertexProviderType && VertexProvider[vertexProviderType]
-        ? VertexProvider[vertexProviderType]
-        : VertexProvider.ANTHROPIC
+    const provider =
+      vertexProviderType && VertexProvider[vertexProviderType]
+        ? VertexProvider[vertexProviderType]
+        : VertexProvider.ANTHROPIC
+    if (!VertexProvider[vertexProviderType]) {
+      Logger.warn(
+        `VERTEX_PROVIDER='${vertexProviderType}' not recognized. Falling back to 'ANTHROPIC'.`,
+      )
+    }
...
-    Logger.info(`Initialized VertexAI provider with ${provider} backend`)
+    Logger.info(
+      `Initialized VertexAI provider with ${
+        VertexProvider[provider]
+      } backend`,
+    )
server/ai/provider/gemini.ts (2)

135-141: Type the tools array to avoid implicit any and future regressions

Make the tools’ type explicit to satisfy TS and make schema changes obvious.

-      const tools = []
+      const tools: Array<{ googleSearch: Record<string, never> }> = []
...
-          // Add tools configuration for web search
-          tools: tools.length > 0 ? tools : undefined,
+          tools: tools.length > 0 ? tools : undefined,

Apply the same typing in converseStream.

-      const tools = []
+      const tools: Array<{ googleSearch: Record<string, never> }> = []

Also applies to: 148-150, 249-255, 262-262


231-396: Consider emitting token usage in metadata for cost/telemetry

Downstream accumulates token usage from chunk.metadata.usage; Gemini code doesn’t attach usage, so UI/metrics miss tokens.

-      const text = response.text
+      const text = response.text
+      // If available from SDK in future, attach usage:
+      // const usage = response.usage ?? undefined
+      // const metadata = usage ? { usage } : undefined
...
-      return { text, cost, sources, groundingSupports }
+      return { text, cost, sources, groundingSupports /*, metadata*/ }
server/api/chat/chat.ts (3)

4647-4655: Token metrics: support both usage shapes

Vertex provider emits metadata.inputTokens/outputTokens (not metadata.usage). Capture both to keep analytics intact.

-                if (chunk.metadata?.usage) {
-                  tokenArr.push({
-                    inputTokens: chunk.metadata.usage.inputTokens || 0,
-                    outputTokens: chunk.metadata.usage.outputTokens || 0,
-                  })
-                }
+                const usage =
+                  chunk.metadata?.usage ??
+                  (chunk.metadata &&
+                  typeof chunk.metadata.inputTokens === "number" &&
+                  typeof chunk.metadata.outputTokens === "number"
+                    ? {
+                        inputTokens: chunk.metadata.inputTokens,
+                        outputTokens: chunk.metadata.outputTokens,
+                      }
+                    : null)
+                if (usage) tokenArr.push(usage)

4660-4686: Stream the citation brackets delta so UI matches stored answer

You append “[n]” post-stream but never send the delta to the client. Emit one final ResponseUpdate with only the appended part to keep UI and DB in sync.

-              if (citationResult) {
-                answer = citationResult.updatedAnswer
+              if (citationResult) {
+                const prev = answer
+                answer = citationResult.updatedAnswer
                 sourceIndex = citationResult.updatedSourceIndex
...
-                if (citationResult.newCitations.length > 0) {
+                if (citationResult.newCitations.length > 0) {
                   citations.push(...citationResult.newCitations)
                   Object.assign(citationMap, citationResult.newCitationMap)
 
                   stream.writeSSE({
                     event: ChatSSEvents.CitationsUpdate,
                     data: JSON.stringify({
                       contextChunks: citations,
                       citationMap: citationMap,
                     }),
                   })
+                  // Send the brackets delta appended to the text
+                  const delta = answer.slice(prev.length)
+                  if (delta) {
+                    await stream.writeSSE({
+                      event: ChatSSEvents.ResponseUpdate,
+                      data: delta,
+                    })
+                  }
                 }
               }

223-223: Type coherence for GroundingSupport

You import GroundingSupport from @google/genai here, but providers surface ai/types.GroundingSupport. Prefer a single internal type (ai/types) throughout to avoid drift.

Would you like me to convert chat.ts to use the internal type and add a narrow adapter at provider boundaries?

Also applies to: 3866-3936

frontend/src/routes/_authenticated/chat.tsx (2)

721-733: Clear enableWebSearch from URL after processing deep-link.

You’re clearing other params but not enableWebSearch; it sticks in the address bar and re-triggers on refresh.

       router.navigate({
         to: "/chat",
         search: (prev) => ({
           ...prev,
           q: undefined,
           reasoning: undefined,
           sources: undefined,
           agentId: undefined, // Clear agentId from URL after processing
           toolsList: undefined, // Clear toolsList from URL after processing
           metadata: undefined, // Clear metadata from URL after processing
+          enableWebSearch: undefined, // Clear web search toggle
         }),
         replace: true,
       })

2193-2216: Auto-scroll toggling misses “scrolled down but not bottom” case.

You only set userHasScrolled when scrolling up. If a user scrolls down slightly and pauses above bottom, new tokens will yank them to bottom. Use “not at bottom” as the criterion.

-        if (isAtBottom) {
-          // User is at bottom, allow auto-scroll
-          setUserHasScrolled(false)
-        } else if (scrollTop < lastScrollTop.current) {
-          // User scrolled up, disable auto-scroll
-          setUserHasScrolled(true)
-        }
+        // Disable auto-scroll whenever user is not at bottom; re-enable at bottom
+        setUserHasScrolled(!isAtBottom)
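
The "at bottom" check behind this suggestion can be sketched as follows; the 32px tolerance is an arbitrary illustrative choice, not the PR's actual threshold.

```typescript
// A scroll position counts as "at bottom" when the remaining
// scrollable distance is within a small tolerance.
function isAtBottom(
  scrollTop: number,
  clientHeight: number,
  scrollHeight: number,
  threshold = 32,
): boolean {
  return scrollHeight - scrollTop - clientHeight <= threshold
}
```

With `setUserHasScrolled(!isAtBottom(...))`, auto-scroll re-engages only when the user returns to the bottom, and pausing anywhere above it leaves their position untouched.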
server/ai/provider/vertex_ai.ts (3)

221-279: Non-image code blocks are fine, but image-less branch assumes all blocks have text.

allBlocks.map((block) => ({ text: block.text })) will emit { text: undefined } for non-text blocks. Guard for "text" in block.

-      } else {
-        // otherwise just pass along the raw blocks
-        messageParts = allBlocks.map((block) => ({ text: block.text }))
-      }
+      } else {
+        // otherwise just pass along the raw blocks (text-only)
+        messageParts = allBlocks.filter((b: any) => "text" in b).map((b: any) => ({ text: b.text }))
+      }

Also applies to: 368-385


471-474: Fix stray quote and spacing in injected Anthropic text.

There’s an extra " and missing space: "image(s)as". Clean it up.

-              text: `You may receive image(s)as part of the conversation. If images are attached, treat them as essential context for the user's question.\n\n"
-              ${userText}`,
+              text: `You may receive image(s) as part of the conversation. If images are attached, treat them as essential context for the user's question.\n\n${userText}`,

293-307: Streaming: consider emitting a final summary chunk with metadata.

Google streaming never yields a final empty chunk with cost/usage like the Anthropic path. For consistency, emit a final chunk (even with empty text) including metadata/cost.

       for await (const chunk of result.stream) {
         // ... existing per-chunk yields ...
       }
+      // Final yield to surface accumulated sources/supports without text
+      yield {
+        text: "",
+        cost: 0,
+        sources: accumulatedSources.length ? accumulatedSources : undefined,
+        groundingSupports: accumulatedGroundingSupports.length ? accumulatedGroundingSupports : undefined,
+        metadata: { model: modelParams.modelId, responseTime: Date.now() },
+      }

Also applies to: 391-449

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 53cb1fd and d2e44d4.

📒 Files selected for processing (14)
  • frontend/src/components/GroupFilter.tsx (2 hunks)
  • frontend/src/hooks/useChatStream.ts (5 hunks)
  • frontend/src/lib/common.tsx (3 hunks)
  • frontend/src/routes/_authenticated/chat.tsx (7 hunks)
  • frontend/src/routes/_authenticated/index.tsx (5 hunks)
  • server/ai/provider/gemini.ts (9 hunks)
  • server/ai/provider/index.ts (3 hunks)
  • server/ai/provider/vertex_ai.ts (6 hunks)
  • server/ai/types.ts (1 hunks)
  • server/api/chat/chat.ts (8 hunks)
  • server/api/search.ts (1 hunks)
  • server/search/types.ts (6 hunks)
  • server/search/vespa.ts (2 hunks)
  • server/shared/types.ts (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (9)
server/search/types.ts (1)
server/shared/types.ts (3)
  • WebSearchEntity (36-36)
  • SystemEntity (33-33)
  • DataSourceEntity (35-35)
server/ai/provider/gemini.ts (1)
server/ai/types.ts (2)
  • WebSearchSource (95-99)
  • GroundingSupport (101-108)
server/ai/provider/index.ts (3)
server/ai/provider/vertex_ai.ts (1)
  • VertexAiProvider (67-485)
server/ai/types.ts (2)
  • ModelParams (69-84)
  • ConverseResponse (86-93)
server/ai/agentPrompts.ts (1)
  • agentSearchQueryPrompt (1026-1346)
frontend/src/lib/common.tsx (1)
server/shared/types.ts (3)
  • Apps (31-31)
  • SystemEntity (33-33)
  • WebSearchEntity (36-36)
frontend/src/components/GroupFilter.tsx (1)
server/shared/types.ts (2)
  • Apps (31-31)
  • WebSearchEntity (36-36)
frontend/src/routes/_authenticated/chat.tsx (2)
frontend/src/types.ts (1)
  • ToolsListItem (58-61)
frontend/src/components/FollowUpQuestions.tsx (1)
  • FollowUpQuestions (12-147)
server/ai/provider/vertex_ai.ts (2)
server/ai/types.ts (3)
  • ModelParams (69-84)
  • ConverseResponse (86-93)
  • WebSearchSource (95-99)
server/ai/utils.ts (1)
  • createLabeledImageContent (1-33)
server/shared/types.ts (1)
server/search/types.ts (2)
  • KbItemsSchema (25-25)
  • scoredChunk (244-248)
server/api/chat/chat.ts (2)
server/ai/provider/index.ts (3)
  • webSearchQuestion (1793-1834)
  • generateSearchQueryOrAnswerFromConversation (1457-1495)
  • jsonParseLLMOutput (501-628)
server/ai/types.ts (2)
  • WebSearchSource (95-99)
  • GroundingSupport (101-108)
🪛 GitHub Actions: TypeScript Build Check
server/ai/provider/vertex_ai.ts

[error] 6-6: bunx tsc -b failed with TS2307: Cannot find module '@google-cloud/vertexai' or its corresponding type declarations.

🔇 Additional comments (27)
frontend/src/hooks/useChatStream.ts (3)

238-239: startStream API extended with enableWebSearch

Signature change looks good and matches downstream usage.


627-628: Hook surface extended

Exposing enableWebSearch here is consistent with startStream. LGTM.


643-644: Confirm retry handling of enableWebSearch

The /api/v1/message/retry path and its messageRetrySchema don’t accept or forward enableWebSearch, so retried messages won’t run web searches. Either add enableWebSearch to the retry URL in useChatStream.ts and to messageRetrySchema, or confirm that web search should remain disabled on retries.
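If you do thread it through, a minimal sketch of the URL building (helper name is hypothetical; only the /api/v1/message/retry path and the enableWebSearch param name come from this PR):

```typescript
// Hypothetical helper mirroring how useChatStream.ts could build the retry URL;
// the route shape is taken from the review above, the function name is ours.
function buildRetryUrl(messageId: string, enableWebSearch?: boolean): string {
  const params = new URLSearchParams({ messageId })
  if (enableWebSearch) {
    // Forward the toggle so retried messages can also run web search
    params.append("enableWebSearch", "true")
  }
  return `/api/v1/message/retry?${params.toString()}`
}
```

messageRetrySchema would then need a matching optional field so the server accepts (and forwards) the new query param.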

server/ai/types.ts (1)

91-93: New response fields: ensure SSE wiring to UI types

sources and groundingSupports are added here. Verify server-side SSE maps these to front-end Citation/ImageCitation expectations and that citation indices align with rendered text.
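As a quick shape check, here is a sketch of the source→citation mapping the SSE layer should perform (interfaces are inlined for self-containment; the real code uses the types from server/ai/types.ts and shared/types.ts, and the app/entity enum fields are omitted here):

```typescript
interface WebSearchSource {
  uri: string
  title: string
}

interface WebCitation {
  docId: string
  title: string
  url: string
}

// Map grounding sources to citations with stable docIds so the bracketed
// [n] indices inserted into the streamed text line up with the citation list.
function toWebCitations(
  sources: WebSearchSource[],
  startIndex = 0,
): WebCitation[] {
  return sources.map((s, i) => ({
    docId: `websearch_${startIndex + i}`,
    title: s.title,
    url: s.uri,
  }))
}
```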

server/search/types.ts (7)

63-63: Apps.WebSearch addition

Enum extension looks correct.


92-96: Entity validation updated

Including WebSearchEntity in isValidEntity is correct.


176-180: KB entity comment-only tweaks

No functional change. Fine.


210-213: Introduce WebSearchEntity

Good separation for web search.


216-216: Schema export added

Looks good.


227-227: entitySchema union updated

Correctly included.


240-240: Entity type union updated

Type surface aligned.

frontend/src/components/GroupFilter.tsx (1)

116-118: Add label for Web Search — looks good.

Mapping Apps.WebSearch + WebSearchEntity.WebSearch to "Web Search" integrates cleanly with getName().

server/shared/types.ts (5)

36-37: Re-exporting WebSearchEntity from shared/types is correct.

This unblocks frontend imports via "shared/types".


265-286: KB file response schema consolidation — validate downstream assumptions.

Combining extends and adding chunk/chunkIndex/chunks_summary/relevance/matchfeatures/rankfeatures is fine. Please confirm consumers (API responses, UI mappers) handle these optional fields and the type: KbItemsSchema literal.


482-482: Minor literal style change is benign.

status union using double-quoted literals is a no-op for TS. All good.


505-516: Interface line-wrap changes only.

No functional/type surface changes detected.


530-546: Interface additions retain compatibility.

These keep the enhanced reasoning types coherent. No issues.

frontend/src/routes/_authenticated/index.tsx (1)

155-157: Persisting agentId from URL — OK.

Keeps agent selection stable across refresh/navigation.

frontend/src/lib/common.tsx (1)

145-146: Web Search globe icon — good addition.

Icon matches the new entity mapping and sizing pattern.

server/ai/provider/gemini.ts (2)

9-15: Type surface alignment with ai/types

Importing WebSearchSource and GroundingSupport from ../types is good and keeps the provider’s output aligned with ConverseResponse.


197-225: Grounding sources and supports extraction looks correct

Transforms groundingMetadata into first-class sources and supports and returns them with the response.

server/api/chat/chat.ts (1)

4498-4529: Preserve conversation history in webSearchQuestion
Your call at server/api/chat/chat.ts:4504 passes messages: llmFormattedMessages, but the webSearchQuestion provider currently ignores this and rebuilds history. Verify the implementation appends the base message to params.messages so context isn’t lost.
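For reference, preserving history just means appending the new turn to params.messages rather than replacing them; a sketch under assumed message shapes (the Turn type here is illustrative, not the provider's actual type):

```typescript
type Turn = { role: "user" | "assistant"; content: { text: string }[] }

// Append the fresh user query after any prior turns so the model keeps
// conversation context when web search is enabled.
function withHistory(history: Turn[] | undefined, query: string): Turn[] {
  const base: Turn = { role: "user", content: [{ text: query }] }
  return [...(history ?? []), base]
}
```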

frontend/src/routes/_authenticated/chat.tsx (4)

717-718: Good: flag plumbed into send path.

Passing chatParams.enableWebSearch through to handleSend ensures deep-linking via URL can toggle web search.


744-751: Good: handleSend and startStream accept web-search flag.

The parameter threading is correct and backward-compatible.

Also applies to: 776-786


389-411: VirtualizedMessages integration looks solid.

  • Including currentResp in allItems prevents layout jumps.
  • measureElement + absolute positioning is correct with react-virtual.
  • Keys and memoization look safe.

Also applies to: 2093-2415


641-656: Feedback map extraction tolerant to legacy/new formats.

This compatibility layer is tidy and side-effect free.

server/ai/provider/vertex_ai.ts (1)

121-128: Anthropic paths use the right client; token accounting placeholder is acceptable.

The flow looks correct; cost accounting can be added later.

Also applies to: 156-165


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
server/api/chat/chat.ts (4)

3866-3873: Nice: helper is now strongly typed

processWebSearchCitations uses WebSearchSource[] and GroundingSupport[]. Good improvement over any[].


3885-3929: Fix citation insertion index-shift (insert right-to-left or track offset)

Inserting at segment.endIndex mutates string length and invalidates subsequent indices. Sort supports by endIndex desc (or track an offset).

-    for (const support of finalGroundingSupports) {
-      const segment = support.segment
+    // Insert from right to left to avoid shifting downstream indices
+    const supportsSorted = [...finalGroundingSupports]
+      .filter(
+        (s) =>
+          s?.segment &&
+          Number.isFinite((s.segment as any).endIndex as number),
+      )
+      .sort((a, b) => b.segment.endIndex - a.segment.endIndex)
+
+    for (const support of supportsSorted) {
+      const { endIndex } = support.segment
       const groundingChunkIndices = support.groundingChunkIndices || []
 
       let citationText = ""
       for (const chunkIndex of groundingChunkIndices) {
         if (allSources[chunkIndex]) {
           const source = allSources[chunkIndex]
 
           let citationIndex: number
           if (urlToIndexMap.has(source.uri)) {
             // Reuse existing citation index
             citationIndex = urlToIndexMap.get(source.uri)!
           } else {
             citationIndex = sourceIndex
             const webSearchCitation: Citation = {
               docId: `websearch_${sourceIndex}`,
               title: source.title,
               url: source.uri,
               app: Apps.WebSearch,
               entity: WebSearchEntity.WebSearch,
             }
 
             newCitations.push(webSearchCitation)
-            newCitationMap[sourceIndex] =
-              citations.length + newCitations.length - 1
-            urlToIndexMap.set(source.uri, sourceIndex)
+            newCitationMap[citationIndex] =
+              citations.length + newCitations.length - 1
+            urlToIndexMap.set(source.uri, citationIndex)
             sourceIndex++
           }
 
           citationText += ` [${citationIndex}]`
         }
       }
 
-      if (
-        citationText &&
-        segment?.endIndex !== undefined &&
-        segment.endIndex <= answerWithCitations.length
-      ) {
-        answerWithCitations =
-          answerWithCitations.slice(0, segment.endIndex) +
-          citationText +
-          answerWithCitations.slice(segment.endIndex)
-      }
+      if (citationText && Number.isFinite(endIndex)) {
+        const insertAt = Math.min(
+          Math.max(0, endIndex),
+          answerWithCitations.length,
+        )
+        answerWithCitations =
+          answerWithCitations.slice(0, insertAt) +
+          citationText +
+          answerWithCitations.slice(insertAt)
+      }
     }

4509-4518: Disable “reasoning” for Vertex web-search; rely on per-chunk flags

Vertex web-search streams don’t emit Start/EndThinkingToken; leaving reasoning=true routes all tokens to “thinking”.

-              searchOrAnswerIterator = webSearchQuestion(message, ctx, {
+              searchOrAnswerIterator = webSearchQuestion(message, ctx, {
                 modelId: Models.Gemini_2_5_Flash,
                 stream: true,
                 json: false,
                 agentPrompt: agentPromptValue,
-                reasoning:
-                  userRequestsReasoning &&
-                  ragPipelineConfig[RagPipelineStages.AnswerOrSearch].reasoning,
+                // Let provider chunk.reasoning decide; avoid token-gate markers
+                reasoning: false,
                 messages: llmFormattedMessages,
                 webSearch: true,
               })

4594-4630: Route streamed tokens by chunk.reasoning instead of Start/EndThinkingToken

This branch should honor chunk.reasoning; current gating can swallow output with Vertex.

-                if (chunk.text) {
-                  if (reasoning) {
-                    if (thinking && !chunk.text.includes(EndThinkingToken)) {
-                      thinking += chunk.text
-                      stream.writeSSE({
-                        event: ChatSSEvents.Reasoning,
-                        data: chunk.text,
-                      })
-                    } else {
-                      // first time
-                      if (!chunk.text.includes(StartThinkingToken)) {
-                        let token = chunk.text
-                        if (chunk.text.includes(EndThinkingToken)) {
-                          token = chunk.text.split(EndThinkingToken)[0]
-                          thinking += token
-                        } else {
-                          thinking += token
-                        }
-                        stream.writeSSE({
-                          event: ChatSSEvents.Reasoning,
-                          data: token,
-                        })
-                      }
-                    }
-                  }
-                  if (reasoning && chunk.text.includes(EndThinkingToken)) {
-                    reasoning = false
-                    chunk.text = chunk.text.split(EndThinkingToken)[1].trim()
-                  }
-                  if (!reasoning) {
-                    answer += chunk.text
-                    stream.writeSSE({
-                      event: ChatSSEvents.ResponseUpdate,
-                      data: chunk.text,
-                    })
-                  }
-                }
+                if (chunk.text) {
+                  if (chunk.reasoning) {
+                    thinking += chunk.text
+                    stream.writeSSE({
+                      event: ChatSSEvents.Reasoning,
+                      data: chunk.text,
+                    })
+                  } else {
+                    answer += chunk.text
+                    stream.writeSSE({
+                      event: ChatSSEvents.ResponseUpdate,
+                      data: chunk.text,
+                    })
+                  }
+                }
🧹 Nitpick comments (1)
server/api/chat/chat.ts (1)

223-224: Unify GroundingSupport typing to avoid drift

Prefer importing GroundingSupport from a single internal type source to prevent version/shape drift with the provider SDK.

-import type { GroundingSupport } from "@google/genai"
+import type { GroundingSupport } from "@/ai/types"
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between d2e44d4 and a0903ca.

📒 Files selected for processing (1)
  • server/api/chat/chat.ts (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/api/chat/chat.ts (3)
server/ai/types.ts (2)
  • WebSearchSource (95-99)
  • GroundingSupport (101-108)
server/shared/types.ts (3)
  • Citation (63-63)
  • Apps (31-31)
  • WebSearchEntity (36-36)
server/ai/provider/index.ts (3)
  • webSearchQuestion (1793-1834)
  • generateSearchQueryOrAnswerFromConversation (1457-1495)
  • jsonParseLLMOutput (501-628)
🔇 Additional comments (1)
server/api/chat/chat.ts (1)

3866-3940: No duplicate implementations found—no shared util extraction needed


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
server/package.json (3)

32-32: Optional: pin or vendor‑lock the SDK version to reduce surprise upgrades.

Caret ranges on cloud SDKs can introduce breaking behavior via transitive updates. If stability > velocity, pin the exact version.

Apply within this line:

-    "@google-cloud/vertexai": "^1.10.0",
+    "@google-cloud/vertexai": "1.10.0",

32-32: Operational follow‑ups: IAM, region, and timeouts.

  • Ensure least-privileged SA and region scoping (e.g., GOOGLE_PROJECT_ID, GOOGLE_LOCATION like us-central1).
  • Validate request timeouts/retries on Vertex calls and surface them via config.
  • Add required vars to .env.example and runtime validation.

If not already present in the codebase, I can open a follow-up to add config schema checks for these envs.
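A minimal runtime guard for those envs could look like this (the GOOGLE_PROJECT_ID/GOOGLE_LOCATION names and us-central1 default are from the bullets above; the function itself is a sketch, not the repo's config layer):

```typescript
interface VertexEnv {
  projectId: string
  location: string
}

// Fail fast at startup if the Vertex project is missing; default the region.
function validateVertexEnv(
  env: Record<string, string | undefined>,
): VertexEnv {
  const projectId = env.GOOGLE_PROJECT_ID
  if (!projectId) {
    throw new Error("GOOGLE_PROJECT_ID is required for Vertex AI web search")
  }
  return { projectId, location: env.GOOGLE_LOCATION ?? "us-central1" }
}
```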


32-32: Consolidate Google LLM SDK usage or clearly gate providers. We currently import both @google-cloud/vertexai (in server/ai/provider/vertex_ai.ts) and @google/genai (in server/ai/provider/index.ts and server/api/chat/chat.ts), plus the Anthropic Vertex plugin. If you only need Vertex-backed Gemini, remove the @google/genai dependency and its GeminiAIProvider; otherwise, ensure each provider is cleanly gated and documented.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 5e233f4 and 2afd896.

📒 Files selected for processing (1)
  • server/package.json (1 hunks)
🔇 Additional comments (1)
server/package.json (1)

32-32: Add missing engines field & verify Bun compatibility for @google-cloud/vertexai

  • The server/package.json has no "engines" block; add:
    "engines": {
      "node": ">=18",
      "bun": ">=1.0.30"
    }
  • Manually confirm that the ESM-only @google-cloud/vertexai package (and its dependencies like google-auth-library/gaxios) runs correctly under Bun with no Node-specific gaps.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/search/vespa.ts (2)

867-869: Escape IDs in YQL to prevent injection and parsing issues

Wrap interpolated IDs with escapeYqlValue(...) (already defined in this file). This is both safer and more robust if an ID ever contains a quote.

Apply this diff:

-      const collectionCondition = `(${collectionIds.map((id: string) => `clId contains '${id.trim()}'`).join(" or ")})`
+      const collectionCondition = `(${collectionIds.map((id: string) => `clId contains '${escapeYqlValue(id.trim())}'`).join(" or ")})`
-          const folderCondition = `(${clVespaIds.map((id: string) => `docId contains '${id.trim()}'`).join(" or ")})`
+          const folderCondition = `(${clVespaIds.map((id: string) => `docId contains '${escapeYqlValue(id.trim())}'`).join(" or ")})`
-        const fileCondition = `(${clVespaIds.map((id: string) => `docId contains '${id.trim()}'`).join(" or ")})`
+        const fileCondition = `(${clVespaIds.map((id: string) => `docId contains '${escapeYqlValue(id.trim())}'`).join(" or ")})`

Also applies to: 881-883, 895-897


904-911: Enforce ACL on KnowledgeBase Vespa query
In buildCollectionFileYQL’s returned YQL (around lines 904–911), append a permissions/ownership guard:

       and ${finalCondition}
+      and (permissions contains @email or owner contains @email)

to restrict KB results to items the current user can access.

♻️ Duplicate comments (4)
server/search/vespa.ts (1)

900-901: Fix: fallback must not broaden scope; default to "false"

If no collection/folder/file IDs resolve, finalCondition should be "false" to avoid unintentionally matching all KB items.

Apply this diff:

-    const finalCondition =
-      conditions.length > 0 ? `(${conditions.join(" or ")})` : "true"
+    const finalCondition =
+      conditions.length > 0 ? `(${conditions.join(" or ")})` : "false"
server/ai/provider/index.ts (1)

1797-1843: webSearchQuestion: disable reasoning for Vertex GOOGLE and avoid per-call provider creation

  • Vertex GOOGLE streaming doesn’t emit Start/End thinking markers; leaving reasoning on misroutes output downstream. Force params.reasoning = false here.
  • A new VertexAiProvider is created per call; initialize once and reuse (singleton), similar to other providers. This also avoids repeatedly constructing clients.
+// at module scope with other provider singletons
+let vertexGoogleProviderSingleton: VertexAiProvider | null = null
 // inside initializeProviders() after existing Vertex provider init
 if (VertexProjectId && VertexRegion) {
   // ... existing provider init ...
+  // Always prepare a GOOGLE-backed provider for web search
+  vertexGoogleProviderSingleton = new VertexAiProvider({
+    projectId: VertexProjectId,
+    region: VertexRegion,
+    provider: VertexProvider.GOOGLE,
+  })
+  Logger.info("Initialized VertexAI GOOGLE web-search provider singleton")
 }
 export const webSearchQuestion = (
   query: string,
   userCtx: string,
   params: ModelParams,
 ): AsyncIterableIterator<ConverseResponse> => {
   try {
     if (!params.modelId) {
       params.modelId = defaultBestModel
     }

-    params.webSearch = true
+    params.webSearch = true
+    // Vertex GOOGLE doesn't emit thinking markers; avoid downstream gating.
+    params.reasoning = false
     if (!isAgentPromptEmpty(params.agentPrompt)) {
       params.systemPrompt = agentSearchQueryPrompt(
         userCtx,
         parseAgentPrompt(params.agentPrompt),
       )
     } else if (!params.systemPrompt) {
       params.systemPrompt =
         "You are a helpful AI assistant with access to web search. Use web search when you need current information or real-time data to answer the user's question accurately."
     }
@@
-    if (!config.VertexProjectId || !config.VertexRegion) {
+    if (!config.VertexProjectId || !config.VertexRegion) {
       Logger.warn(
         "VertexProjectId/VertexRegion not configured, moving with default provider.",
       )
       return getProviderByModel(params.modelId).converseStream(messages, params)
     }
-    const vertexGoogleProvider = new VertexAiProvider({
-      projectId: config.VertexProjectId!,
-      region: config.VertexRegion!,
-      provider: VertexProvider.GOOGLE,
-    })
-
-    return vertexGoogleProvider.converseStream(messages, params)
+    if (!vertexGoogleProviderSingleton) {
+      throw new Error("Vertex GOOGLE provider not initialized")
+    }
+    return vertexGoogleProviderSingleton.converseStream(messages, params)
server/api/chat/chat.ts (2)

3908-3982: Fix citation insertion: sort by endIndex descending to avoid shifting; map using citationIndex

Inserting “[n]” left-to-right changes subsequent indices. Insert right-to-left or track offsets. Also record newCitationMap with citationIndex (not sourceIndex) to avoid mismaps when reusing indices.

 function processWebSearchCitations(
   answer: string,
   allSources: WebSearchSource[],
   finalGroundingSupports: GroundingSupport[],
   citations: Citation[],
   citationMap: Record<number, number>,
   sourceIndex: number,
 ): {
@@
-  if (finalGroundingSupports.length > 0 && allSources.length > 0) {
+  if (finalGroundingSupports.length > 0 && allSources.length > 0) {
     let answerWithCitations = answer
     let newCitations: Citation[] = []
     let newCitationMap: Record<number, number> = {}
     let urlToIndexMap: Map<string, number> = new Map()
 
-    for (const support of finalGroundingSupports) {
+    // Insert from right to left to avoid index shifting
+    const supportsSorted = [...finalGroundingSupports].sort(
+      (a, b) => (b.segment?.endIndex ?? 0) - (a.segment?.endIndex ?? 0),
+    )
+    for (const support of supportsSorted) {
       const segment = support.segment
       const groundingChunkIndices = support.groundingChunkIndices || []
 
       let citationText = ""
       for (const chunkIndex of groundingChunkIndices) {
         if (allSources[chunkIndex]) {
           const source = allSources[chunkIndex]
 
           let citationIndex: number
           if (urlToIndexMap.has(source.uri)) {
             // Reuse existing citation index
             citationIndex = urlToIndexMap.get(source.uri)!
           } else {
             citationIndex = sourceIndex
             const webSearchCitation: Citation = {
               docId: `websearch_${sourceIndex}`,
               title: source.title,
               url: source.uri,
               app: Apps.WebSearch,
               entity: WebSearchEntity.WebSearch,
             }
 
             newCitations.push(webSearchCitation)
-            newCitationMap[sourceIndex] =
-              citations.length + newCitations.length - 1
+            newCitationMap[citationIndex] =
+              citations.length + newCitations.length - 1
             urlToIndexMap.set(source.uri, citationIndex)
             sourceIndex++
           }
 
           citationText += ` [${citationIndex}]`
         }
       }
 
       if (
         citationText &&
         segment?.endIndex !== undefined &&
         segment.endIndex <= answerWithCitations.length
       ) {
         answerWithCitations =
           answerWithCitations.slice(0, segment.endIndex) +
           citationText +
           answerWithCitations.slice(segment.endIndex)
       }
     }
 
     return {
       updatedAnswer: answerWithCitations,
       newCitations,
       newCitationMap,
       updatedSourceIndex: sourceIndex,
     }
   }
 
   return null
 }

Also consider de-duping this helper (appears twice per PR summary) into a shared util to avoid divergence.


4587-4630: Web-search streaming: route by chunk.reasoning and pass reasoning=false to Vertex

  • Current branch gates on Start/EndThinking tokens; Vertex GOOGLE doesn’t emit them, so output may be swallowed/misrouted.
  • Use chunk.reasoning when present; otherwise treat all text as response. Also call webSearchQuestion with reasoning: false.
-              searchOrAnswerIterator = webSearchQuestion(message, ctx, {
+              searchOrAnswerIterator = webSearchQuestion(message, ctx, {
                 modelId: Models.Gemini_2_5_Flash,
                 stream: true,
-                json: false,
+                json: false,
                 agentPrompt: agentPromptValue,
-                reasoning:
-                  userRequestsReasoning &&
-                  ragPipelineConfig[RagPipelineStages.AnswerOrSearch].reasoning,
+                // Vertex GOOGLE: disable local reasoning gating
+                reasoning: false,
                 messages: llmFormattedMessages,
                 webSearch: true,
               })
-                if (chunk.text) {
-                  if (reasoning) {
-                    if (thinking && !chunk.text.includes(EndThinkingToken)) {
-                      thinking += chunk.text
-                      stream.writeSSE({
-                        event: ChatSSEvents.Reasoning,
-                        data: chunk.text,
-                      })
-                    } else {
-                      // first time
-                      if (!chunk.text.includes(StartThinkingToken)) {
-                        let token = chunk.text
-                        if (chunk.text.includes(EndThinkingToken)) {
-                          token = chunk.text.split(EndThinkingToken)[0]
-                          thinking += token
-                        } else {
-                          thinking += token
-                        }
-                        stream.writeSSE({
-                          event: ChatSSEvents.Reasoning,
-                          data: token,
-                        })
-                      }
-                    }
-                  }
-                  if (reasoning && chunk.text.includes(EndThinkingToken)) {
-                    reasoning = false
-                    chunk.text = chunk.text.split(EndThinkingToken)[1].trim()
-                  }
-                  if (!reasoning) {
-                    answer += chunk.text
-                    stream.writeSSE({
-                      event: ChatSSEvents.ResponseUpdate,
-                      data: chunk.text,
-                    })
-                  }
-                }
+                if (chunk.text) {
+                  if (chunk.reasoning) {
+                    thinking += chunk.text
+                    stream.writeSSE({
+                      event: ChatSSEvents.Reasoning,
+                      data: chunk.text,
+                    })
+                  } else {
+                    answer += chunk.text
+                    stream.writeSSE({
+                      event: ChatSSEvents.ResponseUpdate,
+                      data: chunk.text,
+                    })
+                  }
+                }

Also applies to: 4669-4789

🧹 Nitpick comments (4)
server/search/vespa.ts (1)

852-862: Deduplicate and sanitize input IDs before building conditions

Avoid redundant OR terms and trim early; this keeps YQL smaller and faster.

Apply this diff:

   for (const selection of collectionSelections) {
     if (selection.collectionIds) {
       collectionIds.push(...selection.collectionIds)
     }
     if (selection.collectionFolderIds) {
       collectionFolderIds.push(...selection.collectionFolderIds)
     }
     if (selection.collectionFileIds) {
       collectionFileIds.push(...selection.collectionFileIds)
     }
   }
+  // Deduplicate + trim
+  const uniqueCollectionIds = [...new Set(collectionIds.map((v) => v?.trim()).filter(Boolean))]
+  const uniqueCollectionFolderIds = [...new Set(collectionFolderIds.map((v) => v?.trim()).filter(Boolean))]
+  const uniqueCollectionFileIds = [...new Set(collectionFileIds.map((v) => v?.trim()).filter(Boolean))]

Then update subsequent checks to use the deduped arrays:

-  if (collectionIds.length > 0) {
-    const collectionCondition = `(${collectionIds.map((id: string) => `clId contains '${escapeYqlValue(id.trim())}'`).join(" or ")})`
+  if (uniqueCollectionIds.length > 0) {
+    const collectionCondition = `(${uniqueCollectionIds.map((id: string) => `clId contains '${escapeYqlValue(id)}'`).join(" or ")})`
-  if (collectionFolderIds.length > 0) {
-    const clFileIds = await getAllFolderItems(collectionFolderIds, db)
+  if (uniqueCollectionFolderIds.length > 0) {
+    const clFileIds = await getAllFolderItems(uniqueCollectionFolderIds, db)
-  if (collectionFileIds.length > 0) {
-    const ids = await getCollectionFilesVespaIds(collectionFileIds, db)
+  if (uniqueCollectionFileIds.length > 0) {
+    const ids = await getCollectionFilesVespaIds(uniqueCollectionFileIds, db)
server/ai/provider/index.ts (1)

282-290: VERTEX_PROVIDER mapping: log enum name, not numeric; handle invalid values explicitly

  • Current log will print a number (enum value). Prefer the enum key for readability.
  • Optional: log a warning when VERTEX_PROVIDER is set but invalid (you already fall back).
-    const provider =
+    const provider =
       vertexProviderType && VertexProvider[vertexProviderType]
         ? VertexProvider[vertexProviderType]
         : VertexProvider.ANTHROPIC

-    Logger.info(`Initialized VertexAI provider with ${provider} backend`)
+    const providerName =
+      vertexProviderType && VertexProvider[vertexProviderType]
+        ? vertexProviderType
+        : "ANTHROPIC"
+    Logger.info(`Initialized VertexAI provider with ${providerName} backend`)

Also applies to: 297-297
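The nit above hinges on how TypeScript numeric enums stringify: interpolating the value prints the number, while indexing back into the enum recovers the key via reverse mapping. A minimal sketch (the `VertexProvider` members here are illustrative, not the real enum definition):

```typescript
// Numeric enums stringify to their numeric value; index back into the
// enum object (reverse mapping) to recover the readable key name.
enum VertexProvider {
  ANTHROPIC = 0,
  GOOGLE = 1,
}

const provider = VertexProvider.GOOGLE
const numericLog = `Initialized VertexAI provider with ${provider} backend`
const readableLog = `Initialized VertexAI provider with ${VertexProvider[provider]} backend`

console.log(numericLog)  // "Initialized VertexAI provider with 1 backend"
console.log(readableLog) // "Initialized VertexAI provider with GOOGLE backend"
```

Note that reverse mapping only exists for numeric enums; string enums would need an explicit lookup.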

server/api/chat/chat.ts (2)

4547-4568: Parsing prior classification: optional robustness

Minor: narrow the try/catch to just JSON.parse and guard with a cheap startsWith("{") before parsing to reduce noise in logs. Current code is otherwise fine.
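A minimal sketch of that narrowing (function name and return shape are illustrative, not the actual code in chat.ts):

```typescript
// Guard with a cheap startsWith("{") so obviously-non-JSON payloads skip
// the try/catch entirely; the catch then only wraps JSON.parse itself.
function parsePriorClassification(raw: string): Record<string, unknown> | null {
  const trimmed = raw.trim()
  if (!trimmed.startsWith("{")) return null // cheap pre-check, no log noise
  try {
    return JSON.parse(trimmed)
  } catch {
    return null // only actual parse failures land here
  }
}
```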


5531-5531: Variable shadowing nit

const prevUserMessage is declared inside this if block and again later in the outer scope. Rename the inner one (e.g., prevUserMsgForAttachments) to avoid confusion.

-      const prevUserMessage = conversation[conversation.length - 1]
+      const prevUserMsgForAttachments = conversation[conversation.length - 1]
-      if (prevUserMessage.messageRole === "user") {
+      if (prevUserMsgForAttachments.messageRole === "user") {
         attachmentMetadata = await getAttachmentsByMessageId(
           db,
-          prevUserMessage.externalId,
+          prevUserMsgForAttachments.externalId,
           email,
         )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2afd896 and f8d1ee9.

📒 Files selected for processing (5)
  • server/ai/provider/index.ts (3 hunks)
  • server/ai/types.ts (1 hunks)
  • server/api/chat/chat.ts (14 hunks)
  • server/package.json (1 hunks)
  • server/search/vespa.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • server/ai/types.ts
  • server/package.json
🧰 Additional context used
🧬 Code graph analysis (2)
server/ai/provider/index.ts (3)
server/ai/provider/vertex_ai.ts (1)
  • VertexAiProvider (67-485)
server/ai/types.ts (2)
  • ModelParams (69-84)
  • ConverseResponse (86-93)
server/ai/agentPrompts.ts (1)
  • agentSearchQueryPrompt (1026-1346)
server/api/chat/chat.ts (4)
server/ai/types.ts (3)
  • WebSearchSource (95-99)
  • GroundingSupport (101-108)
  • QueryRouterLLMResponse (212-212)
server/api/chat/utils.ts (2)
  • getRecentChainBreakClassifications (917-922)
  • formatChainBreaksForPrompt (958-973)
server/db/schema/messages.ts (2)
  • messages (31-75)
  • selectMessageSchema (84-84)
server/ai/provider/index.ts (3)
  • webSearchQuestion (1797-1843)
  • generateSearchQueryOrAnswerFromConversation (1459-1499)
  • jsonParseLLMOutput (503-630)
🔇 Additional comments (6)
server/ai/provider/index.ts (1)

98-98: Expose VertexProvider: LGTM

Re-exporting VertexProvider is fine and unblocks callers.

server/api/chat/chat.ts (5)

22-22: Type-safe web search wiring: LGTM

Imports for webSearchQuestion, WebSearchSource, WebSearchEntity, and GroundingSupport look correct.

Also applies to: 40-41, 143-144, 228-229


2909-2913: Pass-through to processIterator: LGTM

The added args keep semantics consistent with other callsites.


5936-5957: Retry path chain-break/prev-classification: LGTM

Good reuse of prior classification and localized chain-break window.

Also applies to: 5963-5971


6147-6149: Classification filters defaults: LGTM

Using parsed.filters.offset || 0 and intent || {} is reasonable.


4013-4014: Ignore the chatSchema suggestion: enableWebSearch is already defined in messageSchema (and inferred by MessageReqType), so it will be parsed correctly.

Likely an incorrect or invalid review comment.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/api/chat/chat.ts (1)

4081-4082: Block-scoped redeclaration: ‘chat’ is declared twice in MessageApi

You declare “let chat: SelectChat” at Line 3998 and again at Line 4082 in the same try block; TS will error.

Apply:

-    let chat: SelectChat
+    // reuse the top-level 'chat' declared earlier in this scope
♻️ Duplicate comments (2)
server/api/chat/chat.ts (2)

3909-3988: Fix citation insertion index shifting (insert right-to-left and de-dup per segment)

Inserting “[n]” left-to-right mutates subsequent indices; repeated URLs within a segment can also duplicate “[n]”. Sort by segment.endIndex descending and build a unique set per segment.

Apply:

-    for (const support of finalGroundingSupports) {
+    const supportsSorted = [...finalGroundingSupports].sort(
+      (a, b) => (b.segment?.endIndex ?? 0) - (a.segment?.endIndex ?? 0),
+    )
+    for (const support of supportsSorted) {
       const segment = support.segment
       const groundingChunkIndices = support.groundingChunkIndices || []

-      let citationText = ""
+      const citationIndices: number[] = []
       for (const chunkIndex of groundingChunkIndices) {
         if (allSources[chunkIndex]) {
           const source = allSources[chunkIndex]

           let citationIndex: number
           if (urlToIndexMap.has(source.uri)) {
             // Reuse existing citation index
             citationIndex = urlToIndexMap.get(source.uri)!
           } else {
             citationIndex = sourceIndex
             const webSearchCitation: Citation = {
               docId: `websearch_${sourceIndex}`,
               title: source.title,
               url: source.uri,
               app: Apps.WebSearch,
               entity: WebSearchEntity.WebSearch,
             }

             newCitations.push(webSearchCitation)
-            newCitationMap[sourceIndex] =
-              citations.length + newCitations.length - 1
+            newCitationMap[citationIndex] =
+              citations.length + newCitations.length - 1
             urlToIndexMap.set(source.uri, sourceIndex)
             sourceIndex++
           }

-          citationText += ` [${citationIndex}]`
+          citationIndices.push(citationIndex)
         }
       }

+      const uniqueCitationIndices = [...new Set(citationIndices)]
+      const citationText =
+        uniqueCitationIndices.length > 0
+          ? uniqueCitationIndices.map((i) => ` [${i}]`).join("")
+          : ""
       if (
         citationText &&
         segment?.endIndex !== undefined &&
         segment.endIndex <= answerWithCitations.length
       ) {
         // Find optimal insertion point that respects word boundaries
         const optimalIndex = findOptimalCitationInsertionPoint(
           answerWithCitations,
           segment.endIndex,
         )
         answerWithCitations =
           answerWithCitations.slice(0, optimalIndex) +
           citationText +
           answerWithCitations.slice(optimalIndex)
       }
     }
#!/bin/bash
# Ensure only a single implementation exists or update all copies.
rg -nP 'function\s+processWebSearchCitations\s*\(' -C2

4699-4734: Web-search streaming: all tokens routed to Reasoning (Vertex doesn’t emit Start/EndThinkingToken)

This branch gates on Start/EndThinkingToken; Vertex chunks typically set chunk.reasoning instead. Today, reasoning may never flip to false → no ResponseUpdate.

Apply:

-                if (chunk.text) {
-                  if (reasoning) {
-                    if (thinking && !chunk.text.includes(EndThinkingToken)) {
-                      thinking += chunk.text
-                      stream.writeSSE({
-                        event: ChatSSEvents.Reasoning,
-                        data: chunk.text,
-                      })
-                    } else {
-                      // first time
-                      if (!chunk.text.includes(StartThinkingToken)) {
-                        let token = chunk.text
-                        if (chunk.text.includes(EndThinkingToken)) {
-                          token = chunk.text.split(EndThinkingToken)[0]
-                          thinking += token
-                        } else {
-                          thinking += token
-                        }
-                        stream.writeSSE({
-                          event: ChatSSEvents.Reasoning,
-                          data: token,
-                        })
-                      }
-                    }
-                  }
-                  if (reasoning && chunk.text.includes(EndThinkingToken)) {
-                    reasoning = false
-                    chunk.text = chunk.text.split(EndThinkingToken)[1].trim()
-                  }
-                  if (!reasoning) {
-                    answer += chunk.text
-                    stream.writeSSE({
-                      event: ChatSSEvents.ResponseUpdate,
-                      data: chunk.text,
-                    })
-                  }
-                }
+                if (chunk.text) {
+                  if (chunk.reasoning) {
+                    thinking += chunk.text
+                    stream.writeSSE({
+                      event: ChatSSEvents.Reasoning,
+                      data: chunk.text,
+                    })
+                  } else {
+                    answer += chunk.text
+                    stream.writeSSE({
+                      event: ChatSSEvents.ResponseUpdate,
+                      data: chunk.text,
+                    })
+                  }
+                }
🧹 Nitpick comments (1)
server/api/chat/utils.ts (1)

1001-1045: Comment contradicts implementation; clarify or adjust tie-break

Code favors the left boundary on ties, but the comment says “lean towards right.” Either update the comment or flip the conditional.

Apply one of:

-  // Prefer the closer boundary, but lean towards right boundary (end of word) for better readability
-  if (leftDistance <= rightDistance || rightBoundary >= text.length) {
+  // Prefer the closer boundary; lean towards left boundary on ties
+  if (leftDistance <= rightDistance || rightBoundary >= text.length) {
     return leftBoundary
   } else {
     return rightBoundary
   }

or, to actually lean right:

-  if (leftDistance <= rightDistance || rightBoundary >= text.length) {
-    return leftBoundary
-  } else {
-    return rightBoundary
-  }
+  if (rightDistance <= leftDistance && rightBoundary < text.length) {
+    return rightBoundary
+  } else {
+    return leftBoundary
+  }
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between f8d1ee9 and 04fde68.

📒 Files selected for processing (2)
  • server/api/chat/chat.ts (15 hunks)
  • server/api/chat/utils.ts (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
server/api/chat/chat.ts (4)
server/ai/types.ts (2)
  • WebSearchSource (95-99)
  • GroundingSupport (101-108)
server/api/chat/types.ts (1)
  • Citation (76-76)
server/api/chat/utils.ts (1)
  • findOptimalCitationInsertionPoint (1001-1045)
server/ai/provider/index.ts (1)
  • webSearchQuestion (1797-1843)
server/api/chat/utils.ts (3)
server/ai/types.ts (1)
  • QueryRouterLLMResponse (212-212)
server/db/schema/messages.ts (2)
  • messages (31-75)
  • SelectMessage (85-85)
server/logger/index.ts (1)
  • getLoggerWithChild (189-197)
🔇 Additional comments (8)
server/api/chat/utils.ts (5)

888-892: ChainBreakClassification: style-only change looks good

No behavioral impact; types remain the same.


904-908: Stronger type guard—good

Rejecting non-object/array payloads prevents misclassification noise.


921-930: Recent chain-break limiter—good

Taking the two most recent entries is sensible for prompt budget.


933-979: Chain-break extraction logic—OK with implicit “user/assistant/user” stride

Assumes alternating roles (index - 2). If mixed-role sequences are possible (e.g., tool messages), consider walking back to the previous “user” instead of index math.
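Walking back explicitly could look like the following sketch (the `Msg` shape is illustrative; the real messages carry more fields):

```typescript
interface Msg {
  messageRole: string
  message: string
}

// Instead of assuming strict user/assistant alternation (index - 2),
// scan backwards for the nearest preceding "user" message, skipping
// tool/system messages that may break the stride.
function previousUserMessage(conversation: Msg[], fromIndex: number): Msg | null {
  for (let i = fromIndex - 1; i >= 0; i--) {
    if (conversation[i].messageRole === "user") return conversation[i]
  }
  return null
}
```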


981-999: Null vs object return—good

Returning null when none exist keeps prompts lean.

server/api/chat/chat.ts (3)

2910-2914: Good: unified streaming path for metadata answers

Passing items and reasoning flag through processIterator keeps behavior consistent.


4593-4636: Feature flag plumbing—OK

webSearchEnabled path is correctly gated and preserves the non-web path.


6153-6155: Retry classification: safe defaults

Using parsed.filters.offset || 0 and intent || {} avoids undefineds.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (3)
server/api/chat/chat.ts (3)

229-231: Nice: typed GroundingSupport/WebSearchSource improve safety

Imports and function signature now use precise types, removing any[]. LGTM.

Also applies to: 3911-3915


3909-3988: Citations can be inserted at wrong places due to shifting indices; insert right-to-left

Inserting “[n]” mutates answer length and invalidates subsequent segment.endIndex. Sort supports by endIndex desc (or track cumulative offset). Also map using citationIndex for clarity.

 function processWebSearchCitations(
   answer: string,
   allSources: WebSearchSource[],
   finalGroundingSupports: GroundingSupport[],
   citations: Citation[],
   citationMap: Record<number, number>,
   sourceIndex: number,
 ): {
@@
   if (finalGroundingSupports.length > 0 && allSources.length > 0) {
     let answerWithCitations = answer
     let newCitations: Citation[] = []
     let newCitationMap: Record<number, number> = {}
     let urlToIndexMap: Map<string, number> = new Map()
 
-    for (const support of finalGroundingSupports) {
+    // Insert from right to left to avoid index shifting
+    const supportsSorted = [...finalGroundingSupports].sort(
+      (a, b) => (b.segment?.endIndex ?? 0) - (a.segment?.endIndex ?? 0),
+    )
+    for (const support of supportsSorted) {
       const segment = support.segment
       const groundingChunkIndices = support.groundingChunkIndices || []
 
       let citationText = ""
       for (const chunkIndex of groundingChunkIndices) {
         if (allSources[chunkIndex]) {
           const source = allSources[chunkIndex]
 
           let citationIndex: number
           if (urlToIndexMap.has(source.uri)) {
             // Reuse existing citation index
             citationIndex = urlToIndexMap.get(source.uri)!
           } else {
             citationIndex = sourceIndex
             const webSearchCitation: Citation = {
               docId: `websearch_${sourceIndex}`,
               title: source.title,
               url: source.uri,
               app: Apps.WebSearch,
               entity: WebSearchEntity.WebSearch,
             }
 
             newCitations.push(webSearchCitation)
-            newCitationMap[sourceIndex] =
-              citations.length + newCitations.length - 1
-            urlToIndexMap.set(source.uri, sourceIndex)
+            newCitationMap[citationIndex] =
+              citations.length + newCitations.length - 1
+            urlToIndexMap.set(source.uri, citationIndex)
             sourceIndex++
           }
 
           citationText += ` [${citationIndex}]`
         }
       }
 
       if (
         citationText &&
         segment?.endIndex !== undefined &&
         segment.endIndex <= answerWithCitations.length
       ) {
         // Find optimal insertion point that respects word boundaries
         const optimalIndex = findOptimalCitationInsertionPoint(
           answerWithCitations,
           segment.endIndex,
         )
         answerWithCitations =
           answerWithCitations.slice(0, optimalIndex) +
           citationText +
           answerWithCitations.slice(optimalIndex)
       }
     }

4698-4734: Vertex web-search streaming: all tokens go to Reasoning; no ResponseUpdate emitted

This branch relies on Start/EndThinkingToken, which Vertex web search doesn’t emit. Result: ResponseUpdate is starved.

-                if (chunk.text) {
-                  if (reasoning) {
-                    if (thinking && !chunk.text.includes(EndThinkingToken)) {
-                      thinking += chunk.text
-                      stream.writeSSE({
-                        event: ChatSSEvents.Reasoning,
-                        data: chunk.text,
-                      })
-                    } else {
-                      // first time
-                      if (!chunk.text.includes(StartThinkingToken)) {
-                        let token = chunk.text
-                        if (chunk.text.includes(EndThinkingToken)) {
-                          token = chunk.text.split(EndThinkingToken)[0]
-                          thinking += token
-                        } else {
-                          thinking += token
-                        }
-                        stream.writeSSE({
-                          event: ChatSSEvents.Reasoning,
-                          data: token,
-                        })
-                      }
-                    }
-                  }
-                  if (reasoning && chunk.text.includes(EndThinkingToken)) {
-                    reasoning = false
-                    chunk.text = chunk.text.split(EndThinkingToken)[1].trim()
-                  }
-                  if (!reasoning) {
-                    answer += chunk.text
-                    stream.writeSSE({
-                      event: ChatSSEvents.ResponseUpdate,
-                      data: chunk.text,
-                    })
-                  }
-                }
+                if (chunk.text) {
+                  if (chunk.reasoning) {
+                    thinking += chunk.text
+                    await stream.writeSSE({
+                      event: ChatSSEvents.Reasoning,
+                      data: chunk.text,
+                    })
+                  } else {
+                    answer += chunk.text
+                    await stream.writeSSE({
+                      event: ChatSSEvents.ResponseUpdate,
+                      data: chunk.text,
+                    })
+                  }
+                }
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 04fde68 and 0d41941.

📒 Files selected for processing (2)
  • server/ai/provider/index.ts (3 hunks)
  • server/api/chat/chat.ts (16 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/ai/provider/index.ts
🧰 Additional context used
🧬 Code graph analysis (1)
server/api/chat/chat.ts (4)
server/ai/types.ts (3)
  • WebSearchSource (95-99)
  • GroundingSupport (101-108)
  • QueryRouterLLMResponse (212-212)
server/api/chat/types.ts (1)
  • Citation (76-76)
server/api/chat/utils.ts (1)
  • findOptimalCitationInsertionPoint (1001-1045)
server/ai/provider/index.ts (2)
  • webSearchQuestion (1797-1841)
  • generateSearchQueryOrAnswerFromConversation (1459-1499)

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (4)
server/api/chat/chat.ts (4)

4019-4023: Plumbing: ensure enableWebSearch is in the validated request schema and frontend

The handler reads enableWebSearch from the request; verify the schema (message-level) and types include it, and that the client sends it.

#!/bin/bash
# Verify schema/type plumbing and call sites for enableWebSearch
rg -nP 'messageSchema|MessageReqType' server/api/search.ts -C3 || true
rg -n 'enableWebSearch' -- server frontend || true

Also applies to: 4593-4636


3909-3988: Fix index drift: insert citations from right-to-left

Each insertion shifts subsequent indices; sort by segment.endIndex descending before splicing. Also map using citationIndex for clarity.

-  for (const support of finalGroundingSupports) {
+  const supportsSorted = [...finalGroundingSupports].sort(
+    (a, b) => (b.segment?.endIndex ?? 0) - (a.segment?.endIndex ?? 0),
+  )
+  for (const support of supportsSorted) {
     const segment = support.segment
     const groundingChunkIndices = support.groundingChunkIndices || []

     let citationText = ""
     for (const chunkIndex of groundingChunkIndices) {
       if (allSources[chunkIndex]) {
         const source = allSources[chunkIndex]

         let citationIndex: number
         if (urlToIndexMap.has(source.uri)) {
           // Reuse existing citation index
           citationIndex = urlToIndexMap.get(source.uri)!
         } else {
           citationIndex = sourceIndex
           const webSearchCitation: Citation = {
             docId: `websearch_${sourceIndex}`,
             title: source.title,
             url: source.uri,
             app: Apps.WebSearch,
             entity: WebSearchEntity.WebSearch,
           }

           newCitations.push(webSearchCitation)
-          newCitationMap[sourceIndex] =
-            citations.length + newCitations.length - 1
+          newCitationMap[citationIndex] =
+            citations.length + newCitations.length - 1
           urlToIndexMap.set(source.uri, sourceIndex)
           sourceIndex++
         }

         citationText += ` [${citationIndex}]`
       }
     }

4698-4734: Streaming reasoning gating for web search should use chunk.reasoning

Vertex (GOOGLE) doesn’t emit Start/EndThinkingToken. Route tokens via chunk.reasoning so answers don’t get stuck in “Reasoning.”

-                if (chunk.text) {
-                  if (reasoning) {
-                    if (thinking && !chunk.text.includes(EndThinkingToken)) {
-                      thinking += chunk.text
-                      stream.writeSSE({
-                        event: ChatSSEvents.Reasoning,
-                        data: chunk.text,
-                      })
-                    } else {
-                      // first time
-                      if (!chunk.text.includes(StartThinkingToken)) {
-                        let token = chunk.text
-                        if (chunk.text.includes(EndThinkingToken)) {
-                          token = chunk.text.split(EndThinkingToken)[0]
-                          thinking += token
-                        } else {
-                          thinking += token
-                        }
-                        stream.writeSSE({
-                          event: ChatSSEvents.Reasoning,
-                          data: token,
-                        })
-                      }
-                    }
-                  }
-                  if (reasoning && chunk.text.includes(EndThinkingToken)) {
-                    reasoning = false
-                    chunk.text = chunk.text.split(EndThinkingToken)[1].trim()
-                  }
-                  if (!reasoning) {
-                    answer += chunk.text
-                    stream.writeSSE({
-                      event: ChatSSEvents.ResponseUpdate,
-                      data: chunk.text,
-                    })
-                  }
-                }
+                if (chunk.text) {
+                  if (chunk.reasoning) {
+                    thinking += chunk.text
+                    stream.writeSSE({
+                      event: ChatSSEvents.Reasoning,
+                      data: chunk.text,
+                    })
+                  } else {
+                    answer += chunk.text
+                    stream.writeSSE({
+                      event: ChatSSEvents.ResponseUpdate,
+                      data: chunk.text,
+                    })
+                  }
+                }

2910-2914: Bug: Missing email in processIterator causes image-citation failures

processIterator dereferences email when emitting image citations; passing undefined can break streaming.

   return yield* processIterator(
     iterator,
     items,
     0,
     config.isReasoning && userRequestsReasoning,
+    email,
   )
🧹 Nitpick comments (1)
server/api/chat/chat.ts (1)

229-229: Use internal GroundingSupport type to decouple from provider SDK

Prefer the shared interface from "@/ai/types" to avoid lock-in to @google/genai shapes and to keep server types consistent.

- import type { GroundingSupport } from "@google/genai"
+ import type { GroundingSupport } from "@/ai/types"
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 0d41941 and 629043c.

📒 Files selected for processing (2)
  • server/api/chat/chat.ts (16 hunks)
  • server/api/search.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/api/search.ts
🧰 Additional context used
🧬 Code graph analysis (1)
server/api/chat/chat.ts (4)
server/ai/types.ts (2)
  • WebSearchSource (95-99)
  • GroundingSupport (101-108)
server/api/chat/types.ts (1)
  • Citation (76-76)
server/api/chat/utils.ts (1)
  • findOptimalCitationInsertionPoint (1001-1045)
server/ai/provider/index.ts (1)
  • webSearchQuestion (1797-1841)
🔇 Additional comments (1)
server/api/chat/chat.ts (1)

205-206: Good call: word-boundary-safe citation insertion helper

Importing and using findOptimalCitationInsertionPoint improves readability and avoids mid-word insertions.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (3)
server/api/chat/chat.ts (3)

2910-2914: Pass email into processIterator to prevent image-citation streaming bugs

processIterator dereferences email when emitting image citations. Omit -> runtime errors or missing images.

Apply:

   return yield* processIterator(
     iterator,
     items,
     0,
     config.isReasoning && userRequestsReasoning,
+    email,
   )

4706-4723: Grounding index/source mismatch: align sources array with final supports

You dedupe into allSources across chunks, but the final groundingChunkIndices refer to the final chunk’s sources array. This can point to the wrong URL/title.

                 if (chunk.sources && chunk.sources.length > 0) {
                   chunk.sources.forEach((source) => {
                     if (
                       !allSources.some(
                         (existing) => existing.uri === source.uri,
                       )
                     ) {
                       allSources.push(source)
                     }
                   })
                 }
                 if (
                   chunk.groundingSupports &&
                   chunk.groundingSupports.length > 0
                 ) {
                   finalGroundingSupports = chunk.groundingSupports
+                  // IMPORTANT: keep indices aligned with the provider's final sources
+                  if (chunk.sources && chunk.sources.length > 0) {
+                    allSources = chunk.sources
+                  }
                 }

3909-3988: Citation insertion order can corrupt indices; insert right-to-left

Each insertion shifts subsequent endIndex positions. Sort supports by segment.endIndex desc (or track cumulative offset).

Apply:

-  for (const support of finalGroundingSupports) {
+  // Insert from right to left to avoid index shifting
+  const supportsSorted = [...finalGroundingSupports].sort(
+    (a, b) => (b.segment?.endIndex ?? 0) - (a.segment?.endIndex ?? 0),
+  )
+  for (const support of supportsSorted) {
     const segment = support.segment
     const groundingChunkIndices = support.groundingChunkIndices || []

     let citationText = ""
     for (const chunkIndex of groundingChunkIndices) {
       if (allSources[chunkIndex]) {
         const source = allSources[chunkIndex]

         let citationIndex: number
         if (urlToIndexMap.has(source.uri)) {
           // Reuse existing citation index
           citationIndex = urlToIndexMap.get(source.uri)!
         } else {
           citationIndex = sourceIndex
           const webSearchCitation: Citation = {
             docId: `websearch_${sourceIndex}`,
             title: source.title,
             url: source.uri,
             app: Apps.WebSearch,
             entity: WebSearchEntity.WebSearch,
           }

           newCitations.push(webSearchCitation)
-          newCitationMap[sourceIndex] =
-            citations.length + newCitations.length - 1
+          newCitationMap[citationIndex] =
+            citations.length + newCitations.length - 1
           urlToIndexMap.set(source.uri, sourceIndex)
           sourceIndex++
         }

         citationText += ` [${citationIndex}]`
       }
     }

     if (
       citationText &&
       segment?.endIndex !== undefined &&
       segment.endIndex <= answerWithCitations.length
     ) {
       // Find optimal insertion point that respects word boundaries
       const optimalIndex = findOptimalCitationInsertionPoint(
         answerWithCitations,
         segment.endIndex,
       )
       answerWithCitations =
         answerWithCitations.slice(0, optimalIndex) +
         citationText +
         answerWithCitations.slice(optimalIndex)
     }
   }
🧹 Nitpick comments (2)
server/api/chat/chat.ts (2)

4697-4705: Stream reasoning tokens if provider sets chunk.reasoning

Vertex GOOGLE may not set Start/EndThinkingToken. Prefer per-chunk flag when present; today all text goes to ResponseUpdate.

-                if (chunk.text) {
-                  answer += chunk.text
-                  stream.writeSSE({
-                    event: ChatSSEvents.ResponseUpdate,
-                    data: chunk.text,
-                  })
-                }
+                if (chunk.text) {
+                  if (chunk.reasoning) {
+                    await stream.writeSSE({
+                      event: ChatSSEvents.Reasoning,
+                      data: chunk.text,
+                    })
+                  } else {
+                    answer += chunk.text
+                    await stream.writeSSE({
+                      event: ChatSSEvents.ResponseUpdate,
+                      data: chunk.text,
+                    })
+                  }
+                }

5507-5517: Shadowed variable name reduces clarity

prevUserMessage here shadows the outer prevUserMessage defined later in the function. Rename to avoid confusion.

-      const prevUserMessage = conversation[conversation.length - 1]
+      const lastConvUserMessage = conversation[conversation.length - 1]
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 629043c and a32f0a7.

📒 Files selected for processing (1)
  • server/api/chat/chat.ts (16 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/api/chat/chat.ts (4)
server/ai/types.ts (3)
  • WebSearchSource (95-99)
  • GroundingSupport (101-108)
  • QueryRouterLLMResponse (212-212)
server/api/chat/types.ts (2)
  • Citation (76-76)
  • ragPipelineConfig (35-63)
server/api/chat/utils.ts (1)
  • findOptimalCitationInsertionPoint (1001-1045)
server/ai/provider/index.ts (2)
  • webSearchQuestion (1797-1841)
  • generateSearchQueryOrAnswerFromConversation (1459-1499)
🔇 Additional comments (2)
server/api/chat/chat.ts (2)

229-229: Use local GroundingSupport in chat.ts
Line 229 currently imports GroundingSupport from “@google/genai” but the project defines its own interface at server/ai/types.ts:101. Confirm the external and local definitions match exactly—or replace the external import with the local type to avoid runtime typing mismatches.


4593-4611: enableWebSearch is plumbed correctly. The server’s messageSchema includes an enableWebSearch string-to-boolean transform and MessageReqType exposes it, and the frontend always appends enableWebSearch to the request URL.
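Since the flag travels as a URL query parameter, it reaches the server as the string "true"/"false", never a boolean. A sketch of the coercion the schema's transform performs (a plain function here for illustration; the real schema does this with zod):

```typescript
// Query-string flags are strings, never booleans; coerce explicitly so
// the string "false" does not accidentally become truthy.
function coerceEnableWebSearch(param: string | undefined): boolean {
  return param?.toLowerCase() === "true"
}
```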

@zereraz zereraz merged commit 67c33b5 into main Sep 2, 2025
4 checks passed
@zereraz zereraz deleted the feat/web_search branch September 2, 2025 08:22
3 participants