-
Notifications
You must be signed in to change notification settings - Fork 56
Fix/follow up context #1082
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/follow up context #1082
Conversation
Summary of ChangesHello @Himanshvarma, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors and enhances the system's ability to handle follow-up queries, especially when they involve attached files. It introduces a standardized way to process message history, ensuring that even failed user queries contribute to a consistent conversation context. Furthermore, it improves the reliability of temporary file cleanup operations within the DuckDB integration, contributing to overall system stability. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
Caution Review failedThe pull request is closed. Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThreads an Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant U as User
participant API as Chat API
participant T as transformMessagesWithErrorHandling
participant C as collectFollowupContext
participant CL as Classifier
participant UAC as UnderstandMessageAndAnswerForGivenContext
participant UA as UnderstandMessageAndAnswer
U->>API: send message + history
API->>T: transformMessagesWithErrorHandling(messages)
T-->>API: transformedMessages
API->>C: collectFollowupContext(transformedMessages)
C-->>API: workingSet {fileIds, imageAttachmentFileIds, ...}
alt isFollowUp && file-backed context
API->>UAC: call with fileIds/imageAttachmentFileIds (isFollowUp=true)
UAC-->>API: file-context answer
else isFollowUp without file context
API->>CL: classify(query, include isFollowUp)
CL-->>API: classification
API->>UA: call UnderstandMessageAndAnswer(using classification)
UA-->>API: answer
end
API-->>U: response (logs include isFollowUp and carried file/attachment IDs)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the follow-up context gathering logic, centralizing it into new utility functions and applying it to both the standard chat and agent-based chat APIs. A new helper transformMessagesWithErrorHandling
is introduced to normalize conversation history by injecting synthetic error responses. While the refactoring is a good step towards reducing code duplication and improving clarity, I've identified a few critical issues. There's a regression where file context from the current message is being overwritten instead of merged during follow-up queries. Additionally, a new utility function contains a logical bug in its condition that could lead to incorrect conversation history processing. I've also pointed out a minor maintainability issue with a brittle string check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/api/chat/utils.ts (1)
98-145
: Chain-break boundary applied after processing; stop before including boundary messageYou currently process the chain-break message’s attachments/sources and then break, which can leak previous-topic context. Break before processing the boundary index.
for (let i = startIdx; i >= 0 && hops < maxHops; i--, hops++) { const m = messages[i]; + // Stop if we hit a chain break (previous conversation topic) + if (chainBreakIndices.has(i)) break // 1) attachments the user explicitly added if (Array.isArray(m.attachments)) { ... } ... - // Stop if we hit a chain break (previous conversation topic) - if (chainBreakIndices.has(i)) break }
🧹 Nitpick comments (3)
server/lib/duckdb.ts (1)
240-249
: Simplify: Remove redundant ENOENT check.The
ENOENT
check in the catch block is redundant sinceunlinkWithRetry
already handlesENOENT
by returning early without throwing (line 22).Apply this diff to simplify the error handling:
try { await unlinkWithRetry(dbPath); Logger.debug(`Temporary DuckDB file deleted: ${dbPath}`); } catch (e: any) { - if (e?.code === 'ENOENT') { - Logger.debug(`Temporary DuckDB file already removed: ${dbPath}`); - } else { - Logger.warn(`Failed to delete temporary DuckDB file ${dbPath}:`, e); - } + Logger.warn(`Failed to delete temporary DuckDB file ${dbPath}:`, e); }server/api/chat/agents.ts (1)
4368-4376
: Clarify boolean precedence for file/image context checkAdd parentheses for readability (no behavior change).
- if (fileIds && fileIds.length > 0 || imageAttachmentFileIds && imageAttachmentFileIds.length > 0) { + if ((fileIds && fileIds.length > 0) || (imageAttachmentFileIds && imageAttachmentFileIds.length > 0)) {server/api/chat/chat.ts (1)
5187-5205
: Tighten condition and remove unnecessary casts
- Add parentheses for clarity.
- No need to cast fileIds/imageAttachmentFileIds to string[].
- if (fileIds && fileIds.length > 0 || imageAttachmentFileIds && imageAttachmentFileIds.length > 0) { + if ((fileIds && fileIds.length > 0) || (imageAttachmentFileIds && imageAttachmentFileIds.length > 0)) { loggerWithChild({ email: email }).info( `Follow-up query with file context detected. Using file-based context with NEW classification: ${JSON.stringify(classification)}, FileIds: ${JSON.stringify([fileIds, imageAttachmentFileIds])}`, ) iterator = UnderstandMessageAndAnswerForGivenContext( email, ctx, userMetadata, message, 0.5, - fileIds as string[], + fileIds, userRequestsReasoning, understandSpan, undefined, - imageAttachmentFileIds as string[], + imageAttachmentFileIds, agentPromptValue, undefined, actualModelId || config.defaultBestModel, )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
server/api/chat/agents.ts
(4 hunks)server/api/chat/chat.ts
(5 hunks)server/api/chat/utils.ts
(5 hunks)server/lib/duckdb.ts
(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
server/api/chat/utils.ts (1)
server/db/schema/messages.ts (2)
messages
(31-78)SelectMessage
(88-88)
server/api/chat/chat.ts (3)
server/search/vespa.ts (1)
searchVespaInFiles
(117-117)server/api/chat/utils.ts (2)
transformMessagesWithErrorHandling
(1298-1343)collectFollowupContext
(80-152)server/db/schema/messages.ts (1)
messages
(31-78)
server/api/chat/agents.ts (4)
server/ai/types.ts (3)
TemporalClassifier
(279-282)QueryRouterResponse
(461-461)ConverseResponse
(311-323)server/api/chat/types.ts (1)
ImageCitation
(80-86)server/api/chat/utils.ts (2)
transformMessagesWithErrorHandling
(1298-1343)collectFollowupContext
(80-152)server/api/chat/chat.ts (2)
UnderstandMessageAndAnswerForGivenContext
(3881-3932)UnderstandMessageAndAnswer
(3744-3879)
🪛 GitHub Actions: TypeScript Build Check
server/api/chat/chat.ts
[error] 1990-1990: bunx tsc -b failed: Property 'attachmentRank' does not exist on type 'typeof SearchModes'.
🔇 Additional comments (5)
server/lib/duckdb.ts (2)
59-69
: LGTM! Robust stream completion pattern.The refactored flow properly waits for both the
end()
callback (which fires when 'finish' occurs) and the 'close' event, ensuring the stream is fully flushed. The error handler cleanup prevents potential memory leaks.
218-225
: LGTM! Defensive VIEW cleanup.Dropping the VIEW before closing the connection prevents potential resource leaks in DuckDB. The try-catch ensures cleanup continues even if the VIEW doesn't exist or can't be dropped.
server/api/chat/agents.ts (2)
4352-4356
: Good call: transform messages before follow-up carry-forwardUsing transformMessagesWithErrorHandling prior to collectFollowupContext prevents gaps when user errors lacked assistant replies. LGTM.
4295-4307
: Avoid excess property error on classification
- Extend
QueryRouterResponseSchema
withisFollowUp?: boolean
(preferred)- Or locally widen the type:
(TemporalClassifier & QueryRouterResponse) & { isFollowUp?: boolean }
server/api/chat/chat.ts (1)
5171-5185
: Follow-up carry-forward integration looks goodUsing the transformed history with collectFollowupContext and updating fileIds/imageAttachmentFileIds is sound.
Please confirm collectFollowupContext is intended to treat only image attachments as attachmentFileIds (utils.ts Lines 79–151 in provided snippet).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/api/chat/chat.ts (1)
3402-3416
: Build error: remove unsupported 'selectedItem' from getItems params (or extend type)selectedItem is not part of GetItemsParams; tsc fails.
- searchResults = await getItems({ + searchResults = await getItems({ email, schema, app: agentApps ?? null, entity: entities ?? null, timestampRange, limit: userSpecifiedCountLimit + (classification.filters.offset || 0), offset: classification.filters.offset || 0, asc: sortDirection === "asc", intent: resolvedIntent || {}, channelIds, - selectedItem: selectedItem, collectionSelections: agentSpecificCollectionSelections, })- const getItemsParams = { + const getItemsParams = { email, schema, app: apps ?? null, entity: entities ?? null, timestampRange, limit: userSpecifiedCountLimit + (classification.filters.offset || 0), offset: classification.filters.offset || 0, asc: sortDirection === "asc", intent: resolvedIntent || {}, collectionSelections: agentSpecificCollectionSelections, - selectedItem: selectedItem, }If selectedItem is required, extend GetItemsParams in @xyne/vespa-ts/types accordingly.
Also applies to: 3425-3441
♻️ Duplicate comments (4)
server/api/chat/utils.ts (1)
131-140
: Guard against missing/invalid source.docId before pushingm.sources entries can lack a valid docId, causing "f:undefined" in seen and pushing undefined into ws.fileIds. Add a per‑item guard.
- if (Array.isArray(m.sources) && m.sources.length > 0 && ws.fileIds.length < MAX_FILES) { - for (const source of m.sources) { - if (!seen.has(`f:${source.docId}`)) { - ws.fileIds.push(source.docId); - ws.carriedFromMessageIds.push(m.externalId); - seen.add(`f:${source.docId}`); - if (ws.fileIds.length >= MAX_FILES) break; - } - } - } + if (Array.isArray(m.sources) && m.sources.length > 0 && ws.fileIds.length < MAX_FILES) { + for (const source of m.sources as any[]) { + const docId = (source as any)?.docId + if (typeof docId === "string" && docId && !seen.has(`f:${docId}`)) { + ws.fileIds.push(docId) + ws.carriedFromMessageIds.push(m.externalId) + seen.add(`f:${docId}`) + if (ws.fileIds.length >= MAX_FILES) break + } + } + }server/api/chat/agents.ts (1)
4351-4372
: Don’t overwrite existing file context—merge follow-up contextOverwriting
fileIds
andimageAttachmentFileIds
drops attachments from the current message and extracted IDs. Replace with a union of existing and carried IDs:- if (hasCarriedContext) { - fileIds = workingSet.fileIds; - imageAttachmentFileIds = workingSet.attachmentFileIds; + if (hasCarriedContext) { + fileIds = Array.from(new Set([...(fileIds || []), ...workingSet.fileIds])); + imageAttachmentFileIds = Array.from( + new Set([...(imageAttachmentFileIds || []), ...workingSet.attachmentFileIds]) + ); loggerWithChild({ email }).info( `Carried forward context from follow-up: ${JSON.stringify(workingSet)}`, ); }server/api/chat/chat.ts (2)
1975-2006
: Fix attachment/collection bucketing, remove brittle prefix checks, avoid shadowing, and replace invalid rankProfile
- att_ must be treated as attachments (legacy) not collection files.
- Non-collection must exclude both clf- and attachment prefixes.
- Local const attachmentFileIds shadows function param; rename.
- SearchModes.attachmentRank is invalid (build breaks).
Apply:
- const collectionFileIds = fileIds.filter((fid) => fid.startsWith("clf-") || fid.startsWith("att_")) - const nonCollectionFileIds = fileIds.filter((fid) => !fid.startsWith("clf-") && !fid.startsWith("att")) - const attachmentFileIds = fileIds.filter((fid) => fid.startsWith("attf_")) + const isCollectionId = (fid: string) => fid.startsWith("clf-") + const isAttachmentId = (fid: string) => fid.startsWith("attf_") || fid.startsWith("att_") // new + legacy + const collectionFileIds = fileIds.filter(isCollectionId) + const nonCollectionFileIds = fileIds.filter( + (fid) => !isCollectionId(fid) && !isAttachmentId(fid) + ) + const attachmentDocIds = fileIds.filter(isAttachmentId) @@ - if(attachmentFileIds && attachmentFileIds.length > 0) { - results = await searchVespaInFiles(builtUserQuery, email, attachmentFileIds, { - limit: fileIds?.length, - alpha: userAlpha, - rankProfile: SearchModes.attachmentRank, - }) + if (attachmentDocIds.length > 0) { + results = await searchVespaInFiles(builtUserQuery, email, attachmentDocIds, { + limit: fileIds?.length, + alpha: userAlpha, + // Prefer dedicated attachment profile if available; else fallback. + rankProfile: (SearchModes as any).AttachmentRank ?? SearchModes.NativeRank, + }) if (results.root.children) { combinedSearchResponse.push(...results.root.children) } }To confirm the enum member:
#!/bin/bash ast-grep --pattern $'enum $_ { $$$ }' | sed -n '1,120p' >/dev/null 2>&1 rg -nP 'SearchModes\.\w+' -C2 rg -nP '\bAttachmentRank\b' -C2
5217-5255
: Do not overwrite carried context; merge with current message contextReplacing fileIds/imageAttachmentFileIds drops context from the current message. Merge and dedupe.
- const workingSet = collectFollowupContext(filteredMessages); + const workingSet = collectFollowupContext(filteredMessages); @@ - if (hasCarriedContext) { - fileIds = workingSet.fileIds; - imageAttachmentFileIds = workingSet.attachmentFileIds; + if (hasCarriedContext) { + fileIds = Array.from(new Set([...(fileIds || []), ...workingSet.fileIds])); + imageAttachmentFileIds = Array.from( + new Set([...(imageAttachmentFileIds || []), ...workingSet.attachmentFileIds]) + ); loggerWithChild({ email: email }).info( `Carried forward context from follow-up: ${JSON.stringify(workingSet)}`, ); }
🧹 Nitpick comments (1)
server/api/chat/agents.ts (1)
4374-4393
: Add parentheses for clarity in mixed boolean conditionAvoid precedence confusion in (a && b || c && d).
- if (fileIds && fileIds.length > 0 || imageAttachmentFileIds && imageAttachmentFileIds.length > 0) { + if ((fileIds && fileIds.length > 0) || (imageAttachmentFileIds && imageAttachmentFileIds.length > 0)) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
server/api/chat/agents.ts
(3 hunks)server/api/chat/chat.ts
(4 hunks)server/api/chat/utils.ts
(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-16T11:56:22.752Z
Learnt from: oindrila-b
PR: xynehq/xyne#545
File: server/integrations/google/index.ts:137-145
Timestamp: 2025-06-16T11:56:22.752Z
Learning: In server/integrations/google/index.ts, both Logger (base logger) and loggerWithChild (factory for email-scoped child loggers) are intentionally maintained to handle different logging scenarios - one for when email context is not available and one for when it is available.
Applied to files:
server/api/chat/agents.ts
🧬 Code graph analysis (3)
server/api/chat/chat.ts (3)
server/search/vespa.ts (1)
searchVespaInFiles
(118-118)server/db/schema/messages.ts (1)
messages
(31-78)server/api/chat/utils.ts (1)
collectFollowupContext
(80-152)
server/api/chat/utils.ts (1)
server/db/schema/messages.ts (1)
messages
(31-78)
server/api/chat/agents.ts (4)
server/ai/types.ts (3)
TemporalClassifier
(279-282)QueryRouterResponse
(461-461)ConverseResponse
(311-323)server/api/chat/types.ts (1)
ImageCitation
(80-86)server/api/chat/utils.ts (1)
collectFollowupContext
(80-152)server/api/chat/chat.ts (2)
UnderstandMessageAndAnswerForGivenContext
(3919-3970)UnderstandMessageAndAnswer
(3782-3917)
🪛 GitHub Actions: TypeScript Build Check
server/api/chat/chat.ts
[error] 2001-2001: Property 'attachmentRank' does not exist on type 'typeof SearchModes'.
[error] 3412-3412: Object literal may only specify known properties, and 'selectedItem' does not exist in type 'Omit<GetItemsParams, "processedCollectionSelections"> & { collectionSelections?: { collectionIds?: string[] | undefined; collectionFolderIds?: string[] | undefined; collectionFileIds?: string[] | undefined; }[] | undefined; }'.
🔇 Additional comments (4)
server/api/chat/utils.ts (2)
143-145
: Confirm chain‑break boundary semanticsYou break after processing the chain‑break index. If the intent is to exclude the break message’s context, move the check before processing the loop body.
80-85
: Deriving start index internally looks goodStarting from the latest message by default is sensible for follow‑ups.
If any caller relied on a custom startIdx previously, confirm no regressions.
server/api/chat/agents.ts (1)
4139-4147
: Initialize parsed.isFollowUp default; verify upstream JSON shapeDefaulting isFollowUp to false is fine. Ensure jsonParseLLMOutput emits isFollowUp when applicable so classification remains accurate.
server/api/chat/chat.ts (1)
4733-4742
: Avoid history truncation bug with synthetic messages
Slicing bymessages.length - 1
can drop the wrong entries when synthetic messages are inserted. Instead, apply an error-handling transform and cut off at the last user message’sexternalId
:const transformed = transformMessagesWithErrorHandling(messages); const lastUser = messages[messages.length - 1]; const cutoff = transformed.findIndex(m => m.externalId === lastUser.externalId); const filteredMessages = (cutoff >= 0 ? transformed.slice(0, cutoff) : transformed.slice(0, transformed.length - 1) ) .filter(msg => !msg?.errorMessage) .filter(msg => !(msg.messageRole === MessageRole.Assistant && !msg.message));Ensure
transformMessagesWithErrorHandling
is defined and imported from the correct module.
corrected searchmode
Description
added follow up query context gathering for attached files
Summary by CodeRabbit