
Conversation

Himanshvarma
Contributor

@Himanshvarma Himanshvarma commented Sep 19, 2025

Description

Testing

Additional Notes

Summary by CodeRabbit

  • New Features
    • Responses are now date-aware, more structured, and include clearer citations. If no relevant context is found, the assistant will explicitly return no answer.
    • File uploads no longer extract images by default, making processing faster; image extraction can be enabled when needed.

@coderabbitai
Contributor

coderabbitai bot commented Sep 19, 2025

Walkthrough

Introduces structured prompt changes in server/ai/agentPrompts.ts with dynamic date context, renamed sections, explicit JSON response schema, and null-answer fallback logic. Updates server/services/fileProcessor.ts to change processFile’s extractImages default from true to false without altering control flow.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| AI prompt restructuring: server/ai/agentPrompts.ts | Adds dynamic date context; broadens data model from single file to multiple data types; renames sections (Chunk Context, Retrieved Context); enforces JSON response schema; mandates null-answer fallback; updates notes and error handling text. |
| File processing defaults: server/services/fileProcessor.ts | Changes FileProcessorService.processFile signature default: extractImages true → false; no other logic changes; downstream behavior now skips image extraction unless explicitly enabled. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant FPS as FileProcessorService
  participant IE as ImageExtractor
  participant IS as ImageSummarizer

  Client->>FPS: processFile(buffer, mimeType, fileName, vespaDocId, storagePath?, extractImages=?, describeImages=?)
  note right of FPS: Default extractImages = false (changed)

  alt extractImages == true
    FPS->>IE: extract(buffer, mimeType)
    IE-->>FPS: images[]
    alt describeImages == true
      FPS->>IS: describe(images[])
      IS-->>FPS: descriptions[]
    else describeImages == false
      note over FPS: Skip image descriptions
    end
  else extractImages == false
    note over FPS: Skip image extraction
  end

  FPS-->>Client: ProcessingResult
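
The branching in the diagram can be sketched in TypeScript roughly as follows. This is a simplified illustration, not the code in server/services/fileProcessor.ts: the real method takes more parameters (mimeType, fileName, vespaDocId, storagePath) and is likely asynchronous. `processFileSketch`, `extractImagesStub`, `describeImagesStub`, and the shape of `ProcessingResult` are hypothetical names invented for this sketch, and the `describeImages` default is an assumption because it is not shown in this thread.

```typescript
// Hypothetical stand-in types; the real ProcessingResult shape is not shown in this PR.
type ExtractedImage = { index: number; bytes: Uint8Array };

interface ProcessingResult {
  images: ExtractedImage[];
  descriptions: string[];
}

// Stub extractor: pretends every file contains exactly one image.
function extractImagesStub(buffer: Uint8Array): ExtractedImage[] {
  return [{ index: 0, bytes: buffer.slice(0, 4) }];
}

// Stub summarizer: produces one text description per image.
function describeImagesStub(images: ExtractedImage[]): string[] {
  return images.map((img) => `image ${img.index} (${img.bytes.length} bytes)`);
}

function processFileSketch(
  buffer: Uint8Array,
  extractImages: boolean = false, // new default introduced by this PR
  describeImages: boolean = false, // assumption: the real default is not shown here
): ProcessingResult {
  const result: ProcessingResult = { images: [], descriptions: [] };

  // With the new default, callers that omit the flag skip extraction entirely.
  if (!extractImages) return result;

  result.images = extractImagesStub(buffer);

  // Descriptions are only produced when extraction ran and the caller asked for them.
  if (describeImages) result.descriptions = describeImagesStub(result.images);

  return result;
}
```

Callers that relied on the old default would need to pass `true` explicitly, for example `processFileSketch(buffer, true)`, to keep receiving images.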

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • junaid-shirur
  • zereraz
  • shivamashtikar
  • kalpadhwaryu

Poem

Thump-thump, I tweak with nimble paws,
Prompts now speak in JSON laws.
Dates hop in, citations neat—
If context’s bare, we won’t compete.
Images nap unless you ask;
I munch on docs—a tasty task.
Hop approved? Carrot, please. 🥕🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "fix: document chat prompt is fixed" directly references the prompt-related changes in server/ai/agentPrompts.ts and therefore reflects the primary change in the diff; however the phrasing is redundant ("fix" and "is fixed") and it does not mention the secondary change to the fileProcessor image-extraction default. Overall the title is related and specific enough to identify the main intent of the PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

@gemini-code-assist
Contributor

Summary of Changes

Hello @Himanshvarma, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on improving the reliability and clarity of the AI agent's document chat functionality by refining its underlying prompt. It provides the AI with more structured guidance on how to interpret context, format responses, and manage cases where information is not found. Additionally, a minor adjustment was made to the default image extraction setting in the file processing service to optimize its behavior.

Highlights

  • AI Agent Prompt Enhancement: The core AI agent prompt for document chat has been significantly updated to provide more precise instructions for response generation, context interpretation, and citation rules. This includes adding the current date, a detailed file context format, and explicit guidelines for handling sensitive questions and maintaining a professional tone.
  • Explicit JSON Response Structure: The prompt now clearly defines the expected JSON response format, including a specific structure for when no relevant information is found, ensuring consistent output from the AI agent.
  • Refined Error Handling Instructions: Instructions for the AI agent on how to handle missing or unclear information, or queries lacking context, have been refined to ensure the 'answer' field is set to null without additional explanation.
  • File Processing Default Change: The default behavior for image extraction during file processing has been changed from enabled to disabled, impacting how files are handled by the FileProcessorService.
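
On the consuming side, the null-answer contract described above might be handled along these lines. This is a minimal sketch, not code from this PR: it assumes the response is a JSON object with an `answer` field that is either a string or null, which is the only field this thread names. `AgentAnswer` and `parseAgentAnswer` are hypothetical, and treating malformed JSON as "no answer" is a design choice of the sketch.

```typescript
// Assumed minimal response shape: only the `answer` field is named in this thread.
interface AgentAnswer {
  answer: string | null;
}

// Parses the raw model output and normalizes every "no usable answer" case to null.
function parseAgentAnswer(raw: string): AgentAnswer {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    // Sketch-level choice: malformed JSON is treated the same as "no answer".
    return { answer: null };
  }

  if (typeof parsed !== "object" || parsed === null) {
    return { answer: null };
  }

  const answer = (parsed as Record<string, unknown>)["answer"];

  // Empty or non-string answers are also collapsed to null.
  if (typeof answer !== "string" || answer.trim() === "") {
    return { answer: null };
  }
  return { answer };
}
```

A caller can then branch on `parseAgentAnswer(raw).answer === null` to show a "no relevant context found" state instead of rendering a speculative reply.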

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the agentBaselineFileContextPromptJson prompt to be more detailed and structured, which is a good improvement. However, it also changes the default behavior of file processing to disable image extraction, which could have unintended side effects. My review includes a high-severity comment on this change and a medium-severity comment to improve the clarity of the updated prompt by removing repetitive instructions.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (2)
server/services/fileProcessor.ts (1)

29-31: Alternative: keep backward-compat default, allow opt-out via env

Safer default preserves behavior while enabling a controlled switch.

-    extractImages: boolean = false,
+    extractImages: boolean = process.env.FILEPROC_EXTRACT_IMAGES_DEFAULT === "false" ? false : true,

Document FILEPROC_EXTRACT_IMAGES_DEFAULT and plan a staged flip.
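
One way to keep the env lookup out of the signature is a small helper. The sketch below assumes the same semantics as the suggested diff: image extraction stays enabled unless the variable is exactly the string "false". `extractImagesDefault` and the `Env` type are hypothetical names, and the environment is passed in as a plain object so the function can be exercised without Node typings.

```typescript
// A plain string map standing in for process.env.
type Env = Record<string, string | undefined>;

// Returns the default for extractImages: true unless the variable is exactly "false".
// The comparison is case-sensitive, so "FALSE" or "0" still leave extraction enabled.
function extractImagesDefault(env: Env): boolean {
  return env["FILEPROC_EXTRACT_IMAGES_DEFAULT"] !== "false";
}

// Possible use in a signature (the default expression is evaluated on each call
// in which the argument is omitted):
//   extractImages: boolean = extractImagesDefault(process.env),
```

Because only the literal "false" opts out, an unset or mistyped variable falls back to the backward-compatible behavior, which fits the staged-flip plan.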

server/ai/agentPrompts.ts (1)

859-861: Minor: duplicated “current date” phrasing

getDateForAI() already includes “Current Date : …”. Prepending “The current date is:” yields clunky duplication.

-) => `The current date is: ${getDateForAI()}. Based on this information, make your answers. Don't try to give vague answers without
+) => `${getDateForAI()}. Based on this information, make your answers. Don't try to give vague answers without
 any logic. Be formal as much as possible.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce7c8e3 and 11a3062.

📒 Files selected for processing (2)
  • server/ai/agentPrompts.ts (2 hunks)
  • server/services/fileProcessor.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/ai/agentPrompts.ts (2)
server/utils/index.ts (1)
  • getDateForAI (3-24)
server/ai/context.ts (1)
  • userContext (832-846)
🔇 Additional comments (3)
server/ai/agentPrompts.ts (3)

862-875: LGTM: clearer single-file context and richer file metadata

The SINGLE-file emphasis, added ID/mime/size/timestamps, and explicit chunk-index rules improve determinism for citations.


889-905: LGTM: strict chunk-level citation rules

The 1–2 citations per sentence, no grouped indices, and per-chunk grounding help reduce hallucinations.
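
Rules like these could also be spot-checked on the output. The sketch below is illustrative only and rests on several assumptions that this thread does not confirm: citations are written as bracketed chunk indices such as `[3]`, a "grouped" citation looks like `[1,2]` or `[1-2]`, and a sentence with no citation counts as a violation. `checkSentence` and `CitationReport` are hypothetical names.

```typescript
interface CitationReport {
  count: number; // number of single-index citations such as [3]
  grouped: boolean; // true if a grouped form such as [1,2] or [1-2] appears
  ok: boolean; // true when the sentence has 1 or 2 citations and no grouped form
}

// Matches one bracketed index, e.g. [3]. Assumed citation format.
const SINGLE_CITATION = /\[\d+\]/g;

// Matches several indices inside one bracket pair, e.g. [1,2] or [1 - 2].
const GROUPED_CITATION = /\[\s*\d+(?:\s*[,-]\s*\d+)+\s*\]/;

// Checks a single, already-split sentence against the assumed rules.
function checkSentence(sentence: string): CitationReport {
  const count = (sentence.match(SINGLE_CITATION) ?? []).length;
  const grouped = GROUPED_CITATION.test(sentence);
  return { count, grouped, ok: !grouped && count >= 1 && count <= 2 };
}
```

Sentence splitting is left to the caller on purpose, since splitting on punctuation is itself error-prone for text containing abbreviations or decimals.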


917-921: LGTM: explicit null-answer fallback

Clear null contract when context doesn’t match will simplify handlers and avoid speculative answers.

@shivamashtikar shivamashtikar merged commit ab41964 into main Sep 19, 2025
4 checks passed
Himanshvarma added a commit that referenced this pull request Sep 19, 2025