feat: Generate and add image captions to search index when image is ingested. #928

Merged
merged 8 commits into from
May 16, 2024

Conversation

superhindupur
Contributor

@superhindupur superhindupur commented May 16, 2024

Closes #749

Purpose

This PR adds the following functionality when an image is uploaded to the knowledge base (when advanced image processing is enabled):

  • generates a caption for the image by calling the OpenAI service
  • generates the vector embeddings for the caption
  • includes the caption text and embeddings in the document uploaded to the search index

This makes uploaded images searchable through the text and vector embeddings of their captions.
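The three steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ImageDocument`, `build_image_document`, and the two stand-in functions are hypothetical names, and the real implementation calls the OpenAI service where this sketch uses local placeholders. Only the `content` and `content_vector` field names come from the PR description.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ImageDocument:
    """Hypothetical shape of the document uploaded to the search index."""
    source_url: str
    content: str                 # caption text generated for the image
    content_vector: List[float]  # vector embedding of the caption


def build_image_document(
    image_url: str,
    generate_caption: Callable[[str], str],
    embed_text: Callable[[str], List[float]],
) -> ImageDocument:
    """Caption the image, embed the caption, and assemble the index document."""
    caption = generate_caption(image_url)  # step 1: caption the image
    vector = embed_text(caption)           # step 2: embed the caption
    return ImageDocument(                  # step 3: document for the index
        source_url=image_url,
        content=caption,
        content_vector=vector,
    )


# Local stand-ins (assumptions) so the sketch runs without any service calls.
def fake_caption(image_url: str) -> str:
    return f"A placeholder caption for {image_url}"


def fake_embed(text: str) -> List[float]:
    # Not a real embedding: just two numbers derived from the text.
    return [float(len(text)), float(text.count(" "))]


doc = build_image_document("https://example.com/cat.png", fake_caption, fake_embed)
print(doc.content)
print(doc.content_vector)
```

Passing the caption and embedding functions in as arguments keeps the sketch testable without credentials. Whether the actual code is structured this way is not stated in the PR.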

Does this introduce a breaking change?

  • Yes
  • No

How to Test

Deploy Chat with your data with Advanced Image Processing enabled.
Upload an image via the admin site.
Then verify in the search index that the content and content_vector fields of the image's document are populated with the generated caption and its vector embeddings.
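The final verification step could be automated with a small check like the one below. It is only a sketch: `find_caption_problems` is a hypothetical helper, and it assumes the document has already been retrieved from the search index as a plain dictionary (how it is retrieved is left out). The field names `content` and `content_vector` are the ones named in the test steps.

```python
from typing import Any, List, Mapping


def find_caption_problems(document: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the check passed."""
    problems: List[str] = []

    # The caption should be a non-empty string.
    content = document.get("content")
    if not isinstance(content, str) or not content.strip():
        problems.append("content is missing or empty")

    # The embedding should be a non-empty sequence of numbers.
    vector = document.get("content_vector")
    if not isinstance(vector, (list, tuple)) or len(vector) == 0:
        problems.append("content_vector is missing or empty")
    elif not all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector
    ):
        problems.append("content_vector contains non-numeric values")

    return problems


# Example documents (made up for illustration).
good_doc = {"content": "A cat sitting on a sofa", "content_vector": [0.12, -0.03, 0.5]}
bad_doc = {"content": "", "content_vector": []}

print(find_caption_problems(good_doc))
print(find_caption_problems(bad_doc))
```

The check confirms only that the fields are populated. Judging whether the caption actually describes the image still needs a manual look.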


github-actions bot commented May 16, 2024

Coverage

Coverage Report

File                                            Stmts  Miss  Cover  Missing
code/backend/batch/utilities/helpers
   env_helper.py                                  134    10    92%  222, 227–228, 231–233, 245, 249–251
   llm_helper.py                                   42    11    73%  40–41, 50, 61–62, 73, 86–87, 94, 113, 121
code/backend/batch/utilities/helpers/embedders
   push_embedder.py                                74     0   100%
TOTAL                                            2434   684    71%

Tests  Skipped  Failures  Errors  Time
202    0 💤     0 ❌      0 🔥    11.817s ⏱️

@cecheta cecheta changed the title Generate and add image captions to search index when image is ingested. feat: Generate and add image captions to search index when image is ingested. May 16, 2024
@cecheta cecheta force-pushed the feat/748/image-captions branch 2 times, most recently from 6125987 to 3c5ff38 Compare May 16, 2024 15:06
@cecheta cecheta force-pushed the feat/748/image-captions branch from 3c5ff38 to 27b35d2 Compare May 16, 2024 15:07
@cecheta cecheta requested a review from adamdougal May 16, 2024 15:09
adamdougal
adamdougal previously approved these changes May 16, 2024
Collaborator

@adamdougal adamdougal left a comment

lgtm

ross-p-smith
ross-p-smith previously approved these changes May 16, 2024
@cecheta cecheta dismissed stale reviews from ross-p-smith and adamdougal via 3af5227 May 16, 2024 15:31
@cecheta cecheta added this pull request to the merge queue May 16, 2024
Merged via the queue into main with commit b8e34aa May 16, 2024
12 checks passed
@cecheta cecheta deleted the feat/748/image-captions branch May 16, 2024 15:39
Development

Successfully merging this pull request may close these issues.

Generate and store embeddings for descriptions of images in search index
4 participants