To store the same file content under different collection names without creating duplicates inside any one collection, you should leverage the `namespace` parameter of `SQLRecordManager` and give each collection its own namespace. Here's how you can adjust your implementation:
For example:

    from langchain.indexes import SQLRecordManager

    # Assuming `engine` is your SQLAlchemy engine
    record_manager_employee1 = SQLRecordManager(namespace="employee1", engine=engine)
    record_manager_employee2 = SQLRecordManager(namespace="employee2", engine=engine)
    # Create the record table if it does not exist yet (both managers share it)
    record_manager_employee1.create_schema()
    # Upsert the same key into different collections (namespaces)
    key = "same_file_key"
    record_manager_employee1.update([key])
    record_manager_employee2.update([key])
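To see why separate namespaces help, here is a minimal toy model of namespace-scoped deduplication. It is a sketch using only Python's built-in `sqlite3` and is not LangChain's actual implementation. The `records` table and the `make_store`, `count`, and `upsert` helpers are hypothetical names made up for this illustration. The only assumption is that a key is unique per namespace rather than globally.

```python
import sqlite3


def make_store():
    """Create an in-memory table where a key is unique only within its namespace."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        "namespace TEXT NOT NULL, "
        "key TEXT NOT NULL, "
        "UNIQUE (namespace, key))"
    )
    return conn


def count(conn, namespace):
    """Return how many keys are stored under the given namespace."""
    row = conn.execute(
        "SELECT COUNT(*) FROM records WHERE namespace = ?", (namespace,)
    ).fetchone()
    return row[0]


def upsert(conn, namespace, keys):
    """Insert keys for one namespace and return how many were actually new."""
    before = count(conn, namespace)
    conn.executemany(
        "INSERT OR IGNORE INTO records (namespace, key) VALUES (?, ?)",
        [(namespace, k) for k in keys],
    )
    return count(conn, namespace) - before


conn = make_store()
print(upsert(conn, "employee1", ["same_file_key"]))  # first insert for employee1
print(upsert(conn, "employee1", ["same_file_key"]))  # duplicate within employee1
print(upsert(conn, "employee2", ["same_file_key"]))  # same key, other namespace
```

In this toy model the second call adds nothing because the key already exists under `employee1`, while the third call succeeds because `employee2` is a different namespace. That is the behaviour the question asks for: no duplicates per collection, but the same file allowed across collections.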
Description
To avoid duplicate entries in the PGVector database, I have started using `SQLRecordManager` from LangChain's indexing module. My goal is to avoid storing the same file content multiple times under the same collection name, while still allowing the same file (source) and its content to exist under different collection names.
In my use case, the collection name represents an employee's name, so I create one collection per employee. Employees will be using the same S3 bucket and loading identical files to test different LLM responses in a RAG approach.
However, with the current implementation, entries are being deleted from the table if the file name (source in the cmetadata column) is the same. I want to keep the same files if the collection names are different.
Is it possible to achieve this? If so, what changes do I need to make?
I would like the table to look like this (screenshot not included):