storing sensitive information on Vector DB #21873
vishyarjun started this conversation in General
Replies: 1 comment
-
Hi @vishyarjun, here is something very cool about data anonymization using a local model: https://python.langchain.com/v0.1/docs/guides/productionization/safety/presidio_data_anonymization/ I think it could help you.
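As a rough illustration of the idea behind that guide (replace sensitive values with placeholders before the text is embedded, and keep the mapping outside the vector store), here is a minimal pure-Python sketch. It does not use Presidio or LangChain: the `pseudonymize` and `deanonymize` helpers, the placeholder format, and the email-only regex are all assumptions made for illustration, and a real deployment would rely on a proper PII detector.

```python
import re

# Illustrative only: matches simple email addresses, not PII in general.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(text, mapping):
    """Replace each email in `text` with a stable placeholder.

    `mapping` (original value -> placeholder) is updated in place and should
    be kept outside the vector store, for example next to the RDBMS.
    """
    def _swap(match):
        value = match.group(0)
        if value not in mapping:
            mapping[value] = f"<EMAIL_{len(mapping) + 1}>"
        return mapping[value]

    return EMAIL_RE.sub(_swap, text)


def deanonymize(text, mapping):
    """Restore the original values in text that comes back from a search."""
    for value, placeholder in mapping.items():
        text = text.replace(placeholder, value)
    return text


original = "Contact jane@example.com or bob@example.org"
mapping = {}
safe = pseudonymize(original, mapping)  # this is what would be embedded
restored = deanonymize(safe, mapping)   # done only in trusted code
```

Only `safe` would reach the vector database; the mapping needed to reverse it never leaves trusted storage.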
-
Hi All,
Our customers recently asked us not to store customer data in vector databases, as it is already stored in an RDBMS.
As of now, we use a vector database to store the distinct values of high-cardinality columns, which improves search. We use a retriever tool like the one below:
retriever_tool = create_retriever_tool(
    retriever,
    name="search_proper_nouns",
    description=description,
)
If we stop storing the proper nouns, we might lose accuracy. What should we do in this situation?
Another challenge is that this data keeps growing. Do we have to embed data on a schedule to pick up the new records that are added every day?
Cheers,
Arjun.
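On the question of keeping embeddings current as the data grows: one common pattern is to embed only the values that are new since the last run, rather than rebuilding the whole index. The sketch below is only an illustration under stated assumptions: the in-memory `store` dict and the `embed` function are hypothetical stand-ins for a vector database client and an embedding model, and the job would be triggered by whatever scheduler is already in use.

```python
import hashlib


def embed(text):
    """Hypothetical stand-in for a real embedding model: returns a tiny fake vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:4]]


def content_key(value):
    """Stable ID derived from the value, so re-running the job never duplicates entries."""
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()


def sync_new_values(current_values, store):
    """Embed only the values missing from `store`; return how many were added."""
    added = 0
    for value in current_values:
        key = content_key(value)
        if key not in store:
            store[key] = {"text": value, "vector": embed(value)}
            added += 1
    return added


store = {}  # stand-in for the vector database
first_run = sync_new_values(["Acme Corp", "Globex"], store)
second_run = sync_new_values(["Acme Corp", "Globex", "Initech"], store)
```

Because the key is derived from the content, the second run embeds only the one value that was not seen before. Deletions and updates in the source table are not handled here and would need their own reconciliation step.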