Replies: 4 comments 2 replies
-
🤖 Hey @aiwalter! Great to see you back with another intriguing question. Hope you're doing awesome! 🚀

Based on the current state of the LlamaIndex codebase, it does not inherently support user-specific indexing and searching. However, you can achieve this functionality by extending the existing classes and adding a user ID field to the index. This user ID field can then be used to filter search results to only include documents added by the same user.

Here's a high-level overview of how you can achieve this:
```python
# Import paths shown for illustration; they may differ depending on your
# llama-index and azure-search-documents versions.
from typing import Any, Dict, Optional

from azure.search.documents.indexes.models import SimpleField
from llama_index.schema import BaseNode
from llama_index.vector_stores import AzureAISearchVectorStore


class UserNode(BaseNode):
    def __init__(self, node_id: str, user_id: str, **kwargs: Any):  # include other parameters as needed
        super().__init__(node_id, **kwargs)
        self.user_id = user_id


class UserAzureAISearchVectorStore(AzureAISearchVectorStore):
    def __init__(self, user_id_field_key: str, **kwargs: Any):  # include other parameters as needed
        super().__init__(**kwargs)
        self._field_mapping["user_id"] = user_id_field_key

    def _create_index(self, index_name: Optional[str]) -> None:
        super()._create_index(index_name)
        # Register user_id as a filterable field in the Azure AI Search index schema
        self._index.fields.append(
            SimpleField(name=self._field_mapping["user_id"], type="Edm.String", filterable=True)
        )

    def _create_index_document(self, node: UserNode) -> Dict[str, Any]:
        doc = super()._create_index_document(node)
        doc["user_id"] = node.user_id
        return doc


# Add a node tagged with a user ID, then search with a user_id filter
# (`store` is an instance of UserAzureAISearchVectorStore)
node = UserNode(node_id="1", user_id="user1")  # include user_id (and other fields) when creating the node
store.add([node])

filter = f'{store._field_mapping["user_id"]} eq \'user1\''  # replace 'user1' with the actual user ID
results = store._search_client.search(search_text="*", filter=filter)
```

Please note that this is a high-level overview and you may need to adjust the code to fit your specific needs. Also, remember to replace the placeholder values (such as `'user1'` and the omitted constructor arguments) with your actual parameters. For more information on how to use Azure AI Search with LlamaIndex, you can refer to the `AzureAISearchVectorStore` class in the LlamaIndex repository.
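On the query side, the same `user_id` filter can be combined with a vector (or hybrid) query so that similarity search is still restricted to a single user's documents. This is only a sketch: it assumes `azure-search-documents` >= 11.4.0 (for `VectorizedQuery`), a vector field named `embedding` (the store's default), and a hypothetical `query_embedding` you have computed for the query text.

```python
from azure.search.documents.models import VectorizedQuery

# Same user filter as above, applied to a vector query instead of "*"
user_filter = f'{store._field_mapping["user_id"]} eq \'user1\''

vector_query = VectorizedQuery(
    vector=query_embedding,    # embedding of the query text (hypothetical)
    k_nearest_neighbors=5,     # top-k most similar chunks
    fields="embedding",        # name of the vector field in the index
)

results = store._search_client.search(
    search_text=None,          # pure vector search; pass a string instead for hybrid search
    vector_queries=[vector_query],
    filter=user_filter,        # only documents tagged with this user's ID are considered
)
for result in results:
    print(result["id"], result["chunk"][:80])
```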
-
🤖 Sure, you can achieve this functionality without creating new classes by modifying the `_create_index_document` and `add` methods of `AzureAISearchVectorStore` directly.

Firstly, you can add a `user_id` parameter to `_create_index_document` and write it into each index document. Here is the modified `_create_index_document` method:

```python
def _create_index_document(self, node: BaseNode, user_id: str) -> Dict[str, Any]:
    """Create AI Search index document from embedding result."""
    doc: Dict[str, Any] = {}
    doc["id"] = node.node_id
    doc["chunk"] = node.get_content(metadata_mode=MetadataMode.NONE) or ""
    doc["embedding"] = node.get_embedding()
    doc["doc_id"] = node.ref_doc_id
    doc["user_id"] = user_id  # Add user_id to the document

    node_metadata = node_to_metadata_dict(
        node,
        remove_text=True,
        flat_metadata=self.flat_metadata,
    )

    doc["metadata"] = json.dumps(node_metadata)
    return self._index_mapping(doc, node_metadata)
```

And here is the modified `add` method:

```python
def add(
    self,
    nodes: List[BaseNode],
    user_id: str,  # Add user_id as a parameter
    **add_kwargs: Any,
) -> List[str]:
    """Add nodes to index associated with the configured search client.

    Args:
        nodes: List[BaseNode]: nodes with embeddings
        user_id: str: User ID
    """
    if not self._search_client:
        raise ValueError("Search client not initialized")
    documents = []
    ids = []

    for node in nodes:
        logger.debug(f"Processing embedding: {node.node_id}")
        ids.append(node.node_id)

        index_document = self._create_index_document(node, user_id)  # Pass user_id to the method
        documents.append(index_document)

        if len(documents) >= 10:
            logger.info(
                f"Uploading batch of size {len(documents)}, "
                f"current progress {len(ids)} of {len(nodes)}"
            )
            self._search_client.merge_or_upload_documents(documents)
            documents = []

    # Upload remaining batch of less than 10 documents
    if len(documents) > 0:
        logger.info(
            f"Uploading remaining batch of size {len(documents)}, "
            f"current progress {len(ids)} of {len(nodes)}"
        )
        self._search_client.merge_or_upload_documents(documents)
        documents = []

    return ids
```

To search documents for a specific user, you can add a `search` method that filters on `user_id`:

```python
def search(self, user_id: str, search_text: str) -> List[Dict[str, Any]]:
    """Search documents for a specific user.

    Args:
        user_id: str: User ID
        search_text: str: Search text

    Returns:
        List of documents that match the search text and belong to the user.
    """
    filter = f'user_id eq \'{user_id}\''  # Filter documents based on user_id
    results = self._search_client.search(search_text=search_text, filter=filter)
    return [result for result in results]
```

Please note that you also need to add the `user_id` field to the Azure AI Search index schema as a filterable field (as shown in the previous reply), otherwise the filter will not work.
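For completeness, here is a minimal usage sketch of the modified methods above. `vector_store` is a hypothetical, already-configured `AzureAISearchVectorStore` instance whose index contains the filterable `user_id` field, and `build_nodes_for_user` is a hypothetical helper that returns nodes with embeddings:

```python
# Hypothetical usage of the modified add()/search() methods above.
nodes = build_nodes_for_user()  # hypothetical helper returning List[BaseNode] with embeddings

# Upload this user's nodes; every index document gets user_id = "user1".
ids = vector_store.add(nodes, user_id="user1")

# Query only this user's documents.
hits = vector_store.search(user_id="user1", search_text="quarterly report")
for hit in hits:
    print(hit["id"], hit["chunk"][:80])
```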
-
I am thinking of using `AzureCosmosDBMongoDBVectorSearch` and setting the user ID as the collection name.
-
🤖 Yes, it is feasible to use the user ID as the collection name in `AzureCosmosDBMongoDBVectorSearch`, so that each user gets their own collection. Here is an example of how you can do this:

```python
from llama_index.vector_stores.azurecosmosmongo import AzureCosmosDBMongoDBVectorSearch

# Initialize the AzureCosmosDBMongoDBVectorSearch with the user ID as the collection name
# (other connection parameters, such as the MongoDB client, are omitted here)
user_id = "user123"
vector_search = AzureCosmosDBMongoDBVectorSearch(collection_name=user_id)
```

As for Azure AI Search, it does not directly support the same collection-per-user mechanism as `AzureCosmosDBMongoDBVectorSearch`; there you would either create a separate index per user or keep a single index with a filterable `user_id` field, as shown in the earlier replies.

Regarding potential performance or security implications of setting the collection name to a user ID: you end up with one collection per user, which can become a large number of collections to manage as your user base grows, and the isolation it provides is only as strong as the access control your application enforces when choosing which collection to query.

Therefore, while it's technically possible to use the user ID as the collection name, it's important to consider these potential implications.
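As a follow-up to the example above, here is a sketch of how a per-user collection could be queried through LlamaIndex. It assumes `llama-index` 0.9.x-style imports, that the constructor is also given your Cosmos DB MongoDB client and database settings (omitted here), and that your embedding/LLM settings are configured as usual:

```python
from llama_index import VectorStoreIndex
from llama_index.vector_stores.azurecosmosmongo import AzureCosmosDBMongoDBVectorSearch

user_id = "user123"
# One collection per user: the collection name is simply the user ID.
vector_search = AzureCosmosDBMongoDBVectorSearch(collection_name=user_id)  # plus client/db settings

# Build an index view over the existing per-user collection and query it;
# results can only come from this user's collection.
index = VectorStoreIndex.from_vector_store(vector_search)
response = index.as_query_engine().query("What documents did I upload last week?")
print(response)
```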
-
I would like to use Azure AI Search for different users. It would still be the same index, but each user should only be able to add and search documents that were provided by that particular user, identified e.g. by a user ID. How can this be implemented? @dosubot please provide an example.