Commit 568d90a

Update mongodb.adoc (#1181)
1 parent aeba9c2 commit 568d90a

1 file changed

+184
-36
lines changed
  • spring-ai-docs/src/main/antora/modules/ROOT/pages/api/vectordbs

@@ -1,21 +1,33 @@
11
= MongoDB Atlas
22

3-
This section walks you through setting up MongoDB Atlas as a vector store with Spring AI.
3+
4+
This section walks you through setting up MongoDB Atlas as a vector store to use with Spring AI.
5+
46

57
== What is MongoDB Atlas?
68

7-
https://www.mongodb.com/products/platform/atlas-database[MongoDB Atlas] is a fully-managed cloud database available in AWS, Azure, and GCP.
8-
It supports native Vector Search and full text search (`BM25`) on your MongoDB document data.
99

10-
https://www.mongodb.com/products/platform/atlas-vector-search[MongoDB Atlas Vector Search] allows to store your embeddings in MongoDB documents, create a vector search index, and perform KNN search with an approximate nearest neighbor algorithm (Hierarchical Navigable Small Worlds).
11-
It uses the `$vectorSearch` aggregation stage.
10+
https://www.mongodb.com/products/platform/atlas-database[MongoDB Atlas] is the fully-managed cloud database from MongoDB available in AWS, Azure, and GCP.
11+
Atlas supports native Vector Search and full text search on your MongoDB document data.
12+
13+
14+
https://www.mongodb.com/products/platform/atlas-vector-search[MongoDB Atlas Vector Search] allows you to store your embeddings in MongoDB documents, create vector search indexes, and perform KNN searches with an approximate nearest neighbor algorithm (Hierarchical Navigable Small Worlds).
15+
You can use the `$vectorSearch` stage in a MongoDB aggregation pipeline to search your vector embeddings.
16+
1217

1318
== Prerequisites
1419

15-
TODO: Add prerequisites instructions
1620

17-
== Auto-configuration
21+
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. To get started with MongoDB Atlas, you can follow the instructions https://www.mongodb.com/docs/atlas/getting-started/[here]. Ensure that your IP address is included in your Atlas project’s https://www.mongodb.com/docs/atlas/security/ip-access-list/#std-label-access-list[access list].
1822

23+
24+
- An `EmbeddingModel` instance to compute the document embeddings. Several options are available. Refer to the https://docs.spring.io/spring-ai/reference/api/embeddings.html#available-implementations[EmbeddingModel] section for more information.
25+
26+
27+
- An environment to set up and run a Java application.
28+
29+
30+
== Auto-configuration
1931
Spring AI provides Spring Boot auto-configuration for the MongoDB Atlas Vector Store.
2032
To enable it, add the following dependency to your project's Maven `pom.xml` file:
2133

@@ -36,19 +48,51 @@ dependencies {
3648
}
3749
----
3850

51+
The vector store implementation can initialize the requisite schema for you, but you must opt-in by specifying the `initializeSchema` boolean in the appropriate constructor or by setting `...initialize-schema=true` in the `application.properties` file.
52+
53+
3954
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
4055

4156
TIP: Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file.
4257

4358

44-
The vector store implementation can initialize the requisite schema for you, but you must opt-in by specifying the `initializeSchema` boolean in the appropriate constructor or by setting `...initialize-schema=true` in the `application.properties` file.
59+
=== Schema Initialization
60+
The vector store implementation can initialize the requisite schema for you, but you must opt-in by specifying the `initializeSchema` boolean in the appropriate constructor or by setting `spring.ai.vectorstore.mongodb.initialize-schema=true` in the `application.properties` file.
61+
4562

4663
NOTE: This is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.
4764

4865

66+
When `initializeSchema` is set to `true`, the following actions are performed automatically:
67+
4968

69+
- **Collection Creation**: The specified collection for storing vectors will be created if it does not already exist.
70+
- **Search Index Creation**: A search index will be created based on the configuration properties.
5071

5172

73+
If you're running a free or shared tier cluster, you must separately create the index through the Atlas UI, Atlas Administration API, or Atlas CLI.
74+
75+
76+
NOTE: If you have an existing Atlas Vector Search index called `vector_index` on the `springai_test.vector_store` collection, Spring AI won't create an additional index. Because of this, you might experience errors later if the existing index was configured with incompatible settings, such as a different number of dimensions.
77+
78+
79+
Ensure that your index has the following configuration:
80+
81+
82+
[source,json]
83+
----
84+
{
85+
"fields": [
86+
{
87+
"numDimensions": 1536,
88+
"path": "embedding",
89+
"similarity": "cosine",
90+
"type": "vector"
91+
}
92+
]
93+
}
94+
----
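To make the `similarity` and `numDimensions` settings above more concrete, here is a minimal plain-Java sketch of cosine similarity. It is illustrative only and is not Atlas or Spring AI code: the class name `CosineSimilaritySketch` and its `cosine` method are hypothetical. It shows why vectors with a different number of dimensions than the index expects cannot be compared.

```java
/** Illustrative only: plain-Java cosine similarity, not Atlas or Spring AI code. */
final class CosineSimilaritySketch {

    /** Returns the cosine of the angle between a and b; rejects vectors of different length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            // Mirrors the kind of failure an index with the wrong numDimensions would cause.
            throw new IllegalArgumentException(
                    "Dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0, 1.0};
        double[] stored = {2.0, 0.0, 2.0};
        // Vectors pointing in the same direction score close to 1.0 regardless of length.
        System.out.println(cosine(query, stored));
    }
}
```

Because cosine similarity depends only on direction, the magnitude of an embedding does not affect its score. The dimension check is the reason an index created for one embedding model generally cannot be reused with a model that produces vectors of a different size.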
95+
5296

5397
Additionally, you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
5498

@@ -63,48 +107,152 @@ public EmbeddingModel embeddingModel() {
63107
}
64108
----
65109

66-
== Metadata filtering
67110

68-
You can leverage the generic, portable xref:api/vectordbs.adoc#metadata-filters[metadata filters] with MongoDB Atlas store as well.
111+
=== Configuration properties
112+
You can use the following properties in your Spring Boot configuration to customize the MongoDB Atlas vector store.
113+
[source,properties]
114+
----
115+
...
116+
spring.data.mongodb.uri=<connection string>
117+
spring.data.mongodb.database=<database name>
118+
119+
spring.ai.vectorstore.mongodb.collection-name=vector_store
120+
spring.ai.vectorstore.mongodb.initialize-schema=true
121+
spring.ai.vectorstore.mongodb.path-name=embedding
122+
spring.ai.vectorstore.mongodb.indexName=vector_index
123+
----
124+
125+
126+
|===
127+
|Property| Description | Default value
69128

70-
For example, you can use either the text expression language:
71129

130+
|`spring.ai.vectorstore.mongodb.collection-name`| The name of the collection to store the vectors. | `vector_store`
131+
|`spring.ai.vectorstore.mongodb.initialize-schema`| Whether to initialize the required schema (collection and search index) for you. | `false`
132+
|`spring.ai.vectorstore.mongodb.path-name`| The name of the document field (path) in which the vectors are stored. | `embedding`
133+
|`spring.ai.vectorstore.mongodb.indexName`| The name of the vector search index. | `vector_index`
134+
|===
135+
136+
137+
== Manual Configuration
138+
If you prefer to manually configure the MongoDB Atlas vector store without auto-configuration, you can do so by directly setting up the `MongoDBAtlasVectorStore` and its dependencies.
139+
140+
141+
=== Example Configuration
72142
[source,java]
73143
----
74-
vectorStore.similaritySearch(
75-
SearchRequest.defaults()
76-
.withQuery("The World")
77-
.withTopK(TOP_K)
78-
.withSimilarityThreshold(SIMILARITY_THRESHOLD)
79-
.withFilterExpression("author in ['john', 'jill'] && 'article_type' == 'blog'"));
144+
@Configuration
145+
public class VectorStoreConfig {
146+
147+
@Bean
148+
public MongoDBAtlasVectorStore vectorStore(MongoTemplate mongoTemplate, EmbeddingModel embeddingModel) {
149+
MongoDBVectorStoreConfig config = MongoDBVectorStoreConfig.builder()
150+
.withCollectionName("custom_vector_store")
151+
.withVectorIndexName("custom_vector_index")
152+
.withPathName("custom_embedding_path")
153+
.withMetadataFieldsToFilter(List.of("author", "year"))
154+
.build();
155+
156+
return new MongoDBAtlasVectorStore(mongoTemplate, embeddingModel, config, true);
157+
}
158+
}
80159
----
160+
=== Properties
161+
- `collectionName`: The name of the collection to store the vectors.
162+
- `vectorIndexName`: The name of the vector index.
163+
- `pathName`: The path where vectors are stored.
164+
- `metadataFieldsToFilter`: A list of metadata fields to filter.
165+
166+
167+
You can enable schema initialization by passing `true` as the last parameter of the `MongoDBAtlasVectorStore` constructor.
168+
169+
170+
== Adding Documents
171+
To add documents to the vector store, convert your input documents into the `Document` type and call the `add()` method. This method uses the `EmbeddingModel` to compute the embeddings and saves them to the MongoDB collection.
81172

82-
or programmatically using the `Filter.Expression` DSL:
83173

84174
[source,java]
85175
----
86-
FilterExpressionBuilder b = new FilterExpressionBuilder();
176+
List<Document> docs = List.of(
177+
new Document("Proper tuber planting involves site selection, timing, and care. Choose well-drained soil and adequate sun exposure. Plant in spring, with eyes facing upward at a depth two to three times the tuber's height. Ensure 4-12 inch spacing based on tuber size. Adequate moisture is needed, but avoid overwatering. Mulching helps preserve moisture and prevent weeds.", Map.of("author", "A", "type", "post")),
178+
new Document("Successful oil painting requires patience, proper equipment, and technique. Prepare a primed canvas, sketch lightly, and use high-quality brushes and oils. Paint 'fat over lean' to prevent cracking. Allow each layer to dry before applying the next. Clean brushes often and work in a well-ventilated space.", Map.of("author", "A")),
179+
new Document("For a natural lawn, select the right grass type for your climate. Water 1 to 1.5 inches per week, avoid overwatering, and use organic fertilizers. Regular aeration helps root growth and prevents compaction. Practice natural pest control and overseeding to maintain a dense lawn.", Map.of("author", "B", "type", "post")) );
180+
181+
vectorStore.add(docs);
182+
183+
184+
185+
186+
87187
88-
vectorStore.similaritySearch(SearchRequest.defaults()
89-
.withQuery("The World")
90-
.withTopK(TOP_K)
91-
.withSimilarityThreshold(SIMILARITY_THRESHOLD)
92-
.withFilterExpression(b.and(
93-
b.in("author", "john", "jill"),
94-
b.eq("article_type", "blog")).build()));
95188
----
96189

97-
NOTE: Those (portable) filter expressions get automatically converted into the proprietary MongoDB Atlas filter expressions.
98190

99-
== MongoDB Atlas properties
191+
== Deleting Documents
192+
To delete documents from the vector store, use the `delete()` method. This method takes a list of document IDs and removes the corresponding documents from the MongoDB collection.
100193

101-
You can use the following properties in your Spring Boot configuration to customize the MongoDB Atlas vector store.
102194

103-
|===
104-
|Property| Description | Default value
195+
[source,java]
196+
----
197+
List<String> ids = List.of("id1", "id2", "id3"); // Replace with actual document IDs
105198
106-
|`spring.ai.vectorstore.mongodb.collection-name`| The name of the collection to store the vectors. | `vector_store`
107-
|`spring.ai.vectorstore.mongodb.initialize-schema`| whether to initialize the backend schema for you | `false`
108-
|`spring.ai.vectorstore.mongodb.path-name`| The name of the path to store the vectors. | `embedding`
109-
|`spring.ai.vectorstore.mongodb.indexName`| The name of the index to store the vectors. | `vector_index`
110-
|===
199+
vectorStore.delete(ids);
200+
----
201+
202+
203+
== Performing Similarity Search
204+
To perform a similarity search, construct a `SearchRequest` object with the desired query parameters and call the `similaritySearch()` method. This method will return a list of documents that match the query based on vector similarity.
205+
206+
207+
[source,java]
208+
----
209+
List<Document> results = vectorStore.similaritySearch(
210+
SearchRequest
211+
.query("learn how to grow things")
212+
.withTopK(2)
213+
);
214+
----
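The following plain-Java sketch illustrates what a top-K limit and a similarity threshold mean for a search like the one above. It is an assumption-laden illustration, not how Atlas executes the query: it scans every vector exhaustively, whereas Atlas Vector Search uses an approximate index. The class `TopKSketch`, its methods, and the two-dimensional vectors are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: exact brute-force top-K search over in-memory vectors. */
final class TopKSketch {

    /** Cosine similarity for equal-length, non-zero vectors. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the ids of at most k vectors scoring at least threshold, best match first. */
    static List<String> topK(Map<String, double[]> store, double[] query, int k, double threshold) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, double[]> entry : store.entrySet()) {
            double score = cosine(query, entry.getValue());
            if (score >= threshold) {
                scored.add(Map.entry(entry.getKey(), score));
            }
        }
        // Highest similarity first.
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(k, scored.size()); i++) {
            ids.add(scored.get(i).getKey());
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, double[]> store = new LinkedHashMap<>();
        store.put("gardening", new double[]{1.0, 0.0});
        store.put("painting", new double[]{0.0, 1.0});
        store.put("lawn", new double[]{0.9, 0.1});
        System.out.println(topK(store, new double[]{1.0, 0.0}, 2, 0.5));
    }
}
```

The threshold removes weak matches before the top-K cut, so a request can return fewer than K documents when few vectors are similar enough to the query.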
215+
216+
217+
== Metadata Filtering
218+
Metadata filtering allows for more refined queries by filtering results based on specified metadata fields. This feature uses the MongoDB Query API to perform filtering operations in conjunction with vector searches.
219+
220+
221+
=== Filter Expressions
222+
The `MongoDBAtlasFilterExpressionConverter` class converts filter expressions into MongoDB Atlas metadata filter expressions. The supported operations include:
223+
224+
225+
- `$and`
226+
- `$or`
227+
- `$eq`
228+
- `$ne`
229+
- `$lt`
230+
- `$lte`
231+
- `$gt`
232+
- `$gte`
233+
- `$in`
234+
- `$nin`
235+
236+
237+
These operations enable filtering logic to be applied to metadata fields associated with documents in the vector store.
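As a rough mental model of the operators listed above, the sketch below evaluates equality, membership, and conjunction against a document's metadata map in plain Java. It is only an illustration of the semantics: it does not show the output of `MongoDBAtlasFilterExpressionConverter`, and the class `MetadataFilterSketch` and its helper methods are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Illustrative only: in-memory equivalents of a few metadata filter operations. */
final class MetadataFilterSketch {

    /** Matches when the metadata field equals the given value (like $eq). */
    static Predicate<Map<String, ?>> eq(String key, Object value) {
        return metadata -> value.equals(metadata.get(key));
    }

    /** Matches when the metadata field is one of the given values (like $in). */
    static Predicate<Map<String, ?>> in(String key, List<?> values) {
        return metadata -> metadata.containsKey(key) && values.contains(metadata.get(key));
    }

    public static void main(String[] args) {
        // Combining predicates with and() plays the role of $and.
        Predicate<Map<String, ?>> filter = eq("author", "A").and(eq("type", "post"));

        System.out.println(filter.test(Map.of("author", "A", "type", "post")));
        System.out.println(filter.test(Map.of("author", "A")));
        System.out.println(in("author", List.of("A", "B")).test(Map.of("author", "B")));
    }
}
```

A document whose metadata lacks a filtered field simply fails the predicate in this sketch; how the real store treats missing fields is not specified here.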
238+
239+
240+
=== Example of a Filter Expression
241+
Here’s an example of how to use a filter expression in a similarity search:
242+
243+
244+
[source,java]
245+
----
246+
FilterExpressionBuilder b = new FilterExpressionBuilder();
247+
248+
List<Document> results = vectorStore.similaritySearch(
249+
SearchRequest.defaults()
250+
.withQuery("learn how to grow things")
251+
.withTopK(2)
252+
.withSimilarityThreshold(0.5)
253+
.withFilterExpression(b.eq("author", "A").build())
254+
);
255+
----
256+
257+
258+
If you would like to try out Spring AI with MongoDB, see https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-integrations/spring-ai/#std-label-spring-ai[Get Started with the Spring AI Integration].
