|
2 | 2 |
|
3 | 3 | This section walks you through setting up the Chroma VectorStore to store document embeddings and perform similarity searches.
|
4 | 4 |
|
5 |
| -link:https://github.com/chroma-core/chroma/pkgs/container/chroma[Chroma Container] |
6 |
| - |
7 |
| -== What is Chroma? |
8 |
| - |
9 | 5 | link:https://docs.trychroma.com/[Chroma] is the open-source embedding database. It gives you the tools to store document embeddings, content, and metadata and to search through those embeddings, including metadata filtering.
|
10 | 6 |
|
11 |
| -=== Prerequisites |
| 7 | +== Prerequisites |
12 | 8 |
|
13 |
| -1. OpenAI Account: Create an account at link:https://platform.openai.com/signup[OpenAI Signup] and generate the token at link:https://platform.openai.com/account/api-keys[API Keys]. |
| 9 | +1. Access to ChromaDB. The <<Run Chroma Locally, setup local ChromaDB>> appendix shows how to set up a DB locally with a Docker container. |
14 | 10 |
|
15 |
| -2. Access to ChromeDB. The <<Run Chroma Locally, setup local ChromaDB>> appendix shows how to set up a DB locally with a Docker container. |
| 11 | +2. `EmbeddingClient` instance to compute the document embeddings. Several options are available: |
| 12 | +- If required, an API key for the xref:api/embeddings.adoc#available-implementations[EmbeddingClient] to generate the embeddings stored by the `ChromaVectorStore`. |
16 | 13 |
|
17 | 14 | On startup, the `ChromaVectorStore` creates the required collection if one is not provisioned already.
|
18 | 15 |
|
19 |
| -== Configuration |
20 |
| - |
21 |
| -To set up ChromaVectorStore, you'll need to provide your OpenAI API Key. Set it as an environment variable like so: |
22 |
| - |
23 |
| -[source,bash] |
24 |
| ----- |
25 |
| -export SPRING_AI_OPENAI_API_KEY='Your_OpenAI_API_Key' |
26 |
| ----- |
27 |
| - |
28 |
| -== Dependencies |
| 16 | +== Auto-configuration |
29 | 17 |
|
30 |
| -Add these dependencies to your project: |
| 18 | +Spring AI provides Spring Boot auto-configuration for the Chroma Vector Store. |
| 19 | +To enable it, add the following dependency to your project's Maven `pom.xml` file: |
31 | 20 |
|
32 |
| -* OpenAI: Required for calculating embeddings. |
33 |
| - |
34 |
| -[source,xml] |
| 21 | +[source, xml] |
35 | 22 | ----
|
36 | 23 | <dependency>
|
37 |
| - <groupId>org.springframework.ai</groupId> |
38 |
| - <artifactId>spring-ai-openai-spring-boot-starter</artifactId> |
| 24 | + <groupId>org.springframework.ai</groupId> |
| 25 | + <artifactId>spring-ai-chroma-store-spring-boot-starter</artifactId> |
39 | 26 | </dependency>
|
40 | 27 | ----
|
41 | 28 |
|
42 |
| -* Chroma VectorStore. |
| 29 | +or to your Gradle `build.gradle` build file: |
43 | 30 |
|
44 |
| -[source,xml] |
| 31 | +[source,groovy] |
45 | 32 | ----
|
46 |
| -<dependency> |
47 |
| - <groupId>org.springframework.ai</groupId> |
48 |
| - <artifactId>spring-ai-chroma-store</artifactId> |
49 |
| -</dependency> |
| 33 | +dependencies { |
| 34 | + implementation 'org.springframework.ai:spring-ai-chroma-store-spring-boot-starter' |
| 35 | +} |
50 | 36 | ----
|
51 | 37 |
|
52 | 38 | TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
|
53 | 39 |
|
54 |
| -== Sample Code |
| 40 | +TIP: Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file. |
55 | 41 |
|
56 |
| -Create a `RestTemplate` instance with proper ChromaDB authorization configurations and Use it to create a `ChromaApi` instance: |
| 42 | +Additionally, you will need a configured `EmbeddingClient` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingClient] section for more information. |
| 43 | + |
| 44 | +Here is an example of the needed bean: |
57 | 45 |
|
58 | 46 | [source,java]
|
59 | 47 | ----
|
60 | 48 | @Bean
|
61 |
| -public RestTemplate restTemplate() { |
62 |
| - return new RestTemplate(); |
63 |
| -} |
64 |
| -
|
65 |
| -@Bean |
66 |
| -public ChromaApi chromaApi(RestTemplate restTemplate) { |
67 |
| - String chromaUrl = "http://localhost:8000"; |
68 |
| - ChromaApi chromaApi = new ChromaApi(chromaUrl, restTemplate); |
69 |
| - return chromaApi; |
| 49 | +public EmbeddingClient embeddingClient() { |
| 50 | + // Can be any other EmbeddingClient implementation. |
| 51 | + return new OpenAiEmbeddingClient(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY"))); |
70 | 52 | }
|
71 | 53 | ----
|
72 | 54 |
|
73 |
| -[NOTE] |
74 |
| -==== |
75 |
| -For ChromaDB secured with link:https://docs.trychroma.com/usage-guide#static-api-token-authentication[Static API Token Authentication] use the `ChromaApi#withKeyToken(<Your Token Credentials>)` method to set your credentials. Check the `ChromaWhereIT` for an example. |
| 55 | +To connect to Chroma you need to provide access details for your instance. |
| 56 | +A simple configuration can be provided via Spring Boot's _application.properties_: |
76 | 57 |
|
77 |
| -For ChromaDB secured with link:https://docs.trychroma.com/usage-guide#basic-authentication[Basic Authentication] use the `ChromaApi#withBasicAuth(<your user>, <your password>)` method to set your credentials. Check the `BasicAuthChromaWhereIT` for an example. |
78 |
| -==== |
| 58 | +[source,properties] |
| 59 | +---- |
| 60 | +# Chroma Vector Store connection properties |
| 61 | +spring.ai.vectorstore.chroma.client.host=<your Chroma instance host> |
| 62 | +spring.ai.vectorstore.chroma.client.port=<your Chroma instance port> |
| 63 | +spring.ai.vectorstore.chroma.client.key-token=<your access token (if configured)> |
| 64 | +spring.ai.vectorstore.chroma.client.username=<your username (if configured)> |
| 65 | +spring.ai.vectorstore.chroma.client.password=<your password (if configured)> |
79 | 66 |
|
80 |
| -Integrate with OpenAI's embeddings by adding the Spring Boot OpenAI starter to your project. This provides you with an implementation of the Embeddings client: |
| 67 | +# Chroma Vector Store collection properties |
| 68 | +spring.ai.vectorstore.chroma.store.collection-name=<your collection name> |
81 | 69 |
|
82 |
| -[source,java] |
83 |
| ----- |
84 |
| -@Bean |
85 |
| -public VectorStore chromaVectorStore(EmbeddingClient embeddingClient, ChromaApi chromaApi) { |
86 |
| - return new ChromaVectorStore(embeddingClient, chromaApi, "TestCollection"); |
87 |
| -} |
| 70 | +# Chroma Vector Store configuration properties |
| 71 | +
|
| 72 | +# OpenAI API key if the OpenAI auto-configuration is used. |
| 73 | +spring.ai.openai.api.key=<OpenAI Api-key> |
88 | 74 | ----
|
89 | 75 |
|
90 |
| -In your main code, create some documents: |
| 76 | +Please have a look at the list of xref:#_configuration_properties[configuration parameters] for the vector store to learn about the default values and configuration options. |
| 77 | + |
| 78 | +Now you can auto-wire the Chroma Vector Store in your application and use it: |
91 | 79 |
|
92 | 80 | [source,java]
|
93 | 81 | ----
|
94 |
| -List<Document> documents = List.of( |
95 |
| - new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")), |
96 |
| - new Document("The World is Big and Salvation Lurks Around the Corner"), |
97 |
| - new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2"))); |
98 |
| ----- |
| 82 | +@Autowired VectorStore vectorStore; |
99 | 83 |
|
100 |
| -Add the documents to your vector store: |
| 84 | +// ... |
101 | 85 |
|
102 |
| -[source,java] |
103 |
| ----- |
104 |
| -vectorStore.add(documents); |
105 |
| ----- |
| 86 | +List<Document> documents = List.of( |
| 87 | + new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")), |
| 88 | + new Document("The World is Big and Salvation Lurks Around the Corner"), |
| 89 | + new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2"))); |
106 | 90 |
|
107 |
| -And finally, retrieve documents similar to a query: |
| 91 | +// Add the documents |
| 92 | +vectorStore.add(documents); |
108 | 93 |
|
109 |
| -[source,java] |
110 |
| ----- |
111 |
| -List<Document> results = vectorStore.similaritySearch("Spring"); |
| 94 | +// Retrieve documents similar to a query |
| 95 | +List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5)); |
112 | 96 | ----
|
113 | 97 |
|
114 |
| -If all goes well, you should retrieve the document containing the text "Spring AI rocks!!". |
| 98 | +=== Configuration properties |
| 99 | + |
| 100 | +You can use the following properties in your Spring Boot configuration to customize the vector store. |
115 | 101 |
|
116 |
| -=== Metadata filtering |
| 102 | +|=== |
| 103 | +|Property| Description | Default value |
| 104 | + |
| 105 | +|`spring.ai.vectorstore.chroma.client.host`| Server connection host | `http://localhost` |
| 106 | +|`spring.ai.vectorstore.chroma.client.port`| Server connection port | `8000` |
| 107 | +|`spring.ai.vectorstore.chroma.client.key-token`| Access token (if configured) | - |
| 108 | +|`spring.ai.vectorstore.chroma.client.username`| Access username (if configured) | - |
| 109 | +|`spring.ai.vectorstore.chroma.client.password`| Access password (if configured) | - |
| 110 | +|`spring.ai.vectorstore.chroma.store.collection-name`| Collection name | `SpringAiCollection` |
| 111 | +|=== |
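The `host` and `port` defaults in the table combine into the same `http://localhost:8000` address used in the manual configuration sample. As a rough illustration, assuming the base URL is simply `host:port` (the exact logic inside the auto-configuration is not shown in this document), the plain-Java sketch below resolves the two properties with their documented defaults. `ChromaUrlSketch` and `baseUrl` are hypothetical names and are not part of Spring AI.

```java
import java.util.Properties;

public class ChromaUrlSketch {

    // Builds "<host>:<port>" from the given properties, falling back to the documented defaults.
    public static String baseUrl(Properties props) {
        String host = props.getProperty("spring.ai.vectorstore.chroma.client.host", "http://localhost");
        String port = props.getProperty("spring.ai.vectorstore.chroma.client.port", "8000");
        return host + ":" + port;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(baseUrl(props)); // defaults only: http://localhost:8000

        props.setProperty("spring.ai.vectorstore.chroma.client.port", "9000");
        System.out.println(baseUrl(props)); // port overridden: http://localhost:9000
    }
}
```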
| 112 | + |
| 113 | +[NOTE] |
| 114 | +==== |
| 115 | +For ChromaDB secured with link:https://docs.trychroma.com/usage-guide#static-api-token-authentication[Static API Token Authentication] use the `ChromaApi#withKeyToken(<Your Token Credentials>)` method to set your credentials. Check the `ChromaWhereIT` for an example. |
| 116 | +
|
| 117 | +For ChromaDB secured with link:https://docs.trychroma.com/usage-guide#basic-authentication[Basic Authentication] use the `ChromaApi#withBasicAuth(<your user>, <your password>)` method to set your credentials. Check the `BasicAuthChromaWhereIT` for an example. |
| 118 | +==== |
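For reference, HTTP Basic Authentication sends the user name and password as a Base64-encoded `user:password` pair in the `Authorization` header. The plain-Java sketch below shows only that encoding. It does not call `ChromaApi`, `BasicAuthSketch` and `basicAuthHeader` are hypothetical names, and how `withBasicAuth` builds its requests internally is an assumption that this document does not confirm.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthSketch {

    // Returns the value of an HTTP "Authorization" header for Basic Authentication (RFC 7617).
    public static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Example credentials taken from RFC 7617, not real ones.
        System.out.println(basicAuthHeader("Aladdin", "open sesame")); // prints Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
    }
}
```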
| 119 | + |
| 120 | +== Metadata filtering |
117 | 121 |
|
118 | 122 | You can leverage the generic, portable link:https://docs.spring.io/spring-ai/reference/api/vectordbs.html#_metadata_filters[metadata filters] with the `ChromaVectorStore` as well.
|
119 | 123 |
|
@@ -161,6 +165,91 @@ is converted into the proprietary Chroma format
|
161 | 165 | }
|
162 | 166 | ```
|
163 | 167 |
|
| 168 | + |
| 169 | +== Manual Configuration |
| 170 | + |
| 171 | +If you prefer to configure the Chroma Vector Store manually, you can do so by creating a `ChromaVectorStore` bean in your Spring Boot application. |
| 172 | + |
| 173 | +Add these dependencies to your project: |
| 174 | +* Chroma VectorStore. |
| 175 | + |
| 176 | +[source,xml] |
| 177 | +---- |
| 178 | +<dependency> |
| 179 | + <groupId>org.springframework.ai</groupId> |
| 180 | + <artifactId>spring-ai-chroma-store</artifactId> |
| 181 | +</dependency> |
| 182 | +---- |
| 183 | + |
| 184 | +* OpenAI: Required for calculating embeddings. You can use any other embedding client implementation. |
| 185 | + |
| 186 | +[source,xml] |
| 187 | +---- |
| 188 | +<dependency> |
| 189 | + <groupId>org.springframework.ai</groupId> |
| 190 | + <artifactId>spring-ai-openai-spring-boot-starter</artifactId> |
| 191 | +</dependency> |
| 192 | +---- |
| 193 | + |
| 194 | + |
| 195 | +TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file. |
| 196 | + |
| 197 | +=== Sample Code |
| 198 | + |
| 199 | +Create a `RestTemplate` instance with proper ChromaDB authorization configurations and use it to create a `ChromaApi` instance: |
| 200 | + |
| 201 | +[source,java] |
| 202 | +---- |
| 203 | +@Bean |
| 204 | +public RestTemplate restTemplate() { |
| 205 | + return new RestTemplate(); |
| 206 | +} |
| 207 | +
|
| 208 | +@Bean |
| 209 | +public ChromaApi chromaApi(RestTemplate restTemplate) { |
| 210 | + String chromaUrl = "http://localhost:8000"; |
| 211 | + ChromaApi chromaApi = new ChromaApi(chromaUrl, restTemplate); |
| 212 | + return chromaApi; |
| 213 | +} |
| 214 | +---- |
| 215 | + |
| 216 | +Integrate with OpenAI's embeddings by adding the Spring Boot OpenAI starter to your project. This provides you with an implementation of the Embeddings client: |
| 217 | + |
| 218 | +[source,java] |
| 219 | +---- |
| 220 | +@Bean |
| 221 | +public VectorStore chromaVectorStore(EmbeddingClient embeddingClient, ChromaApi chromaApi) { |
| 222 | + return new ChromaVectorStore(embeddingClient, chromaApi, "TestCollection"); |
| 223 | +} |
| 224 | +---- |
| 225 | + |
| 226 | +In your main code, create some documents: |
| 227 | + |
| 228 | +[source,java] |
| 229 | +---- |
| 230 | +List<Document> documents = List.of( |
| 231 | + new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")), |
| 232 | + new Document("The World is Big and Salvation Lurks Around the Corner"), |
| 233 | + new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2"))); |
| 234 | +---- |
| 235 | + |
| 236 | +Add the documents to your vector store: |
| 237 | + |
| 238 | +[source,java] |
| 239 | +---- |
| 240 | +vectorStore.add(documents); |
| 241 | +---- |
| 242 | + |
| 243 | +And finally, retrieve documents similar to a query: |
| 244 | + |
| 245 | +[source,java] |
| 246 | +---- |
| 247 | +List<Document> results = vectorStore.similaritySearch("Spring"); |
| 248 | +---- |
| 249 | + |
| 250 | +If all goes well, you should retrieve the document containing the text "Spring AI rocks!!". |
| 251 | + |
| 252 | + |
164 | 253 | === Run Chroma Locally
|
165 | 254 |
|
166 | 255 | ```shell
|
|