Enhancing Elasticsearch vector store implementation #592

Closed
wants to merge 10 commits into from
1 change: 1 addition & 0 deletions pom.xml
Original file line number Diff line number Diff line change
@@ -157,6 +157,7 @@
<pgvector.version>0.1.4</pgvector.version>
<sap.hanadb.version>2.20.11</sap.hanadb.version>
<postgresql.version>42.7.2</postgresql.version>
<elasticsearch-java.version>8.13.3</elasticsearch-java.version>
<milvus.version>2.3.4</milvus.version>
<pinecone.version>0.8.0</pinecone.version>
<fastjson.version>2.0.46</fastjson.version>
Original file line number Diff line number Diff line change
@@ -134,11 +134,18 @@ Properties starting with the `spring.ai.vectorstore.elasticsearch.*` prefix are

|`spring.ai.vectorstore.elasticsearch.index-name` | The name of the index to store the vectors. | spring-ai-document-index
|`spring.ai.vectorstore.elasticsearch.dimensions` | The number of dimensions in the vector. | 1536
|`spring.ai.vectorstore.elasticsearch.dense-vector-indexing` | Whether to use dense vector indexing. | true
|`spring.ai.vectorstore.elasticsearch.similarity` | The similarity function to use. | `cosine`
|`spring.ai.vectorstore.elasticsearch.initialize-schema` | Whether to initialize the required schema. | `false`
|===

The following similarity functions are available:

* cosine
* l2_norm
* dot_product

More details about each can be found in the https://www.elastic.co/guide/en/elasticsearch/reference/master/dense-vector.html#dense-vector-params[Elasticsearch Documentation] on dense vectors.
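
As a rough illustration of how the three names above could be resolved from a configured string, here is a minimal sketch. The nested `SimilarityFunction` enum is a local stand-in that only mirrors the listed names, and `parse` is a hypothetical helper; neither is taken from the Spring AI sources.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Optional;

final class SimilarityNames {

    // Local stand-in mirroring the three names listed above; not the Spring AI enum itself.
    enum SimilarityFunction { cosine, l2_norm, dot_product }

    // Hypothetical helper: resolves a configured name, returning empty for unknown values.
    static Optional<SimilarityFunction> parse(String value) {
        if (value == null) {
            return Optional.empty();
        }
        String normalized = value.trim().toLowerCase(Locale.ROOT);
        return Arrays.stream(SimilarityFunction.values())
            .filter(f -> f.name().equals(normalized))
            .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(parse("dot_product")); // prints Optional[dot_product]
        System.out.println(parse("l1_norm"));     // prints Optional.empty
    }
}
```

Returning an empty `Optional` for an unknown name is one possible design choice; a typed configuration property would more likely reject such a value when the application starts.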

== Metadata Filtering

You can leverage the generic, portable xref:api/vectordbs.adoc#metadata-filters[metadata filters] with Elasticsearch as well.
@@ -214,10 +221,11 @@ Read the link:https://www.elastic.co/guide/en/elasticsearch/client/java-api-clie
----
@Bean
public RestClient restClient() {
RestClientBuilder builder = RestClient.builder(new HttpHost("<host>", 9200, "http"));
Header[] defaultHeaders = new Header[] { new BasicHeader("Authorization", "Basic <encoded username and password>") };
builder.setDefaultHeaders(defaultHeaders);
return builder.build();
return RestClient.builder(new HttpHost("<host>", 9200, "http"))
.setDefaultHeaders(new Header[]{
new BasicHeader("Authorization", "Basic <encoded username and password>")
})
.build();
}
----
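
The `<encoded username and password>` placeholder in the header above stands for the standard HTTP Basic scheme, that is, the Base64 encoding of `username:password`. A minimal sketch of building that value with the JDK alone follows; the class name, the `basicAuthValue` helper, and the sample credentials are hypothetical and not part of the pull request.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthHeader {

    // Hypothetical helper: builds the value of an HTTP Basic "Authorization" header.
    static String basicAuthValue(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
            .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Sample credentials for illustration only.
        System.out.println(basicAuthValue("user", "pass")); // prints Basic dXNlcjpwYXNz
    }
}
```

The resulting string could then be passed as the second argument of the `BasicHeader` shown in the snippet above.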

7 changes: 7 additions & 0 deletions spring-ai-spring-boot-autoconfigure/pom.xml
Original file line number Diff line number Diff line change
@@ -260,6 +260,7 @@
<optional>true</optional>
</dependency>

<!-- Elasticsearch Vector Store -->
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-elasticsearch-store</artifactId>
@@ -281,6 +282,12 @@
<optional>true</optional>
</dependency>

<dependency>
Contributor

What is the reason for adding elasticsearch-java explicitly here? It is already defined in the vector-store dependencies.

<groupId>co.elastic.clients</groupId>
<artifactId>elasticsearch-java</artifactId>
<version>${elasticsearch-java.version}</version>
</dependency>

<!-- test dependencies -->

<dependency>
Original file line number Diff line number Diff line change
@@ -52,10 +52,7 @@ ElasticsearchVectorStore vectorStore(ElasticsearchVectorStoreProperties properti
if (properties.getDimensions() != null) {
elasticsearchVectorStoreOptions.setDimensions(properties.getDimensions());
}
if (properties.isDenseVectorIndexing() != null) {
elasticsearchVectorStoreOptions.setDenseVectorIndexing(properties.isDenseVectorIndexing());
}
if (StringUtils.hasText(properties.getSimilarity())) {
if (properties.getSimilarity() != null) {
elasticsearchVectorStoreOptions.setSimilarity(properties.getSimilarity());
}
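
The hunk above copies a property into the store options only when the user has set it, so unset properties fall back to the defaults. A minimal, framework-free sketch of that pattern is shown below. `StoreProperties`, `StoreOptions`, and `toOptions` are hypothetical stand-ins rather than the Spring AI classes; the defaults of 1536 dimensions and `cosine` are the ones given in the properties table.

```java
final class OptionsOverrideSketch {

    // Local stand-in for the similarity enum used by the properties class.
    enum SimilarityFunction { cosine, l2_norm, dot_product }

    // Hypothetical user-supplied properties: null means "not configured".
    static final class StoreProperties {
        Integer dimensions;
        SimilarityFunction similarity;
    }

    // Hypothetical store options carrying the documented defaults.
    static final class StoreOptions {
        int dimensions = 1536;
        SimilarityFunction similarity = SimilarityFunction.cosine;
    }

    // Copies only the values that were explicitly set, keeping defaults otherwise.
    static StoreOptions toOptions(StoreProperties properties) {
        StoreOptions options = new StoreOptions();
        if (properties.dimensions != null) {
            options.dimensions = properties.dimensions;
        }
        if (properties.similarity != null) {
            options.similarity = properties.similarity;
        }
        return options;
    }

    public static void main(String[] args) {
        StoreProperties properties = new StoreProperties();
        properties.dimensions = 1024; // similarity left unset on purpose
        StoreOptions options = toOptions(properties);
        System.out.println(options.dimensions + " " + options.similarity); // prints 1024 cosine
    }
}
```

With an enum-typed property a plain null check is enough, which matches the change from `StringUtils.hasText(...)` to `!= null` in the hunk.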

Original file line number Diff line number Diff line change
@@ -16,6 +16,7 @@
package org.springframework.ai.autoconfigure.vectorstore.elasticsearch;

import org.springframework.ai.autoconfigure.CommonVectorStoreProperties;
import org.springframework.ai.vectorstore.SimilarityFunction;
import org.springframework.boot.context.properties.ConfigurationProperties;

/**
@@ -37,15 +38,10 @@ public class ElasticsearchVectorStoreProperties extends CommonVectorStorePropert
*/
private Integer dimensions;

/**
* Whether to use dense vector indexing.
*/
private Boolean denseVectorIndexing;

/**
* The similarity function to use.
*/
private String similarity;
private SimilarityFunction similarity;

public String getIndexName() {
return this.indexName;
@@ -63,19 +59,11 @@ public void setDimensions(Integer dimensions) {
this.dimensions = dimensions;
}

public Boolean isDenseVectorIndexing() {
return denseVectorIndexing;
}

public void setDenseVectorIndexing(Boolean denseVectorIndexing) {
this.denseVectorIndexing = denseVectorIndexing;
}

public String getSimilarity() {
public SimilarityFunction getSimilarity() {
return similarity;
}

public void setSimilarity(String similarity) {
public void setSimilarity(SimilarityFunction similarity) {
this.similarity = similarity;
}

Original file line number Diff line number Diff line change
@@ -18,13 +18,12 @@
import org.awaitility.Awaitility;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.condition.EnabledIfEnvironmentVariable;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;
import org.springframework.ai.autoconfigure.openai.OpenAiAutoConfiguration;
import org.springframework.ai.autoconfigure.retry.SpringAiRetryAutoConfiguration;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.ElasticsearchVectorStore;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.SimilarityFunction;
import org.springframework.boot.autoconfigure.AutoConfigurations;
import org.springframework.boot.autoconfigure.elasticsearch.ElasticsearchRestClientAutoConfiguration;
import org.springframework.boot.autoconfigure.web.client.RestClientAutoConfiguration;
@@ -51,8 +50,6 @@ class ElasticsearchVectorStoreAutoConfigurationIT {
"docker.elastic.co/elasticsearch/elasticsearch:8.12.2")
.withEnv("xpack.security.enabled", "false");

private static final String DEFAULT = "default cosine similarity";

private List<Document> documents = List.of(
new Document("1", getText("classpath:/test/data/spring.ai.txt"), Map.of("meta1", "meta1")),
new Document("2", getText("classpath:/test/data/time.shelter.txt"), Map.of()),
@@ -65,21 +62,14 @@ class ElasticsearchVectorStoreAutoConfigurationIT {
.withPropertyValues("spring.elasticsearch.uris=" + elasticsearchContainer.getHttpHostAddress(),
"spring.ai.openai.api-key=" + System.getenv("OPENAI_API_KEY"));

@ParameterizedTest(name = "{0} : {displayName} ")
@ValueSource(strings = { DEFAULT, """
double value = dotProduct(params.query_vector, 'embedding');
return sigmoid(1, Math.E, -value);
""", "1 / (1 + l1norm(params.query_vector, 'embedding'))",
"1 / (1 + l2norm(params.query_vector, 'embedding'))" })
public void addAndSearchTest(String similarityFunction) {
// This test is not parameterized by similarity function:
// by default, the bean is created with cosine similarity.
@Test
public void addAndSearchTest() {

this.contextRunner.run(context -> {
ElasticsearchVectorStore vectorStore = context.getBean(ElasticsearchVectorStore.class);

if (!DEFAULT.equals(similarityFunction)) {
vectorStore.withSimilarityFunction(similarityFunction);
}

vectorStore.add(documents);

Awaitility.await()
@@ -120,16 +110,15 @@ public void propertiesTest() {
"spring.ai.vectorstore.elasticsearch.index-name=example",
"spring.ai.vectorstore.elasticsearch.dimensions=1024",
"spring.ai.vectorstore.elasticsearch.dense-vector-indexing=true",
"spring.ai.vectorstore.elasticsearch.similarity=dot_product")
"spring.ai.vectorstore.elasticsearch.similarity=cosine")
.run(context -> {
var properties = context.getBean(ElasticsearchVectorStoreProperties.class);
var elasticsearchVectorStore = context.getBean(ElasticsearchVectorStore.class);

assertThat(properties).isNotNull();
assertThat(properties.getIndexName()).isEqualTo("example");
assertThat(properties.getDimensions()).isEqualTo(1024);
assertThat(properties.isDenseVectorIndexing()).isTrue();
assertThat(properties.getSimilarity()).isEqualTo("dot_product");
assertThat(properties.getSimilarity()).isEqualTo(SimilarityFunction.cosine);

assertThat(elasticsearchVectorStore).isNotNull();
});
2 changes: 1 addition & 1 deletion vector-stores/spring-ai-elasticsearch-store/pom.xml
Original file line number Diff line number Diff line change
@@ -35,6 +35,7 @@
<dependency>
<groupId>co.elastic.clients</groupId>
<artifactId>elasticsearch-java</artifactId>
<version>${elasticsearch-java.version}</version>
</dependency>

<!-- TESTING -->
@@ -45,7 +46,6 @@
<scope>test</scope>
</dependency>


<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-test</artifactId>