OCI GenAI #43

Merged
merged 7 commits into from
Jun 17, 2024
88 changes: 88 additions & 0 deletions docs/src/main/asciidoc/genai.adoc
@@ -0,0 +1,88 @@
// Copyright (c) 2024, Oracle and/or its affiliates.
// Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/

[#cloud-genai]
== Generative AI

https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm[OCI Generative AI] is a fully managed service that provides a set of state-of-the-art, customizable large language models (LLMs) that cover a wide range of use cases, including chat and creating text embeddings.

Maven coordinates, using <<getting-started.adoc#bill-of-materials, Spring Cloud OCI BOM>>:

[source,xml]
----
<dependency>
    <groupId>com.oracle.cloud.spring</groupId>
    <artifactId>spring-cloud-oci-starter-gen-ai</artifactId>
</dependency>
----

Gradle coordinates:

[source,subs="normal"]
----
dependencies {
    implementation("com.oracle.cloud.spring:spring-cloud-oci-starter-gen-ai")
}
----

=== Using Generative AI Chat

The starter configures and registers a `ChatModel` bean in the Spring application context.
The `ChatModel` bean (link[Javadoc]) can be used to interact with OCI Generative AI Chat Models.

[source,java]
----
@Autowired
private ChatModel chatModel;

public void chat() {
    ChatResponse response = chatModel.chat("my chat prompt");
}
----

=== Using Generative AI Embedding

The starter configures and registers an `EmbeddingModel` bean in the Spring application context.
The `EmbeddingModel` bean (link[Javadoc]) can be used to create text embeddings using OCI Generative AI Embedding Models.

[source,java]
----
@Autowired
private EmbeddingModel embeddingModel;

public void embed() {
    EmbedTextResponse response = embeddingModel.embed("my embedding text");
}
----

=== Configuration

The Spring Boot Starter for Oracle Cloud Generative AI provides the following configuration options:

|===
^| Name ^| Description ^| Required ^| Default value
| `spring.cloud.oci.genai.enabled` | Enables the OCI Generative AI Client. | No | `true`
| `spring.cloud.oci.genai.embedding.enabled` | Enables the OCI Generative AI Embedding APIs. | No | `false`
| `spring.cloud.oci.genai.embedding.on-demand-model-id` | On-demand model ID to be used for embedding text. One of `spring.cloud.oci.genai.embedding.dedicated-cluster-endpoint` or `spring.cloud.oci.genai.embedding.on-demand-model-id` must be specified. | No |
| `spring.cloud.oci.genai.embedding.dedicated-cluster-endpoint` | Dedicated cluster endpoint used for embedding text. One of `spring.cloud.oci.genai.embedding.dedicated-cluster-endpoint` or `spring.cloud.oci.genai.embedding.on-demand-model-id` must be specified. | No |
| `spring.cloud.oci.genai.embedding.compartment` | Embedding model compartment. | Yes |
| `spring.cloud.oci.genai.embedding.truncate` | How to truncate embedding text when it is greater than the model's maximum tokens. May be `START`, `END`, or `NONE`. | No | `NONE`
| `spring.cloud.oci.genai.chat.enabled` | Enables the OCI Generative AI Chat APIs. | No | `false`
| `spring.cloud.oci.genai.chat.on-demand-model-id` | On-demand model ID to be used for chat. One of `spring.cloud.oci.genai.chat.dedicated-cluster-endpoint` or `spring.cloud.oci.genai.chat.on-demand-model-id` must be specified. | No |
| `spring.cloud.oci.genai.chat.dedicated-cluster-endpoint` | Dedicated cluster endpoint used for chat. One of `spring.cloud.oci.genai.chat.dedicated-cluster-endpoint` or `spring.cloud.oci.genai.chat.on-demand-model-id` must be specified. | No |
| `spring.cloud.oci.genai.chat.compartment` | Chat model compartment. | Yes |
| `spring.cloud.oci.genai.chat.preamble-override` | If specified, overrides the model's prompt preamble. | No |
| `spring.cloud.oci.genai.chat.temperature` | Chat text generation temperature. Higher values are more random or creative. Learn more about https://docs.oracle.com/en-us/iaas/Content/generative-ai/concepts.htm#temperature[temperature]. | No | `1.0`
| `spring.cloud.oci.genai.chat.top-p` | Ensures that only the most likely tokens with probabilities that sum to P are generated. Learn more about https://docs.oracle.com/en-us/iaas/Content/generative-ai/concepts.htm#top-p[Top P]. | No | `0.75`
| `spring.cloud.oci.genai.chat.top-k` | Ensures that only the top K most probable tokens are considered in each step of text generation. Learn more about https://docs.oracle.com/en-us/iaas/Content/generative-ai/concepts.htm#top-k[Top K]. | No | `0.0`
| `spring.cloud.oci.genai.chat.frequency-penalty` | Assigns a penalty for tokens that appear frequently. A higher value results in less repetitive text. | No | `0.0`
| `spring.cloud.oci.genai.chat.presence-penalty` | Assigns an equal penalty if a token appears in the text. A higher value results in less repetitive text. | No | `0.0`
| `spring.cloud.oci.genai.chat.max-tokens` | Maximum response output tokens. Estimate 2-3 tokens per word. | No | `600`

|===
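
As an illustration of the options above, a minimal `application.properties` might look like the following. This is a sketch only: the values in angle brackets are placeholders to be replaced with your own compartment OCID and model IDs, and the `temperature` and `truncate` values are arbitrary choices rather than recommendations.

[source,properties]
----
# Chat: one of on-demand-model-id or dedicated-cluster-endpoint is needed
spring.cloud.oci.genai.chat.enabled=true
spring.cloud.oci.genai.chat.compartment=<compartment-ocid>
spring.cloud.oci.genai.chat.on-demand-model-id=<chat-model-id>
spring.cloud.oci.genai.chat.temperature=0.5
spring.cloud.oci.genai.chat.max-tokens=600

# Embedding: same rule for the serving mode
spring.cloud.oci.genai.embedding.enabled=true
spring.cloud.oci.genai.embedding.compartment=<compartment-ocid>
spring.cloud.oci.genai.embedding.on-demand-model-id=<embedding-model-id>
spring.cloud.oci.genai.embedding.truncate=END
----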


=== Sample

The https://github.com/oracle/spring-cloud-oci/tree/main/spring-cloud-oci-samples/spring-cloud-oci-gen-ai-sample[sample application] demonstrates how to use the Spring Cloud OCI Generative AI module.
3 changes: 3 additions & 0 deletions docs/src/main/asciidoc/getting-started.adoc
Expand Up @@ -64,6 +64,9 @@ The following table highlights several samples of the most used integrations in
| Cloud Notification
| https://github.com/oracle/spring-cloud-oci/tree/main/spring-cloud-oci-samples/spring-cloud-oci-notification-sample[spring-cloud-oci-notification-sample]

| Cloud Generative AI
| https://github.com/oracle/spring-cloud-oci/tree/main/spring-cloud-oci-samples/spring-cloud-oci-gen-ai-sample[spring-cloud-oci-gen-ai-sample]

| Cloud Logging
| https://github.com/oracle/spring-cloud-oci/tree/main/spring-cloud-oci-samples/spring-cloud-oci-logging-sample[spring-cloud-oci-logging-sample]

2 changes: 2 additions & 0 deletions docs/src/main/asciidoc/index.adoc
Expand Up @@ -22,6 +22,8 @@ include::storage.adoc[]

include::notifications.adoc[]

include::genai.adoc[]

include::logging.adoc[]

include::function.adoc[]
1 change: 1 addition & 0 deletions pom.xml
Expand Up @@ -53,6 +53,7 @@ Licensed under the Universal Permissive License v 1.0 as shown at https://oss.or
<module>spring-cloud-oci-starters</module>
<module>spring-cloud-oci-storage</module>
<module>spring-cloud-oci-notification</module>
<module>spring-cloud-oci-gen-ai</module>
<module>spring-cloud-oci-logging</module>
<module>spring-cloud-oci-function</module>
<module>spring-cloud-oci-streaming</module>
5 changes: 5 additions & 0 deletions spring-cloud-oci-autoconfigure/pom.xml
Expand Up @@ -79,6 +79,11 @@ Licensed under the Universal Permissive License v 1.0 as shown at https://oss.or
<artifactId>spring-cloud-oci-notification</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>com.oracle.cloud.spring</groupId>
<artifactId>spring-cloud-oci-gen-ai</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>com.oracle.cloud.spring</groupId>
<artifactId>spring-cloud-oci-logging</artifactId>
@@ -0,0 +1,107 @@
/*
** Copyright (c) 2024, Oracle and/or its affiliates.
** Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/
*/

package com.oracle.cloud.spring.genai;

import com.oracle.bmc.auth.RegionProvider;
import com.oracle.bmc.generativeaiinference.GenerativeAiInference;
import com.oracle.bmc.generativeaiinference.GenerativeAiInferenceClient;
import com.oracle.bmc.generativeaiinference.model.DedicatedServingMode;
import com.oracle.bmc.generativeaiinference.model.EmbedTextDetails;
import com.oracle.bmc.generativeaiinference.model.OnDemandServingMode;
import com.oracle.bmc.generativeaiinference.model.ServingMode;
import com.oracle.cloud.spring.autoconfigure.core.CredentialsProvider;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.autoconfigure.AutoConfiguration;
import org.springframework.boot.autoconfigure.condition.ConditionalOnClass;
import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.cloud.context.config.annotation.RefreshScope;
import org.springframework.context.annotation.Bean;
import org.springframework.util.StringUtils;

import static com.oracle.cloud.spring.autoconfigure.core.CredentialsProviderAutoConfiguration.credentialsProviderQualifier;
import static com.oracle.cloud.spring.autoconfigure.core.RegionProviderAutoConfiguration.regionProviderQualifier;

/**
 * Auto-configuration for initializing the OCI GenAI component.
 * Depends on {@link com.oracle.cloud.spring.autoconfigure.core.CredentialsProviderAutoConfiguration} and
 * {@link com.oracle.cloud.spring.autoconfigure.core.RegionProviderAutoConfiguration}
 * for loading the Authentication configuration
 *
 * @see ChatModel
 * @see EmbeddingModel
 */
@AutoConfiguration
@ConditionalOnClass({ChatModel.class})
@EnableConfigurationProperties(GenAIProperties.class)
@ConditionalOnProperty(name = "spring.cloud.oci.genai.enabled", havingValue = "true", matchIfMissing = true)
public class GenAIAutoConfiguration {
    private final GenAIProperties properties;

    public GenAIAutoConfiguration(GenAIProperties properties) {
        this.properties = properties;
    }

    @Bean
    @RefreshScope
    @ConditionalOnProperty(name = "spring.cloud.oci.genai.embedding.enabled", havingValue = "true", matchIfMissing = true)
    @ConditionalOnMissingBean(EmbeddingModel.class)
    public EmbeddingModel embeddingModel(GenerativeAiInference generativeAiInference) {
        GenAIProperties.Embedding embedding = properties.getEmbedding();
        return EmbeddingModelImpl.builder()
                .client(generativeAiInference)
                .truncate(StringUtils.hasText(embedding.getTruncate()) ?
                        EmbedTextDetails.Truncate.valueOf(embedding.getTruncate()) :
                        EmbedTextDetails.Truncate.None)
                .compartment(embedding.getCompartment())
                .servingMode(servingMode(embedding.getOnDemandModelId(), embedding.getDedicatedClusterEndpoint()))
                .build();
    }

    @Bean
    @RefreshScope
    @ConditionalOnProperty(name = "spring.cloud.oci.genai.chat.enabled", havingValue = "true", matchIfMissing = true)
    @ConditionalOnMissingBean(ChatModel.class)
    public ChatModel chatModel(GenerativeAiInference generativeAiInference) {
        GenAIProperties.Chat chat = properties.getChat();
        return ChatModelImpl.builder()
                .client(generativeAiInference)
                .preambleOverride(chat.getPreambleOverride())
                .inferenceRequestType(chat.getInferenceRequestType())
                .servingMode(servingMode(chat.getOnDemandModelId(), chat.getDedicatedClusterEndpoint()))
                .topK(chat.getTopK())
                .topP(chat.getTopP())
                .compartment(chat.getCompartment())
                .frequencyPenalty(chat.getFrequencyPenalty())
                .presencePenalty(chat.getPresencePenalty())
                .temperature(chat.getTemperature())
                .build();
    }

    @Bean
    @RefreshScope
    @ConditionalOnMissingBean
    GenerativeAiInference genAIClient(@Qualifier(regionProviderQualifier) RegionProvider regionProvider,
                                      @Qualifier(credentialsProviderQualifier)
                                      CredentialsProvider cp) {
        GenerativeAiInference generativeAiInference = GenerativeAiInferenceClient.builder()
                .build(cp.getAuthenticationDetailsProvider());
        if (regionProvider.getRegion() != null) {
            generativeAiInference.setRegion(regionProvider.getRegion());
        }
        return generativeAiInference;
    }

    private ServingMode servingMode(String onDemandModelId, String dedicatedClusterEndpoint) {
        if (StringUtils.hasText(onDemandModelId)) {
            return OnDemandServingMode.builder().modelId(onDemandModelId).build();
        } else if (StringUtils.hasText(dedicatedClusterEndpoint)) {
            return DedicatedServingMode.builder().endpointId(dedicatedClusterEndpoint).build();
        }
        throw new IllegalArgumentException("One of spring.cloud.oci.genai.embedding.onDemandModelId or spring.cloud.oci.genai.embedding.dedicatedClusterEndpoint must be specified.");
    }
}
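
The `servingMode` helper above prefers the on-demand model ID, falls back to the dedicated cluster endpoint, and throws if neither has text. The following is a minimal, dependency-free sketch of that precedence. `ServingModeSketch`, `Kind`, and `Selection` are hypothetical stand-ins for the OCI SDK serving-mode types, and `hasText` only approximates Spring's `StringUtils.hasText`.

```java
// Dependency-free sketch of the serving-mode precedence.
// Kind and Selection are hypothetical stand-ins for the OCI SDK's
// OnDemandServingMode and DedicatedServingMode; they are not SDK types.
class ServingModeSketch {
    enum Kind { ON_DEMAND, DEDICATED }

    static final class Selection {
        final Kind kind;
        final String id;

        Selection(Kind kind, String id) {
            this.kind = kind;
            this.id = id;
        }
    }

    // Approximates Spring's StringUtils.hasText:
    // non-null and containing at least one non-whitespace character.
    static boolean hasText(String s) {
        return s != null && s.chars().anyMatch(c -> !Character.isWhitespace(c));
    }

    // On-demand model ID wins when both values are set;
    // the dedicated endpoint is used only as a fallback.
    static Selection select(String onDemandModelId, String dedicatedClusterEndpoint) {
        if (hasText(onDemandModelId)) {
            return new Selection(Kind.ON_DEMAND, onDemandModelId);
        }
        if (hasText(dedicatedClusterEndpoint)) {
            return new Selection(Kind.DEDICATED, dedicatedClusterEndpoint);
        }
        throw new IllegalArgumentException(
                "One of on-demand-model-id or dedicated-cluster-endpoint must be specified.");
    }

    public static void main(String[] args) {
        // Both set: the on-demand model ID takes precedence.
        System.out.println(select("model-a", "endpoint-b").kind);
        // Blank model ID: falls through to the dedicated endpoint.
        System.out.println(select(" ", "endpoint-b").kind);
    }
}
```

Because the helper is shared by the chat and embedding beans, the sketch uses a property-agnostic exception message, whereas the message in the diff names only the embedding properties.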