 = Chat Client API
 
 The `ChatClient` offers a fluent API for communicating with an AI Model.
-It supports both a synchronous and streaming programming model.
+It supports both a synchronous and streaming programming model.
+
+[NOTE]
+====
+See the xref:api/chatclient.adoc#_implementation_notes[Implementation Notes] at the bottom of this document related to the combined use of imperative and reactive programming models in `ChatClient`.
+====
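The synchronous and streaming entry points can be sketched as follows. This is a hedged sketch against the Spring AI `ChatClient` fluent API; the service class, the injected `ChatClient.Builder`, and the prompt text are illustrative, not part of the original document:

```java
// Sketch: synchronous vs. streaming use of the ChatClient fluent API.
// Assumes Spring AI auto-configures an injectable ChatClient.Builder.
import org.springframework.ai.chat.client.ChatClient;
import reactor.core.publisher.Flux;

class ChatService {

    private final ChatClient chatClient;

    ChatService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // Synchronous: blocks until the full response is available.
    String ask(String question) {
        return this.chatClient.prompt()
                .user(question)
                .call()     // blocking call
                .content(); // the response text
    }

    // Streaming: returns the response as a reactive stream of chunks.
    Flux<String> askStreaming(String question) {
        return this.chatClient.prompt()
                .user(question)
                .stream()   // reactive call
                .content(); // Flux of response text chunks
    }
}
```

Note how the only difference between the two paths is `call()` versus `stream()`; the rest of the fluent chain is identical.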
 
 The fluent API has methods for building up the constituent parts of a xref:api/prompt.adoc#_prompt[Prompt] that is passed to the AI model as input.
 The `Prompt` contains the instructional text to guide the AI model's output and behavior. From the API point of view, prompts consist of a collection of messages.
@@ -620,3 +625,21 @@ There is currently one built-in implementation: `MessageWindowChatMemory`.
 The `MessageWindowChatMemory` is backed by the `ChatMemoryRepository` abstraction which provides storage implementations for the chat conversation memory. There are several implementations available, including the `InMemoryChatMemoryRepository`, `JdbcChatMemoryRepository`, `CassandraChatMemoryRepository`, and `Neo4jChatMemoryRepository`.
 
 For more details and usage examples, see the xref:api/chat-memory.adoc[Chat Memory] documentation.
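Wiring a `MessageWindowChatMemory` onto one of these repositories can be sketched as below. This is a hedged sketch against the Spring AI chat-memory API; the configuration class and the window size of 10 are illustrative assumptions:

```java
// Sketch: a MessageWindowChatMemory backed by the in-memory repository.
// Any other ChatMemoryRepository implementation (JDBC, Cassandra, Neo4j)
// could be substituted for InMemoryChatMemoryRepository here.
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.InMemoryChatMemoryRepository;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;

class MemoryConfig {

    ChatMemory chatMemory() {
        return MessageWindowChatMemory.builder()
                .chatMemoryRepository(new InMemoryChatMemoryRepository())
                .maxMessages(10) // keep only the most recent 10 messages (illustrative value)
                .build();
    }
}
```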
 
+== Implementation Notes
+
+The combined use of imperative and reactive programming models in `ChatClient` is a unique aspect of the API.
+Often an application will be either reactive or imperative, but not both.
+
+* When customizing the HTTP client interactions of a Model implementation, both the `RestClient` and the `WebClient` must be configured.
+
+[IMPORTANT]
+====
+Due to a bug in Spring Boot 3.4, the `spring.http.client.factory=jdk` property must be set. Otherwise, it defaults to `reactor`, which breaks certain AI workflows such as the `ImageModel`.
+====
+
+* Streaming is supported only via the Reactive stack. For this reason, imperative applications must include the Reactive stack (e.g. `spring-boot-starter-webflux`).
+* Non-streaming is supported only via the Servlet stack. For this reason, reactive applications must include the Servlet stack (e.g. `spring-boot-starter-web`) and should expect some calls to be blocking.
+* Tool calling is imperative, leading to blocking workflows. This also results in partial or interrupted Micrometer observations (e.g. the `ChatClient` spans and the tool-calling spans are not connected, and the first remains incomplete for that reason).
+* The built-in advisors perform blocking operations for standard (non-streaming) calls, and non-blocking operations for streaming calls. The Reactor `Scheduler` used for the advisor streaming calls can be configured via the `Builder` on each advisor class.
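The first bullet above, configuring both HTTP clients, can be sketched with Spring Boot's customizer callbacks. This is a hedged sketch, not the definitive setup; the configuration class name and the header values are illustrative assumptions:

```java
// Sketch: customizing both the imperative RestClient and the reactive WebClient,
// since a Model implementation may use either depending on the call style.
import org.springframework.boot.web.client.RestClientCustomizer;
import org.springframework.boot.web.reactive.function.client.WebClientCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class HttpClientConfig {

    // Applied to RestClient.Builder instances (non-streaming, blocking calls).
    @Bean
    RestClientCustomizer restClientCustomizer() {
        return builder -> builder.defaultHeader("X-Example", "demo"); // illustrative header
    }

    // Applied to WebClient.Builder instances (streaming, reactive calls).
    @Bean
    WebClientCustomizer webClientCustomizer() {
        return builder -> builder.defaultHeader("X-Example", "demo"); // illustrative header
    }
}
```

Customizing only one of the two builders would leave the other call style (streaming or non-streaming) on its default configuration, which is the pitfall the bullet warns about.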