Commit 37ac3c7

Updating naming and support for multiple completions

1 parent b827ecf · commit 37ac3c7

2 files changed: +41 -13 lines changed

README.md

Lines changed: 27 additions & 6 deletions
````diff
@@ -120,24 +120,45 @@ func session(ctx context.Context, agent llm.Agent) error {
 You can add options to sessions, or to prompts. Different providers and models support
 different options.
 
+```go
+type Model interface {
+	// Set session-wide options
+	Context(...Opt) Context
+
+	// Add attachments (images, PDF's) to a user prompt
+	UserPrompt(string, ...Opt) Context
+
+	// Set embedding options
+	Embedding(context.Context, string, ...Opt) ([]float64, error)
+}
+
+type Context interface {
+	// Add single-use options when calling the model, which override
+	// session options. You can also attach files to a user prompt.
+	FromUser(context.Context, string, ...Opt) error
+}
+```
+
+The options are as follows:
+
 | Option | Ollama | Anthropic | Mistral | OpenAI | Description |
 |--------|--------|-----------|---------|--------|-------------|
 | `llm.WithTemperature(float64)` | Yes | Yes | Yes | - | What sampling temperature to use, between 0.0 and 1.0. Higher values like 0.7 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. |
 | `llm.WithTopP(float64)` | Yes | Yes | Yes | - | Nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. |
 | `llm.WithTopK(uint64)` | Yes | Yes | No | - | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. |
 | `llm.WithMaxTokens(uint64)` | - | Yes | Yes | - | The maximum number of tokens to generate in the response. |
-| `llm.WithStream(func(llm.ContextContent))` | Can be enabled when tools are not used | Yes | Yes | - | Stream the response to a function. |
+| `llm.WithStream(func(llm.Completion))` | Can be enabled when tools are not used | Yes | Yes | - | Stream the response to a function. |
 | `llm.WithToolChoice(string, string, ...)` | No | Yes | Use `auto`, `any`, `none`, `required` or a function name. Only the first argument is used. | - | The tool to use for the model. |
 | `llm.WithToolKit(llm.ToolKit)` | Cannot be combined with streaming | Yes | Yes | - | The set of tools to use. |
 | `llm.WithStopSequence(string, string, ...)` | Yes | Yes | Yes | - | Stop generation if one of these tokens is detected. |
 | `llm.WithSystemPrompt(string)` | No | Yes | Yes | - | Set the system prompt for the model. |
 | `llm.WithSeed(uint64)` | No | Yes | Yes | - | The seed to use for random sampling. If set, different calls will generate deterministic results. |
 | `llm.WithFormat(string)` | No | Yes | Use `json_format` or `text` | - | The format of the response. For Mistral, you must also instruct the model to produce JSON yourself with a system or a user message. |
-| `mistral.WithPresencePenalty(float64)` | - | - | Yes | - | Determines how much the model penalizes the repetition of words or phrases. A higher presence penalty encourages the model to use a wider variety of words and phrases, making the output more diverse and creative. |
-| `mistral.WithFequencyPenalty(float64)` | - | - | Yes | - | Penalizes the repetition of words based on their frequency in the generated text. A higher frequency penalty discourages the model from repeating words that have already appeared frequently in the output, promoting diversity and reducing repetition. |
-| `mistral.WithPrediction(string)` | - | - | Yes | - | Enable users to specify expected results, optimizing response times by leveraging known or predictable content. This approach is especially effective for updating text documents or code files with minimal changes, reducing latency while maintaining high-quality results. |
-| `llm.WithSafePrompt()` | - | - | Yes | - | Whether to inject a safety prompt before all conversations. |
-| `llm.WithNumCompletions(uint64)` | - | - | Yes | - | Number of completions to return for each request. |
+| `mistral.WithPresencePenalty(float64)` | No | No | Yes | - | Determines how much the model penalizes the repetition of words or phrases. A higher presence penalty encourages the model to use a wider variety of words and phrases, making the output more diverse and creative. |
+| `mistral.WithFequencyPenalty(float64)` | No | No | Yes | - | Penalizes the repetition of words based on their frequency in the generated text. A higher frequency penalty discourages the model from repeating words that have already appeared frequently in the output, promoting diversity and reducing repetition. |
+| `mistral.WithPrediction(string)` | No | No | Yes | - | Enable users to specify expected results, optimizing response times by leveraging known or predictable content. This approach is especially effective for updating text documents or code files with minimal changes, reducing latency while maintaining high-quality results. |
+| `llm.WithSafePrompt()` | No | No | Yes | - | Whether to inject a safety prompt before all conversations. |
+| `llm.WithNumCompletions(uint64)` | No | No | Yes | - | Number of completions to return for each request. |
 | `llm.WithAttachment(io.Reader)` | Yes | Yes | Yes | - | Attach a file to a user prompt. It is the responsibility of the caller to close the reader. |
 | `antropic.WithEphemeral()` | No | Yes | No | - | Attachments should be cached server-side |
 | `antropic.WithCitations()` | No | Yes | No | - | Attachments should be used in citations |
````
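Taken together, the `Model` and `Context` interfaces and the option functions in the table suggest the following usage pattern. This is a minimal sketch rather than code from the commit: the `ask` helper and the particular options chosen are illustrative, and it assumes the repository's `llm` package is imported alongside the standard `context` package.

```go
// Sketch only: session-wide options are set when the context is created,
// and single-use options passed to FromUser override them for that call.
func ask(ctx context.Context, model llm.Model, prompt string) error {
	session := model.Context(
		llm.WithTemperature(0.7), // session-wide sampling temperature
		llm.WithSystemPrompt("You are a concise assistant."),
	)

	// Per-prompt options apply to this call only and take precedence
	// over the session-wide options set above.
	return session.FromUser(ctx, prompt, llm.WithMaxTokens(512))
}
```

Per the table above, not every provider accepts every option (for example, Ollama does not support `llm.WithSystemPrompt`), so the combination shown here is only an example.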

context.go

Lines changed: 14 additions & 7 deletions
```diff
@@ -5,21 +5,28 @@ import "context"
 //////////////////////////////////////////////////////////////////
 // TYPES
 
-// ContextContent is the content of the last context message
-type ContextContent interface {
+// Completion is the content of the last context message
+type Completion interface {
+	// Return the number of completions, which is usually 1 unless
+	// WithNumCompletions was used when calling the model
+	Num() int
+
 	// Return the current session role, which can be system, assistant, user, tool, tool_result, ...
+	// If this is a completion, the role is usually 'assistant'
 	Role() string
 
-	// Return the current session text, or empty string if no text was returned
-	Text() string
+	// Return the text for the last completion. If multiple completions are not
+	// supported, the argument is ignored.
+	Text(int) string
 
-	// Return the current session tool calls, or empty if no tool calls were made
-	ToolCalls() []ToolCall
+	// Return the current session tool calls given the completion index.
+	// Will return nil if no tool calls were returned
+	ToolCalls(int) []ToolCall
 }
 
 // Context is fed to the agent to generate a response
 type Context interface {
-	ContextContent
+	Completion
 
 	// Generate a response from a user prompt (with attachments and
 	// other options)
```
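Because the updated `Context` interface embeds `Completion`, the indexed accessors can be read from a context after a call returns, or from the value passed to an `llm.WithStream` callback. The sketch below is not part of the commit: `printCompletions` and the `fmt` output are purely illustrative, and it assumes the `llm` and `fmt` packages are imported.

```go
// Sketch only: walk every completion in a response. Num() is usually 1
// unless llm.WithNumCompletions was set when calling the model.
func printCompletions(c llm.Completion) {
	for i := 0; i < c.Num(); i++ {
		fmt.Printf("completion %d (%s): %s\n", i, c.Role(), c.Text(i))
		for _, call := range c.ToolCalls(i) {
			fmt.Println("  tool call:", call)
		}
	}
}
```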
