Commit 447c0ca

Make all prompt templates configurable (#400)
* Make all prompt templates configurable

This PR enables all prompts to be configurable using Handlebars templates, as described in #382.

For backward compatibility, the old hardcoded templates are reimplemented as the default templates. The old template parameters such as `preprompt`, `userMessageToken`, `userMessageEndToken`, `assistantMessageToken`, and `assistantMessageEndToken` are now considered legacy. They still work because they are exposed as variables to the default templates. However, new prompt configurations should not use them, and it is recommended that the legacy support is eventually removed.

As an example, this is how the default chat prompt template is implemented:

```
{{preprompt}}
{{#each messages}}
  {{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
  {{#ifAssistant}}{{@root.assistantMessageToken}}{{content}}{{@root.assistantMessageEndToken}}{{/ifAssistant}}
{{/each}}
{{assistantMessageToken}}
```

In addition, this PR fixes an issue where the `model` configuration was used to generate the prompts in WebSearch while the `defaultModel` was used to query. This caused issues when `model` and `defaultModel` use different prompt configurations. WebSearch now always uses the `defaultModel`.

Note: while developing this PR, it was observed that the WebSearch prompts can violate typical model assumptions. For example, a query may be generated as:

```
Assistant: The following context ...
User: content ...
```

```
User: user message 1
User: user message 2
Assistant:
```

Models typically assume that prompts alternate User -> Assistant -> User. For best compatibility with existing configurations, this issue was not fixed. Instead, the old behavior is maintained with the default templates. This is also the reason why `defaultModel` was chosen as the WebSearch model instead of `model`: `defaultModel` may allow the WebSearch format, while `model` might not.

This behavior, as well as the overall architecture of chat-ui, required the template input to keep the format:

```
messages: [{from: 'user' | 'assistant', content: string }]
```

So that a template can detect which source a message comes from, an `ifUser` and an `ifAssistant` Handlebars block helper were implemented.

The format originally proposed in #382 was:

```
history: [{ user: string, assistant: string }]
```

However, using such a format would require significant changes to the project and would make it impossible to implement the existing WebSearch templates.

Finally, there may be minor differences in how truncation is implemented: in some cases, truncation is now applied to the entire prompt rather than to part of the prompt.

Fixes: #382

* Add Sagemaker support (#401)
* work on sagemaker support
* fix sagemaker integration
* remove unnecessary deps
* fix default endpoint
* remove unneeded deps, fixed types
* Use conditional validation for endpoints

  This was needed because the discriminated union couldn't handle the legacy case where `host` is undefined.

* add note in readme about aws sagemaker
* lint
* -summery +summary

---------

Co-authored-by: Nathan Sarrazin <sarrazin.nathan@gmail.com>
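
The default chat template quoted in the commit message can be read as plain string concatenation. The following is a minimal sketch of the string it is expected to expand to, written without Handlebars. The names `LegacyTokens`, `ChatMessage`, and `renderDefaultChatPrompt` are hypothetical and used only for illustration, and the token values are sample values rather than project defaults.

```typescript
// Hypothetical stand-in for the legacy prompt parameters exposed to the template.
interface LegacyTokens {
  preprompt: string;
  userMessageToken: string;
  userMessageEndToken: string;
  assistantMessageToken: string;
  assistantMessageEndToken: string;
}

// Same shape as the `messages` template input described in the commit message.
interface ChatMessage {
  from: "user" | "assistant";
  content: string;
}

// Plain-string equivalent of the default chat template: the preprompt, then each
// message wrapped in its role tokens, then an opening assistant token.
function renderDefaultChatPrompt(messages: ChatMessage[], t: LegacyTokens): string {
  const body = messages
    .map((m) =>
      m.from === "user"
        ? t.userMessageToken + m.content + t.userMessageEndToken
        : t.assistantMessageToken + m.content + t.assistantMessageEndToken
    )
    .join("");
  return t.preprompt + body + t.assistantMessageToken;
}

// Sample values, assumed for this sketch only.
const sampleTokens: LegacyTokens = {
  preprompt: "Be helpful.\n",
  userMessageToken: "<|prompter|>",
  userMessageEndToken: "<|endoftext|>",
  assistantMessageToken: "<|assistant|>",
  assistantMessageEndToken: "<|endoftext|>",
};

const samplePrompt = renderDefaultChatPrompt(
  [
    { from: "user", content: "Hi" },
    { from: "assistant", content: "Hello!" },
    { from: "user", content: "How are you?" },
  ],
  sampleTokens
);

console.log(samplePrompt);
```

The trailing assistant token is what cues the model to produce the next assistant turn.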
1 parent bea3bcf commit 447c0ca

10 files changed, +245 -65 lines changed


README.md

Lines changed: 68 additions & 4 deletions
@@ -120,9 +120,8 @@ MODELS=`[
     "websiteUrl": "https://open-assistant.io",
     "userMessageToken": "<|prompter|>", # This does not need to be a token, can be any string
     "assistantMessageToken": "<|assistant|>", # This does not need to be a token, can be any string
-    "messageEndToken": "<|endoftext|>", # This does not need to be a token, can be any string
-    # "userMessageEndToken": "", # Applies only to user messages, messageEndToken has no effect if specified. Can be any string.
-    # "assistantMessageEndToken": "", # Applies only to assistant messages, messageEndToken has no effect if specified. Can be any string.
+    "userMessageEndToken": "<|endoftext|>", # Applies only to user messages. Can be any string.
+    "assistantMessageEndToken": "<|endoftext|>", # Applies only to assistant messages. Can be any string.
     "preprompt": "Below are a series of dialogues between various people and an AI assistant. The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable. The assistant is happy to help with almost anything, and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn't entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn't let caution get too much in the way of being useful.\n-----\n",
     "promptExamples": [
       {
@@ -152,7 +151,72 @@ MODELS=`[

 You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.

-### Running your own models using a custom endpoint
+#### Custom prompt templates:
+
+By default, the prompt is constructed using the `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, and `preprompt` parameters and a series of default templates.
+
+However, these templates can be modified by setting the `chatPromptTemplate`, `webSearchSummaryPromptTemplate`, and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is Handlebars (https://handlebarsjs.com). The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if you specify the templates, it is recommended to inline the prompt parameters, since referencing them (for example `{{preprompt}}`) is deprecated.
+
+For example:
+
+```
+<System>You are an AI, called ChatAI.</System>
+{{#each messages}}
+  {{#ifUser}}<User>{{content}}</User>{{/ifUser}}
+  {{#ifAssistant}}<Assistant>{{content}}</Assistant>{{/ifAssistant}}
+{{/each}}
+<Assistant>
+```
+
+**chatPromptTemplate**
+
+When querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages with the format `[{ content: string }, ...]`. To identify whether a message is a user message or an assistant message, the `ifUser` and `ifAssistant` block helpers can be used.
+
+The following is the default `chatPromptTemplate`, although newlines and indentation have been added for readability.
+
+```
+{{preprompt}}
+{{#each messages}}
+  {{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
+  {{#ifAssistant}}{{@root.assistantMessageToken}}{{content}}{{@root.assistantMessageEndToken}}{{/ifAssistant}}
+{{/each}}
+{{assistantMessageToken}}
+```
+
+**webSearchQueryPromptTemplate**
+
+When performing a web search, the search query is constructed using the `webSearchQueryPromptTemplate` template. It is recommended that the prompt instructs the chat model to return only a few keywords.
+
+The following is the default `webSearchQueryPromptTemplate`. Note that this template uses consecutive user messages, which not all models support.
+
+```
+{{userMessageToken}}
+  The following messages were written by a user, trying to answer a question.
+{{userMessageEndToken}}
+{{#each messages}}
+  {{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
+{{/each}}
+{{userMessageToken}}
+  What plain-text english sentence would you input into Google to answer the last question? Answer with a short (10 words max) simple sentence.
+{{userMessageEndToken}}
+{{assistantMessageToken}}Query:
+```
+
+**webSearchSummaryPromptTemplate**
+
+The search-engine response (`answer`) is summarized using the following prompt template. However, when `HF_ACCESS_TOKEN` is provided, a dedicated summary model is used instead. The model's `query` response to `webSearchQueryPromptTemplate` is also available to this template.
+
+The following is the default `webSearchSummaryPromptTemplate`. Note that this template uses consecutive user messages, which not all models support.
+
+```
+{{userMessageToken}}{{answer}}{{userMessageEndToken}}
+{{userMessageToken}}
+  The text above should be summarized to best answer the query: {{query}}.
+{{userMessageEndToken}}
+{{assistantMessageToken}}Summary:
+```
+
+#### Running your own models using a custom endpoint

 If you want to, instead of hitting models on the Hugging Face Inference API, you can run your own models locally.

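
The README note about consecutive user messages is easier to see on a rendered prompt. The sketch below is a hand-written equivalent of the default `webSearchQueryPromptTemplate`, written without Handlebars. The names `QueryMessage`, `QueryTokens`, and `renderDefaultQueryPrompt` are hypothetical, and the `User: ` / `Assistant: ` tokens are sample values chosen for readability.

```typescript
interface QueryMessage {
  from: "user" | "assistant";
  content: string;
}

// Hypothetical subset of the legacy prompt parameters used by the query template.
interface QueryTokens {
  userMessageToken: string;
  userMessageEndToken: string;
  assistantMessageToken: string;
}

// Plain-string equivalent of the default query template: an instruction turn,
// every user message (assistant messages are skipped), a question turn,
// and an opening assistant turn that starts with "Query: ".
function renderDefaultQueryPrompt(messages: QueryMessage[], t: QueryTokens): string {
  const wrapUser = (text: string): string => t.userMessageToken + text + t.userMessageEndToken;
  const userTurns = messages
    .filter((m) => m.from === "user")
    .map((m) => wrapUser(m.content))
    .join("");
  return (
    wrapUser("The following messages were written by a user, trying to answer a question.") +
    userTurns +
    wrapUser(
      "What plain-text english sentence would you input into Google to answer the last question? Answer with a short (10 words max) simple sentence."
    ) +
    t.assistantMessageToken +
    "Query: "
  );
}

// Sample tokens, assumed for this sketch only.
const queryTokens: QueryTokens = {
  userMessageToken: "User: ",
  userMessageEndToken: "\n",
  assistantMessageToken: "Assistant: ",
};

const queryPrompt = renderDefaultQueryPrompt(
  [
    { from: "user", content: "What is Handlebars?" },
    { from: "assistant", content: "A templating language." },
    { from: "user", content: "Who maintains it?" },
  ],
  queryTokens
);

console.log(queryPrompt);
```

With two user messages in the history, the rendered prompt contains four user turns in a row before the assistant turn, which is the pattern some chat models may not accept.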
package-lock.json

Lines changed: 51 additions & 1 deletion
Some generated files are not rendered by default.

package.json

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@
     "aws4fetch": "^1.0.17",
     "date-fns": "^2.29.3",
     "dotenv": "^16.0.3",
+    "handlebars": "^4.7.8",
     "highlight.js": "^11.7.0",
     "jsdom": "^22.0.0",
     "marked": "^4.3.0",

src/lib/buildPrompt.ts

Lines changed: 14 additions & 30 deletions
@@ -13,24 +13,6 @@ export async function buildPrompt(
   model: BackendModel,
   webSearchId?: string
 ): Promise<string> {
-  const userEndToken = model.userMessageEndToken ?? model.messageEndToken;
-  const assistantEndToken = model.assistantMessageEndToken ?? model.messageEndToken;
-
-  const prompt =
-    messages
-      .map((m) =>
-        m.from === "user"
-          ? model.userMessageToken +
-            m.content +
-            (m.content.endsWith(userEndToken) ? "" : userEndToken)
-          : model.assistantMessageToken +
-            m.content +
-            (m.content.endsWith(assistantEndToken) ? "" : assistantEndToken)
-      )
-      .join("") + model.assistantMessageToken;
-
-  let webPrompt = "";
-
   if (webSearchId) {
     const webSearch = await collections.webSearches.findOne({
       _id: new ObjectId(webSearchId),
@@ -39,20 +21,22 @@ export async function buildPrompt(
     if (!webSearch) throw new Error("Web search not found");

     if (webSearch.summary) {
-      webPrompt =
-        model.assistantMessageToken +
-        `The following context was found while searching the internet: ${webSearch.summary}` +
-        model.assistantMessageEndToken;
+      messages = [
+        {
+          from: "assistant",
+          content: `The following context was found while searching the internet: ${webSearch.summary}`,
+        },
+        ...messages,
+      ];
     }
   }
-  const finalPrompt =
-    model.preprompt +
-    webPrompt +
-    prompt
+
+  return (
+    model
+      .chatPromptRender({ messages })
+      // Not super precise, but it's truncated in the model's backend anyway
       .split(" ")
       .slice(-(model.parameters?.truncate ?? 0))
-      .join(" ");
-
-  // Not super precise, but it's truncated in the model's backend anyway
-  return finalPrompt;
+      .join(" ")
+  );
 }
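
The two behavioral changes in this file can be illustrated in isolation: the search summary becomes a leading assistant message, and truncation now covers the whole rendered prompt instead of only the chat part. This is a minimal sketch under assumed names (`PromptMessage`, `prependSearchContext`, `truncateTail`, and the sample strings are hypothetical); it does not call the project's `chatPromptRender`.

```typescript
interface PromptMessage {
  from: "user" | "assistant";
  content: string;
}

// Put the web search summary in front of the conversation as an assistant turn,
// mirroring the array spread in the diff above.
function prependSearchContext(messages: PromptMessage[], summary: string): PromptMessage[] {
  return [
    {
      from: "assistant",
      content: `The following context was found while searching the internet: ${summary}`,
    },
    ...messages,
  ];
}

// Keep only the last `limit` space-separated words.
// A limit of 0 keeps everything, because slice(-0) is the same as slice(0).
function truncateTail(text: string, limit: number): string {
  return text.split(" ").slice(-limit).join(" ");
}

const demoPreprompt = "system words here ";
const demoChat = "one two three four";

// Old behavior: only the chat part was truncated, so the preprompt always survived.
const partlyTruncated = demoPreprompt + truncateTail(demoChat, 2);

// New behavior: the whole rendered prompt is truncated, so the preprompt can be cut off.
const fullyTruncated = truncateTail(demoPreprompt + demoChat, 2);

console.log(partlyTruncated);
console.log(fullyTruncated);
console.log(prependSearchContext([{ from: "user", content: "hi" }], "a short summary"));
```

Because the search context is inserted as the first message, the default template renders it as an assistant turn that precedes the first user turn, which is the `Assistant: ... User: ...` ordering mentioned in the commit message.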

src/lib/server/models.ts

Lines changed: 50 additions & 2 deletions
@@ -1,4 +1,10 @@
 import { HF_ACCESS_TOKEN, MODELS, OLD_MODELS } from "$env/static/private";
+import type {
+  ChatTemplateInput,
+  WebSearchQueryTemplateInput,
+  WebSearchSummaryTemplateInput,
+} from "$lib/types/Template";
+import { compileTemplate } from "$lib/utils/template";
 import { z } from "zod";

 const sagemakerEndpoint = z.object({
@@ -46,13 +52,46 @@ const modelsRaw = z
       modelUrl: z.string().url().optional(),
       datasetName: z.string().min(1).optional(),
       datasetUrl: z.string().url().optional(),
-      userMessageToken: z.string(),
+      userMessageToken: z.string().default(""),
       userMessageEndToken: z.string().default(""),
-      assistantMessageToken: z.string(),
+      assistantMessageToken: z.string().default(""),
       assistantMessageEndToken: z.string().default(""),
       messageEndToken: z.string().default(""),
       preprompt: z.string().default(""),
       prepromptUrl: z.string().url().optional(),
+      chatPromptTemplate: z
+        .string()
+        .default(
+          "{{preprompt}}" +
+            "{{#each messages}}" +
+            "{{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}" +
+            "{{#ifAssistant}}{{@root.assistantMessageToken}}{{content}}{{@root.assistantMessageEndToken}}{{/ifAssistant}}" +
+            "{{/each}}" +
+            "{{assistantMessageToken}}"
+        ),
+      webSearchSummaryPromptTemplate: z
+        .string()
+        .default(
+          "{{userMessageToken}}{{answer}}{{userMessageEndToken}}" +
+            "{{userMessageToken}}" +
+            "The text above should be summarized to best answer the query: {{query}}." +
+            "{{userMessageEndToken}}" +
+            "{{assistantMessageToken}}Summary: "
+        ),
+      webSearchQueryPromptTemplate: z
+        .string()
+        .default(
+          "{{userMessageToken}}" +
+            "The following messages were written by a user, trying to answer a question." +
+            "{{userMessageEndToken}}" +
+            "{{#each messages}}" +
+            "{{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}" +
+            "{{/each}}" +
+            "{{userMessageToken}}" +
+            "What plain-text english sentence would you input into Google to answer the last question? Answer with a short (10 words max) simple sentence." +
+            "{{userMessageEndToken}}" +
+            "{{assistantMessageToken}}Query: "
+        ),
       promptExamples: z
         .array(
           z.object({
@@ -80,6 +119,15 @@ export const models = await Promise.all(
     ...m,
     userMessageEndToken: m?.userMessageEndToken || m?.messageEndToken,
     assistantMessageEndToken: m?.assistantMessageEndToken || m?.messageEndToken,
+    chatPromptRender: compileTemplate<ChatTemplateInput>(m.chatPromptTemplate, m),
+    webSearchSummaryPromptRender: compileTemplate<WebSearchSummaryTemplateInput>(
+      m.webSearchSummaryPromptTemplate,
+      m
+    ),
+    webSearchQueryPromptRender: compileTemplate<WebSearchQueryTemplateInput>(
+      m.webSearchQueryPromptTemplate,
+      m
+    ),
     id: m.id || m.name,
     displayName: m.displayName || m.name,
     preprompt: m.prepromptUrl ? await fetch(m.prepromptUrl).then((r) => r.text()) : m.preprompt,
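
The context lines in this hunk resolve the per-role end tokens with `||` rather than `??`. Since the schema defaults every token to an empty string, an omitted end token is `""` and not `undefined`, so only `||` falls back to the legacy `messageEndToken`. The sketch below shows the difference; `RawEndTokens`, `resolveEndTokens`, and `nullishFallback` are hypothetical names, and the sample config is an assumption.

```typescript
// Hypothetical shape of the parsed end-token fields, after schema defaults are applied.
interface RawEndTokens {
  userMessageEndToken: string;
  assistantMessageEndToken: string;
  messageEndToken: string;
}

// Same fallback rule as in the hunk above: an empty per-role token falls back
// to the shared legacy token.
function resolveEndTokens(m: RawEndTokens): { userMessageEndToken: string; assistantMessageEndToken: string } {
  return {
    userMessageEndToken: m.userMessageEndToken || m.messageEndToken,
    assistantMessageEndToken: m.assistantMessageEndToken || m.messageEndToken,
  };
}

// For comparison: `??` only falls back on null or undefined, so "" is kept.
function nullishFallback(value: string | undefined, fallback: string): string {
  return value ?? fallback;
}

// A legacy-style config that only sets the shared end token.
const legacyConfig: RawEndTokens = {
  userMessageEndToken: "",
  assistantMessageEndToken: "",
  messageEndToken: "<|endoftext|>",
};

const resolvedTokens = resolveEndTokens(legacyConfig);
const keptEmpty = nullishFallback(legacyConfig.userMessageEndToken, legacyConfig.messageEndToken);

console.log(resolvedTokens);
console.log(JSON.stringify(keptEmpty));
```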

src/lib/server/websearch/generateQuery.ts

Lines changed: 3 additions & 15 deletions
@@ -1,21 +1,9 @@
 import type { Message } from "$lib/types/Message";
 import { generateFromDefaultEndpoint } from "../generateFromDefaultEndpoint";
-import type { BackendModel } from "../models";
-
-export async function generateQuery(messages: Message[], model: BackendModel) {
-  const promptSearchQuery =
-    model.userMessageToken +
-    "The following messages were written by a user, trying to answer a question." +
-    model.userMessageEndToken +
-    messages
-      .filter((message) => message.from === "user")
-      .map((message) => model.userMessageToken + message.content + model.userMessageEndToken) +
-    model.userMessageToken +
-    "What plain-text english sentence would you input into Google to answer the last question? Answer with a short (10 words max) simple sentence." +
-    model.userMessageEndToken +
-    model.assistantMessageToken +
-    "Query: ";
+import { defaultModel } from "../models";

+export async function generateQuery(messages: Message[]) {
+  const promptSearchQuery = defaultModel.webSearchQueryPromptRender({ messages });
   const searchQuery = await generateFromDefaultEndpoint(promptSearchQuery).then((query) => {
     const arr = query.split(/\r?\n/);
     return arr[0].length > 0 ? arr[0] : arr[1];
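
The unchanged lines at the end of this hunk take the first non-empty line of the model's response as the search query, checking at most the first two lines. A minimal standalone sketch of that selection follows. `firstNonEmptyLine` is a hypothetical helper name, and unlike the code in the diff it falls back to an empty string when the second line does not exist.

```typescript
// Pick the first line of the response, or the second one if the first is empty.
// Handles both "\n" and "\r\n" line endings. Lines after the second are ignored.
function firstNonEmptyLine(response: string): string {
  const [first = "", second = ""] = response.split(/\r?\n/);
  return first.length > 0 ? first : second;
}

console.log(firstNonEmptyLine("best pizza in Paris\nextra explanation"));
console.log(firstNonEmptyLine("\r\ncats in space"));
```

This tolerates a model that starts its answer with a line break, but a response with two leading empty lines would still yield an empty query.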

src/lib/server/websearch/summarizeWeb.ts

Lines changed: 7 additions & 12 deletions
@@ -1,7 +1,8 @@
 import { HF_ACCESS_TOKEN } from "$env/static/private";
 import { HfInference } from "@huggingface/inference";
-import { generateFromDefaultEndpoint } from "../generateFromDefaultEndpoint";
+import { defaultModel } from "$lib/server/models";
 import type { BackendModel } from "../models";
+import { generateFromDefaultEndpoint } from "../generateFromDefaultEndpoint";

 export async function summarizeWeb(content: string, query: string, model: BackendModel) {
   // if HF_ACCESS_TOKEN is set, we use a HF dedicated endpoint for summarization
@@ -23,19 +24,13 @@ export async function summarizeWeb(content: string, query: string, model: Backen
   }

   // else we use the LLM to generate a summary
-  const summaryPrompt =
-    model.userMessageToken +
-    content
+  const summaryPrompt = defaultModel.webSearchSummaryPromptRender({
+    answer: content
       .split(" ")
       .slice(0, model.parameters?.truncate ?? 0)
-      .join(" ") +
-    model.userMessageEndToken +
-    model.userMessageToken +
-    `The text above should be summarized to best answer the query: ${query}.` +
-    model.userMessageEndToken +
-    model.assistantMessageToken +
-    "Summary: ";
-
+      .join(" "),
+    query: query,
+  });
   const summary = await generateFromDefaultEndpoint(summaryPrompt).then((txt: string) =>
     txt.trim()
   );
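
Here the page content is truncated from the front, keeping the first `truncate` words, whereas `buildPrompt` keeps the last ones. One consequence of `slice(0, truncate ?? 0)` is worth noting: if `truncate` is unset, the slice is empty and the `answer` passed to the template is an empty string. The sketch below isolates that expression; `truncateHead` is a hypothetical helper name.

```typescript
// Keep only the first `limit` space-separated words.
// An undefined limit is treated as 0, which keeps nothing.
function truncateHead(text: string, limit: number | undefined): string {
  return text.split(" ").slice(0, limit ?? 0).join(" ");
}

const pageText = "alpha beta gamma delta";

console.log(JSON.stringify(truncateHead(pageText, 2)));
console.log(JSON.stringify(truncateHead(pageText, undefined)));
```

A configuration that relies on this path would therefore need `truncate` to be set for the summary prompt to contain any page content.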
