Skip to content

Commit ebac87f

Browse files
Mishigjulien-cgary149
authored
[Websearch] update (#427)
* Fix reuqest body * update webSearchQueryPromptTemplate * update generate google query parser * Add today's date to google search query creator * crawl top stories if exts; remove answer_box & knowledgeGraph * Create paragraph chunks from top articles * flattened paragprah chunks * update status texts * add gradio client * call gradio app for RAG * Web scrape only "p, li, span" els * add MAX_N_CHUNKS * gradio result typing * parse only <p> elements * rm dev change * update typing WebSearch * buld RAG prompt * Rm dev change * change websearch context msg from user to assisntat type * use hosted gradio app * fix lint * prompt engineering * more prompt engineering * MAX_N_PAGES_SCRAPE = 10 * better error msg * more prompt engineering * revert websearch prompt to previous * rm `top_stories` from websearch as the results are not good * Stop using gradio client, use regular fetch * chore * Rm websearchsummary references as it is no longer used * update readme * Apply suggestions from code review Co-authored-by: Julien Chaumond <julien@huggingface.co> * Use tfjs to do embeddings in server node * fix websearch component disapperar after finishing generation * Show sources of closest embeddings used in RAG * fix prompting and also add current date * add comment * comment for search query * sources * hide www * using hostname direclty * Show successful web pages instead of failed ones * rm noisy messages * google query generation using previous messaages as context * handle falcon generation * bring back Browsing webpage msg --------- Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Victor Mustar <victor.mustar@gmail.com>
1 parent 0953d85 commit ebac87f

File tree

19 files changed

+867
-199
lines changed

19 files changed

+867
-199
lines changed

README.md

Lines changed: 5 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -155,7 +155,7 @@ You can change things like the parameters, or customize the preprompt to better
155155

156156
By default the prompt is constructed using `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, `preprompt` parameters and a series of default templates.
157157

158-
However, these templates can be modified by setting the `chatPromptTemplate`, `webSearchSummaryPromptTemplate`, and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is https://handlebarsjs.com. The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if the templates are specified it is recommended to inline the prompt parameters, as using the references (`{{preprompt}}`) is deprecated.
158+
However, these templates can be modified by setting the `chatPromptTemplate` and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is https://handlebarsjs.com. The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if the templates are specified it is recommended to inline the prompt parameters, as using the references (`{{preprompt}}`) is deprecated.
159159

160160
For example:
161161

@@ -187,33 +187,14 @@ The following is the default `chatPromptTemplate`, although newlines and indenti
187187

188188
When performing a websearch, the search query is constructed using the `webSearchQueryPromptTemplate` template. It is recommended that that the prompt instructs the chat model to only return a few keywords.
189189

190-
The following is the default `webSearchQueryPromptTemplate`. Note that not all models supports consecutive user-messages which this template uses.
190+
The following is the default `webSearchQueryPromptTemplate`.
191191

192192
```
193193
{{userMessageToken}}
194-
The following messages were written by a user, trying to answer a question.
194+
My question is: {{message.content}}.
195+
Based on the conversation history (my previous questions are: {{previousMessages}}), give me an appropriate query to answer my question for google search. You should not say more than query. You should not say any words except the query. For the context, today is {{currentDate}}
195196
{{userMessageEndToken}}
196-
{{#each messages}}
197-
{{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
198-
{{/each}}
199-
{{userMessageToken}}
200-
What plain-text english sentence would you input into Google to answer the last question? Answer with a short (10 words max) simple sentence.
201-
{{userMessageEndToken}}
202-
{{assistantMessageToken}}Query:
203-
```
204-
205-
**webSearchSummaryPromptTemplate**
206-
207-
The search-engine response (`answer`) is summarized using the following prompt template. However, when `HF_ACCESS_TOKEN` is provided, a dedicated summary model is used instead. Additionally, the model's `query` response to `webSearchQueryPromptTemplate` is also available to this template.
208-
209-
The following is the default `webSearchSummaryPromptTemplate`. Note that not all models supports consecutive user-messages which this template uses.
210-
211-
```
212-
{{userMessageToken}}{{answer}}{{userMessageEndToken}}
213-
{{userMessageToken}}
214-
The text above should be summarized to best answer the query: {{query}}.
215-
{{userMessageEndToken}}
216-
{{assistantMessageToken}}Summary:
197+
{{assistantMessageToken}}
217198
```
218199

219200
#### Running your own models using a custom endpoint

0 commit comments

Comments
 (0)