Conversation

robertjdominguez (Collaborator)

Description
Description

  • Optimized DocsBot response time by prioritizing existing conversation artifacts over fresh documentation searches
  • Added mandatory context checking to prevent unnecessary embedding API calls
  • Implemented two-tier workflow that only fetches new documentation when existing context is insufficient

Previously, the bot always performed expensive embedding searches regardless of conversation history:

````yaml path=pql/globals/metadata/promptql-config.hml mode=EXCERPT
## Response Workflow
1. Transform user query into embedding
2. Find 3-5 most relevant chunks
3. Extract the single most direct answer path
4. Provide minimal response that solves the immediate problem
5. Include relevant documentation link
````
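The always-search behavior amounts to the following sketch. Both helper functions are illustrative stand-ins (assumptions, not the real implementations): in production, `transform_query_into_embedding` calls an embedding API and the chunk lookup runs a vector-distance query.

```python
def transform_query_into_embedding(query: str) -> list[float]:
    # Stand-in: the real function makes an embedding API call.
    return [float(len(word)) for word in query.split()]

def find_relevant_chunks(embedding: list[float], k: int = 5) -> list[str]:
    # Stand-in for the vector-distance search over documentation chunks.
    corpus = ["chunk about auth", "chunk about queries", "chunk about config"]
    return corpus[:k]

def answer(query: str) -> str:
    # Every question, including a simple follow-up, pays for a fresh
    # embedding call and vector search before any answer is produced.
    embedding = transform_query_into_embedding(query)
    chunks = find_relevant_chunks(embedding, k=5)
    return f"answer based on {len(chunks)} chunks"
```

The cost is independent of conversation state: there is no branch that consults what the bot has already fetched.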

Now, the bot checks existing artifacts first, dramatically reducing response latency for follow-up questions:

````yaml path=pql/globals/metadata/promptql-config.hml mode=EXCERPT
### Step 1: Check Existing Context First (MANDATORY)
Before fetching new documentation:
1. **ALWAYS check if any existing artifacts contain information relevant to the user's question**
2. **ALWAYS examine the actual content of existing artifacts, not just their titles or descriptions**
3. **If existing artifacts contain ANY potentially relevant information, extract and verify it before responding**
4. **Only proceed to Step 2 if existing context is completely insufficient or the question requires new information**

### Step 2: Fetch New Context (Only When Needed)
When existing artifacts don't contain sufficient information:
1. Transform the user's query into embedding using transform_query_into_embedding
2. Use the app_embeddings_vector_distance function to find the 10 most relevant documentation chunks
````
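The two-tier workflow above can be sketched as follows. The relevance check here is a trivial keyword overlap, which is an assumption for illustration only (in practice the model itself judges relevance and sufficiency), and the two stub functions stand in for the real `transform_query_into_embedding` and `app_embeddings_vector_distance` calls.

```python
def transform_query_into_embedding(query: str) -> list[float]:
    # Stand-in for the real embedding API call.
    return [float(len(w)) for w in query.split()]

def app_embeddings_vector_distance(embedding: list[float], k: int) -> list[str]:
    # Stand-in for the vector-distance search over documentation chunks.
    return [f"chunk-{i}" for i in range(k)]

def answer(query: str, artifacts: list[dict]) -> str:
    """Step 1: check existing artifact *content*; Step 2 only if needed."""
    words = set(query.lower().split())
    relevant = [a for a in artifacts
                if words & set(a["content"].lower().split())]
    if relevant:
        # Sufficient local context: no embedding API call at all.
        return f"answered from {len(relevant)} existing artifact(s)"
    # Existing context is insufficient: run the full embedding pipeline.
    embedding = transform_query_into_embedding(query)
    chunks = app_embeddings_vector_distance(embedding, k=10)
    return f"answered from {len(chunks)} fresh chunks"
```

Note that Step 1 inspects `a["content"]`, not artifact titles, mirroring the config's requirement to examine actual content before deciding the context is insufficient.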

This performance optimization is particularly impactful for multi-turn conversations where users ask clarifying questions about topics already covered. Instead of re-running the full embedding pipeline (`transform_query_into_embedding` → `embeddings_vector_distance` → content retrieval), the bot can instantly reference previously fetched documentation artifacts.

The change also includes hallucination prevention protocols that preserve accuracy alongside the speed benefits: the bot must verify technical details against existing context before making assumptions, but this verification runs against local artifacts rather than requiring new API calls.
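Mechanically, local verification could look like the following hypothetical sketch. All names here are illustrative assumptions, not part of the PR; the point is only that the check reads cached artifact text instead of issuing a new search.

```python
def verify_claim(claim: str, artifacts: list[str]) -> bool:
    # A claim counts as locally verified only if every key term
    # appears somewhere in the cached artifact text; otherwise the
    # bot should hedge or fall back to a fresh documentation search.
    terms = [t for t in claim.lower().split() if len(t) > 3]
    return all(any(t in a.lower() for a in artifacts) for t in terms)
```

Because the loop touches only in-memory strings, the accuracy check adds no network latency on the fast path.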

github-actions bot commented Jul 26, 2025

🚀 PromptQL Build Complete

Build Version: 0b40642aa8
Project: pql-docs
PromptQL Playground: Open Playground

Description: PR #9: PQL: Improve response performance by shortcutting follow-up questions

@robertjdominguez robertjdominguez merged commit d5627d0 into main Jul 26, 2025
1 check passed
@robertjdominguez robertjdominguez deleted the rob/pql/improve-response-performance branch July 26, 2025 15:03
✅ PromptQL Build Applied

Build Version: 0b40642aa8
Status: Successfully applied to production
Applied at: 2025-07-26T15:04:19.049Z

robertjdominguez added a commit that referenced this pull request Aug 10, 2025
…#9)

