Add ChatContextOptions to ChatOptions #347

Merged
austin-denoble merged 3 commits into main from adenoble/fix-assistant-data-options on May 30, 2025

Conversation

austin-denoble (Contributor) commented May 30, 2025

Problem

A change was made to the Assistant chat interface in the last release that was not captured in the TypeScript client: topK and snippetSize have been moved into a context_options value: https://docs.pinecone.io/reference/api/2025-04/assistant/chat_assistant#body-context-options. The client currently has topK at the top level of the ChatOptions payload.
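
For reference, the request body the API now expects nests these fields under context_options (this matches the PINECONE_DEBUG output further down; the values here are just placeholders), whereas the client previously sent topK at the top level:

{
  "messages": [{ "role": "user", "content": "..." }],
  "model": "claude-3-5-sonnet",
  "context_options": { "top_k": 20, "snippet_size": 584 }
}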

While cleaning this up, I also noticed a few inconsistencies in how some things are passed to the Assistant API, which I've also addressed here. There's some added complexity in the TypeScript implementation for Assistant because the generated code did not fully cover streaming and file upload, so chatStream, chatCompletionStream, and uploadFile all use fetch directly rather than going through the generated code.

This should address this community post: https://community.pinecone.io/t/pinecone-assistant-chatstream-topk-and-snippet-size/8065/1

Solution

  • Add and export ChatContextOptions as a new type that wraps topK and snippetSize. To keep this somewhat non-breaking, I've left the topK field at the top level of the ChatOptions interface as well, with a bit of logic to use it if present and otherwise fall back to contextOptions.topK (see the sketch after this list). I wanted to keep things non-breaking so we can release this as a minor version.
  • In the chatStream and chatCompletionStream functions we need to manually convert the keys of the passed objects to snake_case. The API expects snake_case, and normally the generated OpenAPI types handle this for us; since we're not using those directly in these cases, we need to handle it ourselves. I had missed this previously, although I did handle converting responses from snake_case to camelCase where necessary. I know this is a bit confusing - apologies.
  • The context function wasn't passing messages, topK, or snippetSize properly. This was just a miss on my part.
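
As a rough illustration of the first two points, the new type and the payload handling look something like this. This is a minimal sketch only: ChatContextOptions and the topK/snippetSize/contextOptions field names come from this PR, but the helper names, the extra ChatOptions fields shown, and the precedence logic are illustrative rather than the exact implementation in the diff.

// Sketch only - the real ChatOptions interface has more fields, and the
// actual precedence/conversion logic lives in the PR diff.
export interface ChatContextOptions {
  topK?: number;
  snippetSize?: number;
}

export interface ChatOptions {
  messages: Array<{ role: string; content: string }>;
  model?: string;
  filter?: Record<string, unknown>;
  jsonResponse?: boolean;
  includeHighlights?: boolean;
  // Kept at the top level for backwards compatibility with the old interface.
  topK?: number;
  contextOptions?: ChatContextOptions;
}

// Convert a camelCase key to the snake_case form the Assistant API expects.
function camelToSnake(key: string): string {
  return key.replace(/[A-Z]/g, (c) => `_${c.toLowerCase()}`);
}

// Build the context_options portion of the request body, using the top-level
// topK if present and otherwise falling back to contextOptions.topK
// (illustrative precedence - check the diff for the exact behavior).
function buildContextOptions(options: ChatOptions) {
  const topK = options.topK ?? options.contextOptions?.topK;
  const snippetSize = options.contextOptions?.snippetSize;
  if (topK === undefined && snippetSize === undefined) return undefined;
  return {
    ...(topK !== undefined && { [camelToSnake('topK')]: topK }),
    ...(snippetSize !== undefined && { [camelToSnake('snippetSize')]: snippetSize }),
  };
}

With something like this in place, both chat({ topK: 20 }) and chat({ contextOptions: { topK: 20 } }) end up serialized as "context_options": {"top_k": 20} on the wire, which is what the debug output below shows.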

We need better test coverage for assistant operations in general; the implementation was a bit rushed on my part earlier this year.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)
  • None of the above: (explain here)

Test Plan

CI - external app tests, unit tests, integration tests

Manually tested using the assistant interface via the repl. You can pull this branch down yourself and play around locally using npm run repl:

npm run repl
await init()

await client.createAssistant({ name: 'test-assistant' })

// you'll need to use a local path
await client.Assistant('test-assistant').uploadFile({ path: '/Users/austin/Downloads/A_Primer_on_Memory_Consistency_and_Cache_Coherence-2nd-Edition.pdf', metadata: { genre: 'classical', meat: 'cute' }})

// wait for the file to process, etc.

// non-streaming chat, contextOptions
await client.assistant('test-assistant').chat({ messages: [{ role: 'user', content: 'tell me a few basics about caching'}], model: 'claude-3-5-sonnet', filter: { genre: 'classical'}, jsonResponse: true, includeHighlights: true, contextOptions: {topK:20, snippetSize: 584 }})

// non-streaming chat, topK only
await client.assistant('test-assistant').chat({ messages: [{ role: 'user', content: 'tell me a few basics about caching'}], model: 'claude-3-5-sonnet', filter: { genre: 'classical'}, jsonResponse: true, includeHighlights: true, topK: 20})

// streaming chat, contextOptions
const chatStream = await client.assistant('test-assistant').chatStream({ messages: [{ role: 'user', content: 'tell me a few basics about caching'}], model: 'claude-3-5-sonnet', filter: { genre: 'classical'}, includeHighlights: true, contextOptions: {topK:20, snippetSize: 584 }})
for await (const chunk of chatStream) { console.log(chunk) }

// streaming chat, topK only
const chatStream = await client.assistant('test-assistant').chatStream({ messages: [{ role: 'user', content: 'tell me a few basics about caching'}], model: 'claude-3-5-sonnet', filter: { genre: 'classical'}, includeHighlights: true, topK: 20})
for await (const chunk of chatStream) { console.log(chunk) }

Here are some examples with PINECONE_DEBUG output from my local runs:

> await client.assistant('test-assistant').chat({ messages: [{ role: 'user', content: 'tell me a few basics about caching'}], model: 'claude-3-5-sonnet', filter: { genre: 'classical'}, jsonResponse: true, includeHighlights: true, contextOptions: {topK:20, snippetSize: 584 }})
>>> Request: GET https://api.pinecone.io/assistant/assistants/test-assistant
>>> Headers: {"User-Agent":"@pinecone-database/pinecone v6.0.1; lang=typescript; node v18.17.1","X-Pinecone-Api-Version":"2025-04","Api-Key":"***REDACTED***"}

<<< Status: 200
<<< Body: {"name":"test-assistant","instructions":null,"metadata":null,"status":"Ready","host":"https://prod-1-data.ke.pinecone.io","created_at":"2025-05-30T16:51:21.987343597Z","updated_at":"2025-05-30T16:51:22.374927670Z"}

curl -X GET https://api.pinecone.io/assistant/assistants/test-assistant -H "Api-Key: ****REDACTED***" 

>>> Request: POST https://prod-1-data.ke.pinecone.io/assistant/chat/test-assistant
>>> Headers: {"User-Agent":"@pinecone-database/pinecone v6.0.1; lang=typescript; node v18.17.1","X-Pinecone-Api-Version":"2025-04","Content-Type":"application/json","Api-Key":"***REDACTED***"}
>>> Body: {"messages":[{"role":"user","content":"tell me a few basics about caching"}],"stream":false,"model":"claude-3-5-sonnet","filter":{"genre":"classical"},"json_response":true,"include_highlights":true,"context_options":{"top_k":20,"snippet_size":584}}

<<< Status: 200
<<< Body: {"finish_reason":"stop","message":{"role":"assistant","content":"{\n  \"basics\": [\n ... ETC }

curl -X POST https://prod-1-data.ke.pinecone.io/assistant/chat/test-assistant -H "Api-Key: ***REDACTED***" -H "Content-Type: application/json" -d '{"messages":[{"role":"user","content":"tell me a few basics about caching"}],"stream":false,"model":"claude-3-5-sonnet","filter":{"genre":"classical"},"json_response":true,"include_highlights":true,"context_options":{"top_k":20,"snippet_size":584}}'

{
  id: '00000000000000005ec73ab3e9e573ad',
  finishReason: 'stop',
  message: {
    role: 'assistant',
    content: '{\n' +
      '  "basics": [\n' +
      '    "Caches are used to reduce average latencies to access storage structures.",\n' +
      '    "A typical system model includes a multicore processor chip with private data caches for each core and a shared last-level cache (LLC).",\n' +
      '    "Cache coherence is needed to maintain consistency between multiple cached copies of data.",\n' +
      '    "The granularity of coherence is usually maintained at the level of cache blocks, rather than individual bytes.",\n' +
      '    "Common cache states include Modified (M), Shared (S), and Invalid (I), which are part of the MSI protocol."\n' +
      '  ]\n' +
      '}'
  },
  model: 'arn:aws:bedrock:us-east-1::inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0',
  citations: [
    { position: 93, references: [Array] },
    { position: 235, references: [Array] },
    { position: 332, references: [Array] },
    { position: 450, references: [Array] },
    { position: 564, references: [Array] }
  ],
  usage: { promptTokens: 13795, completionTokens: 189, totalTokens: 13984 }
}
>  await client.assistant('test-assistant').chat({ messages: [{ role: 'user', content: 'tell me a few basics about caching'}], model: 'claude-3-5-sonnet', filter: { genre: 'classical'}, jsonResponse: true, includeHighlights: true, topK: 20})
>>> Request: GET https://api.pinecone.io/assistant/assistants/test-assistant
>>> Headers: {"User-Agent":"@pinecone-database/pinecone v6.0.1; lang=typescript; node v18.17.1","X-Pinecone-Api-Version":"2025-04","Api-Key":"***REDACTED***"}

<<< Status: 200
<<< Body: {"name":"test-assistant","instructions":null,"metadata":null,"status":"Ready","host":"https://prod-1-data.ke.pinecone.io","created_at":"2025-05-30T16:51:21.987343597Z","updated_at":"2025-05-30T16:51:22.374927670Z"}

curl -X GET https://api.pinecone.io/assistant/assistants/test-assistant -H "Api-Key: ***REDACTED***" 

>>> Request: POST https://prod-1-data.ke.pinecone.io/assistant/chat/test-assistant
>>> Headers: {"User-Agent":"@pinecone-database/pinecone v6.0.1; lang=typescript; node v18.17.1","X-Pinecone-Api-Version":"2025-04","Content-Type":"application/json","Api-Key":"***REDACTED***"}
>>> Body: {"messages":[{"role":"user","content":"tell me a few basics about caching"}],"stream":false,"model":"claude-3-5-sonnet","filter":{"genre":"classical"},"json_response":true,"include_highlights":true,"context_options":{"top_k":20}}

<<< Status: 200
<<< Body: {"finish_reason":"stop","message":{"role":"assistant","content":"{\n  \"basics\":...ETC}

curl -X POST https://prod-1-data.ke.pinecone.io/assistant/chat/test-assistant -H "Api-Key: ***REDACTED***" -H "Content-Type: application/json" -d '{"messages":[{"role":"user","content":"tell me a few basics about caching"}],"stream":false,"model":"claude-3-5-sonnet","filter":{"genre":"classical"},"json_response":true,"include_highlights":true,"context_options":{"top_k":20}}'

{
  id: '00000000000000004e4877c14db3faa7',
  finishReason: 'stop',
  message: {
    role: 'assistant',
    content: '{\n' +
      '  "basics": [\n' +
      '    "Caches are used to store recently accessed data for faster retrieval, reducing the need to access slower main memory. A cache contains copies of data from frequently used main memory locations.",\n' +
      '    "There are typically multiple levels of caches in a system, including private level-one (L1) caches for each processor core and a shared last-level cache (LLC).",\n' +
      '    "Caches can be virtually addressed or physically addressed. Most modern systems use physically addressed caches, where the cache is accessed using physical memory addresses.",\n' +
      '    "Cache coherence is needed to ensure that multiple copies of data in different caches remain consistent. Coherence protocols define rules for maintaining consistency between caches.",\n' +
      '    "Two main types of coherence protocols are snooping protocols, which broadcast requests to all caches, and directory protocols, which use a centralized directory to track which caches have copies of data."\n' +
      '  ]\n' +
      '}'
  },
  model: 'arn:aws:bedrock:us-east-1::inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0',
  citations: [
    { position: 137, references: [Array] },
    { position: 380, references: [Array] },
    { position: 446, references: [Array] },
    { position: 560, references: [Array] },
    { position: 671, references: [Array] },
    { position: 748, references: [Array] },
    { position: 959, references: [Array] }
  ],
  usage: { promptTokens: 46851, completionTokens: 286, totalTokens: 47137 }
}

…hape, update chat, chatStream, and context to handle sending requests properly
…llow backwards compatibility with the previous interface
austin-denoble marked this pull request as ready for review May 30, 2025 17:48
austin-denoble merged commit f752c3a into main May 30, 2025
53 of 54 checks passed
austin-denoble deleted the adenoble/fix-assistant-data-options branch May 30, 2025 21:29