Description
This feature aims to provide users with immediate, real-time feedback on the estimated token usage of their current message, including any integrated GitHub repository content (see #23). This transparency is crucial for helping users manage LLM costs, adhere to context window limits, and optimize their prompts before sending, especially when incorporating large amounts of external data.
1. Location and Visibility
- The token estimation display will be rendered as a clear, non-intrusive element directly above the main chat input text area in the chat view.
- It should be visible whenever the user is actively typing or when the input area contains text or identified `@git:` mentions.
- Display Format: `Estimated Tokens: XXXX / YYYY`
  - `XXXX` represents the current estimated token count.
  - `YYYY` represents the maximum context window of the currently selected LLM model.
- Visual Cues:
  - The display should change color or include a warning icon as the `XXXX` value approaches `YYYY` (e.g., yellow for 80% usage, red for 95%+ usage or over limit); a minimal sketch of this threshold logic follows this list.
  - A small tooltip on hover should clarify: "Token count is an estimate based on a common LLM tokenizer. Actual usage may vary slightly depending on the selected model."
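A minimal sketch of the threshold logic behind those visual cues, assuming the 80%/95% cut-offs above (the function name and return values are illustrative, not part of the spec):

```python
def usage_level(estimated_tokens: int, context_window: int) -> str:
    """Map token usage to a display state: 'ok', 'warn' (>= 80%), or 'alert' (>= 95% or over)."""
    ratio = estimated_tokens / context_window
    if ratio >= 0.95:
        return "alert"  # red: at or over the limit
    if ratio >= 0.80:
        return "warn"   # yellow: approaching the limit
    return "ok"

print(usage_level(3300, 4096))  # ~81% of the window -> 'warn'
```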
2. Calculation Logic
The token estimation will be performed in the backend, based on the text intended to be sent to the LLM for the current turn. The chat history's token count is already known, so only the new text and file content need to be estimated.
- Content to Tokenize: The estimated token count will primarily cover:
  - The user's current text input in the chat message area.
  - The aggregated textual content of all `@git:` mentions present in the current input (resolved from attached GitHub Nodes or Global Repositories).
- Tokenizer:
  - The estimation will use the `tiktoken` library with the `cl100k_base` encoding. This encoding is widely used by popular models such as OpenAI's GPT-3.5 and GPT-4, providing a reliable baseline estimate (see the sketch at the end of this section).
- Backend Support for GitHub Content:
  - To enable accurate real-time estimation of `@git:`-mentioned content, the backend must provide a mechanism for the frontend to retrieve the textual content of mentioned GitHub files on demand.
  - This could be a dedicated API endpoint (e.g., `/api/estimate-context`) that takes a list of `@git:` mentions (repo alias, file path) and returns their concatenated, LLM-ready content. The frontend would then tokenize this concatenated string along with the user's input (a sketch of such an endpoint also follows below).
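A minimal sketch of the baseline estimation with `tiktoken`, assuming a Python backend (the `estimate_tokens` helper name is illustrative, not part of the spec):

```python
import tiktoken

# cl100k_base is the encoding used by GPT-3.5 / GPT-4, per the tokenizer choice above.
_ENCODING = tiktoken.get_encoding("cl100k_base")

def estimate_tokens(text: str) -> int:
    """Return the estimated token count for a piece of text."""
    return len(_ENCODING.encode(text))

# Example: user input plus resolved @git: content, concatenated into one estimate.
draft = "Summarize the attached build script.\n" + "echo building...\n" * 20
print(estimate_tokens(draft))
```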
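And a hedged sketch of what the `/api/estimate-context` endpoint could look like, here written with FastAPI; the request/response shape, the field names, and the in-memory store standing in for the real repository backend are all assumptions:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GitMention(BaseModel):
    repo_alias: str  # alias of the attached GitHub Node or Global Repository
    file_path: str   # path of the mentioned file within that repo

class EstimateContextRequest(BaseModel):
    mentions: list[GitMention]

class EstimateContextResponse(BaseModel):
    content: str  # concatenated, LLM-ready content of all mentions

# Hypothetical stand-in for the real repository store.
_FAKE_STORE = {("demo", "README.md"): "# Demo\nHello world\n"}

@app.post("/api/estimate-context", response_model=EstimateContextResponse)
def estimate_context(req: EstimateContextRequest) -> EstimateContextResponse:
    # Concatenate each mentioned file; the frontend then tokenizes the result
    # together with the user's current input.
    parts = [_FAKE_STORE.get((m.repo_alias, m.file_path), "") for m in req.mentions]
    return EstimateContextResponse(content="\n\n".join(parts))
```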
3. Update Mechanism (Performance & Responsiveness)
To ensure a smooth user experience and prevent excessive re-calculations on every keystroke:
- Debouncing: The token count update should be debounced, meaning the calculation only triggers after the user has stopped typing for a short period (e.g., 300-500 milliseconds); see the sketch after this list.
- Throttling (Alternative/Addition): Alternatively, or in addition to debouncing, updates could be throttled so they occur only after a certain number of words have been typed (e.g., every 5 words), though debouncing is usually sufficient for text input.
- Initial Load: The token count should be calculated and displayed immediately when the chat component mounts or when the input area first receives focus/content.
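As a concept-only sketch (the production version would live in the frontend; the `Debounced` wrapper below is hypothetical), debouncing boils down to resetting a timer on every call so that only the last call within the quiet window fires:

```python
import threading

class Debounced:
    """Run `fn` only once `delay` seconds have passed with no further calls."""

    def __init__(self, fn, delay: float = 0.4):  # 400 ms, within the 300-500 ms range above
        self.fn = fn
        self.delay = delay
        self._timer = None

    def __call__(self, *args, **kwargs):
        # Cancel any pending run; each keystroke restarts the countdown.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.delay, self.fn, args, kwargs)
        self._timer.start()

# Usage: wrap the token-count update so it fires 0.4 s after typing stops.
# (len(text) // 4 is only a crude characters-per-token placeholder here.)
update_count = Debounced(lambda text: print("tokens ~", len(text) // 4))
for chunk in ("Hello", "Hello wor", "Hello world"):
    update_count(chunk)  # only the final call triggers the update
```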