Feature Request: Leverage structured prompt formats and other best practices #546

nissa-seru · 2025-01-25T01:04:06Z


Some models (such as Claude Sonnet) are able to parse structured text formats such as XML structurally, akin to how they interact with code in an AST-like fashion. When content is structured so that this is possible, the model shows a much-enhanced ability to retain and apply the information and to comply with instructions. Otherwise, models default to parsing content as unstructured text.

I recommend:

- Reformat prompt contents as XML.
- Reorganize prompts so that information is grouped by usage context.
  - Currently, some information relevant to the execute_command tool is spread across other sections.
- Remove the voluminous and forceful diction asking the model to, e.g., use a tool.
  - For supported models, use the built-in functionality (for example, this can be specified directly as an API option for Claude models).
  - Regardless of model, state requirements clearly and concretely, and leverage structured formats.
- Reorganize prompts so that lengthy context and reference material is provided first, and key instructions and requirements are provided last.
- Restructure the overall flow as a list of sequential operations for the model to execute, not a list of requirements to satisfy simultaneously.
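
To make the ordering recommendations concrete, here is a minimal sketch of assembling a prompt with reference material first and numbered steps last. The function name `build_prompt` and the tag names `<reference>`, `<document>`, `<instructions>` and `<step>` are assumptions chosen for illustration, not an existing format in the project.

```python
from xml.sax.saxutils import escape, quoteattr


def build_prompt(reference_docs: dict[str, str], steps: list[str]) -> str:
    """Assemble a prompt: reference material first, ordered steps last.

    Tag names here are hypothetical; any consistent scheme would do.
    """
    parts = ["<reference>"]
    for name, body in reference_docs.items():
        # quoteattr adds the surrounding quotes and escapes the value.
        parts.append(f"  <document name={quoteattr(name)}>{escape(body)}</document>")
    parts.append("</reference>")

    # Instructions go last, as a sequence of operations to execute in order.
    parts.append("<instructions>")
    for number, step in enumerate(steps, start=1):
        parts.append(f'  <step number="{number}">{escape(step)}</step>')
    parts.append("</instructions>")
    return "\n".join(parts)


prompt = build_prompt(
    {"execute_command": "Runs a CLI command in a new terminal."},
    ["Read the task.", "Choose one tool.", "Emit the tool call."],
)
print(prompt)
```

The point of the sketch is only the layout: bulky context sits at the top, and the short, actionable part sits at the end as an ordered list.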

Example for execute_command:

<tool_definition>
    <name>execute_command</name>
    <description>Executes CLI commands in a new VSCode terminal instance. Supports interactive commands and long-running background processes. Terminal output may be partially visible.</description>

    <constraints>
        <working_directory>
            <fixed>${cwd}</fixed>
            <external_operations>Use 'cd target_path && command' for operations outside working directory</external_operations>
            <prohibited>~ character, $HOME references</prohibited>
        </working_directory>
    </constraints>

    <parameters>
        <parameter>
            <name>command</name>
            <required>true</required>
            <description>CLI command to execute</description>
        </parameter>
    </parameters>

    <syntax_template>
<execute_command>
<command>Your command here</command>
</execute_command>
    </syntax_template>

    <example>
<execute_command>
<command>pnpm test</command>
</execute_command>
    </example>

    <example_external>
<execute_command>
<command>cd /path/to/project && pnpm install</command>
</execute_command>
    </example_external>
</tool_definition>

Szpadel · 2025-01-28T09:54:01Z

Szpadel
Jan 28, 2025

I was also thinking about a similar improvement. What do you think about using the tools API for models that support it, and falling back to XML in other cases? https://platform.openai.com/docs/assistants/tools/function-calling/quickstart

We could have an abstraction layer that can be serialized to both formats. I assume the tools API serializes to the best-trained format under the hood, so I believe this would result in the best possible performance.
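
A rough sketch of what such an abstraction layer might look like: one tool description that serializes both to an XML block for the prompt and to a function-calling style schema. The class names `ToolSpec` and `Param` are hypothetical, the XML layout mirrors the `tool_definition` example earlier in the thread, and the JSON envelope follows the OpenAI function-calling shape; other providers use different wrapper keys, so treat that part as an assumption.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Param:
    name: str
    description: str
    required: bool = True


@dataclass
class ToolSpec:
    """Single source of truth for one tool (hypothetical helper)."""
    name: str
    description: str
    params: list[Param] = field(default_factory=list)

    def to_xml(self) -> str:
        """Serialize for models that read tool definitions from the prompt."""
        root = ET.Element("tool_definition")
        ET.SubElement(root, "name").text = self.name
        ET.SubElement(root, "description").text = self.description
        params = ET.SubElement(root, "parameters")
        for p in self.params:
            node = ET.SubElement(params, "parameter")
            ET.SubElement(node, "name").text = p.name
            ET.SubElement(node, "required").text = str(p.required).lower()
            ET.SubElement(node, "description").text = p.description
        return ET.tostring(root, encoding="unicode")

    def to_function_schema(self) -> dict:
        """Serialize for a tools API (every parameter is typed as a string here)."""
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": {
                    "type": "object",
                    "properties": {
                        p.name: {"type": "string", "description": p.description}
                        for p in self.params
                    },
                    "required": [p.name for p in self.params if p.required],
                },
            },
        }


spec = ToolSpec(
    name="execute_command",
    description="Executes CLI commands in a new VSCode terminal instance.",
    params=[Param("command", "CLI command to execute")],
)
xml_text = spec.to_xml()
schema = spec.to_function_schema()
```

Which serializer is used could then be decided per model, without touching the tool definitions themselves.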


nissa-seru · 2025-01-28T12:18:26Z


Good thoughts. Three notes (dictated):

1. Generally, tools APIs put the tool definitions in the system prompt. While this is probably good for reliability, it is a problem with providers such as Anthropic, which require the system prompt to stay constant in order to preserve the cache: any change in the available tools breaks the cache all the way back to the system prompt. So there is one sense in which it would be nice to retain the flexibility to insert tool definitions further downstream in the chat thread, even though the current implementation does have them in the system prompt.

2. While the tools APIs generally do serialize to a tool definition format that works well for the model, they also provide the overall context instructions to the model in a way that bakes in some assumptions about system prompt length. Specifically, if I recall correctly, for Anthropic some of the tool call instructions are placed before the user-provided system prompt. When the user system prompt is very large, those instructions risk being drowned out, because typically you put reference material at the top and the more instruction-like, actionable information at the end so that it is more easily followed. That is what Anthropic themselves recommend as well. With a large system prompt, that ordering is not possible for the API's own instructions, because they are placed before the system prompt. So even if one directly copies the serialization format that the API uses (which I think is generally good practice, because the models respond very well to it empirically), there is still an argument for implementing it outside of the tools API, so that you keep the freedom to organize your system prompt as you want.

3. Just for clarity, there is quite a lot of prompt text, at least in both Roo and Cline, that is not directly related to tools, and that would ideally still need to be formatted in a structured form regardless of whether the tools API is used.
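
To illustrate the caching concern in the first note, here is a deliberately simplified model of prefix-based caching: a request can reuse cached work only up to the first block that differs from the previous request. The function `reusable_prefix` and the sample block strings are made up for illustration; real provider caching rules are more detailed than this.

```python
def reusable_prefix(previous: list[str], current: list[str]) -> int:
    """Count the leading blocks that are identical in both requests."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        count += 1
    return count


tools_v1 = "tools: read_file, execute_command"
tools_v2 = "tools: read_file, execute_command, browser"
history = ["user: fix the bug", "assistant: reading file", "user: now run tests"]

# Tool definitions at the very start: changing them leaves nothing reusable.
in_system = reusable_prefix([tools_v1, *history], [tools_v2, *history])

# Tool definitions injected further downstream: the earlier blocks still match.
static_system = "general rules"
downstream = reusable_prefix(
    [static_system, *history, tools_v1],
    [static_system, *history, tools_v2],
)

print(in_system, downstream)
```

Under this toy model, moving the tool definitions later in the thread keeps the static system prompt and the existing history reusable when the tool list changes.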


1 reply

Szpadel Jan 28, 2025

I understand your concerns, but I believe this is consistent with keeping two serialization outputs.

If a particular model does not work well with the tools API, we could easily switch it to the XML variant.

I believe that some models are trained to use tools in a different format, such as JSON or Markdown, so forcing them to think in XML might decrease the success rate.

Because tool calls are handled outside of the prompt, selective sampling can force the LLM to use the correctly specified tool parameters.

I would argue that invalidating the cache when the available tools change is an acceptable compromise; it's not something that happens often.

I believe we could dramatically reduce context size by deduplicating included file contents and allowing multiple tool calls in a single message. That would decrease costs a lot even without using any cache.

In my judgement, the benefits outweigh the downsides in favour of using the tools API.
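
One possible reading of "deduplicating included file contents", sketched under assumptions: when the same file appears several times in the conversation, keep the full text only for the most recent copy and replace earlier copies with a short placeholder. The `FileBlock` type, the function name and the placeholder text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileBlock:
    path: str
    content: str


def dedupe_file_blocks(
    blocks: list[FileBlock],
    placeholder: str = "[superseded by a later copy]",
) -> list[FileBlock]:
    """Keep full content only for the last occurrence of each path."""
    # Later entries overwrite earlier ones, so this maps path -> last index.
    last_index = {block.path: i for i, block in enumerate(blocks)}
    return [
        block if last_index[block.path] == i else FileBlock(block.path, placeholder)
        for i, block in enumerate(blocks)
    ]


blocks = [
    FileBlock("src/app.ts", "version 1"),
    FileBlock("README.md", "docs"),
    FileBlock("src/app.ts", "version 2"),
]
deduped = dedupe_file_blocks(blocks)
```

The message order is preserved, so the model still sees where each file was read; only the stale bodies are dropped.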
