Handling Tool Calls with Custom LLMs and Non-Streaming Responses #16249
-
I have developed a custom LLM following the LLaMA architecture, which is intended to interact with a newly created agent. However, I have encountered an issue: tool calls only function correctly when using OpenAI models such as GPT-4o. I need suggestions on how to properly handle tool calls when using custom models, such as those deployed on AWS.

Additionally, I have a question regarding the OpenAI model request and response streaming mechanism: is the communication handled via WebSockets or Server-Sent Events (SSE)?

Furthermore, when working with a custom model that returns a static response (i.e., without streaming), is it feasible to process that response to programmatically invoke a tool call? For example, can I trigger actions such as "Apply All" or "Discard" on a changeset by parsing the static response and calling the corresponding function (e.g., `changeSet_writeChangeToFile`)?
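To illustrate the idea behind the last question: the intent is roughly the sketch below. It is hypothetical; the JSON shape the model emits, the `parseToolCall` helper, and the handler registry are assumptions for illustration only, and only `changeSet_writeChangeToFile` is an actual function from my setup.

```typescript
// Hypothetical sketch: extract a tool call from a static (non-streaming)
// model response and dispatch it to a locally registered handler.

interface ToolCall {
    name: string;                        // e.g. "changeSet_writeChangeToFile"
    arguments: Record<string, unknown>;  // arguments produced by the model
}

// Registry of tool handlers the agent knows about (illustrative only).
const toolHandlers: Record<string, (args: Record<string, unknown>) => Promise<unknown>> = {
    changeSet_writeChangeToFile: async args => {
        // ...write the proposed change into the change set so that the
        // "Apply All" / "Discard" actions become available in the UI
        return { ok: true };
    }
};

// Try to find a JSON tool-call block inside the plain-text response.
function parseToolCall(staticResponse: string): ToolCall | undefined {
    const match = staticResponse.match(/\{[\s\S]*\}/);
    if (!match) {
        return undefined;
    }
    try {
        const parsed = JSON.parse(match[0]);
        if (typeof parsed.name === 'string') {
            return { name: parsed.name, arguments: parsed.arguments ?? {} };
        }
    } catch {
        // not valid JSON – treat the response as a normal text answer
    }
    return undefined;
}

export async function handleStaticResponse(staticResponse: string): Promise<void> {
    const toolCall = parseToolCall(staticResponse);
    if (toolCall && toolHandlers[toolCall.name]) {
        await toolHandlers[toolCall.name](toolCall.arguments);
    }
}
```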
-
@sdirix Hi, I am currently working on invoking the `changeSet_writeChangeToFile` function from within my language model class. Below is the returned structure:

*(screenshot of the returned structure omitted)*

I would appreciate your assistance in resolving this issue so that the `changeSet_writeChangeToFile` function completes execution successfully. The goal is to ensure that the file is saved properly and that the associated user actions, such as Apply All and Discard, are displayed and fully functional. Please let me know if you need any additional details to help troubleshoot this issue. Thank you!
-
Please find below the implementation of the language model class, as well as the prompt template used for Coder.

**Prompt Template**

**Context Retrieval**
Use the following functions to interact with the workspace files if you require context:

**File Validation**
Use the following function to retrieve a list of problems in a file if the user requests fixes in a given file:

**Propose Code Changes**
To propose code changes or any file changes to the user, never print code or new file content in your response. Instead, for each file you want to propose changes for:

**Additional Context**
The following files have been provided for additional context. Some of them may also be referred to by the user. Always look at the relevant files to understand your task using the function ~{getFileContent}

**Previously Proposed Changes**
You have previously proposed changes for the following files. Some suggestions may have been accepted by the user, while others may still be pending.

**Language Model Class Implementation**
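The implementation itself did not come through in the post above. Purely for illustration, a minimal non-streaming wrapper around a custom model deployed on AWS could look roughly like the sketch below; the class name, endpoint, payload, and response shape are all assumptions and do not reflect the actual posted code.

```typescript
// Hypothetical sketch of a non-streaming language model wrapper for a custom
// model deployed on AWS. Endpoint, payload, and response shape are assumptions.

interface ChatMessage {
    role: 'system' | 'user' | 'assistant';
    content: string;
}

export class CustomAwsLanguageModel {
    constructor(
        protected readonly endpoint: string, // e.g. an API Gateway / SageMaker endpoint URL
        protected readonly apiKey: string
    ) {}

    /**
     * Sends the full conversation and returns the complete (static) answer.
     * No streaming: the whole response text arrives in one piece.
     */
    async request(messages: ChatMessage[]): Promise<string> {
        const response = await fetch(this.endpoint, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': `Bearer ${this.apiKey}`
            },
            body: JSON.stringify({ messages })
        });
        if (!response.ok) {
            throw new Error(`Model request failed: ${response.status}`);
        }
        const body = await response.json();
        // Assumed response shape: { text: string }
        return body.text as string;
    }
}
```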
I see that you are using the OpenAI API. As we already have a LanguageModel implementation for OpenAI, you should not need to implement your own version; instead you can reuse the existing one.

However, the current one has a shortcoming which you might already have run into: the non-streaming request implementation does not support tool calls yet. So if you need this support, at the moment you need to implement it yourself. I would suggest copying the current OpenAI LanguageModel and modifying the non-streaming request code.
As @eneufeld indicated, the best place to see how to do this is the Ollama LanguageModel.
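For reference, with the official `openai` npm package a non-streaming request that handles tool calls typically boils down to a loop like the one below. This is a generic sketch, not the Theia implementation; the `executeTool` helper and the way tool results are mapped back to your own functions (e.g. `changeSet_writeChangeToFile`) are assumptions.

```typescript
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Hypothetical tool executor: looks up the local function for the given tool
// name, runs it with the model-provided arguments, and returns a string result.
async function executeTool(name: string, argsJson: string): Promise<string> {
    // ...dispatch to the registered tool function here
    return JSON.stringify({ ok: true });
}

export async function nonStreamingRequestWithTools(
    messages: OpenAI.Chat.ChatCompletionMessageParam[],
    tools: OpenAI.Chat.ChatCompletionTool[]
): Promise<string> {
    // Keep calling the model until it stops requesting tools.
    for (;;) {
        const completion = await openai.chat.completions.create({
            model: 'gpt-4o',
            messages,
            tools,
            stream: false
        });
        const message = completion.choices[0].message;
        if (!message.tool_calls || message.tool_calls.length === 0) {
            // No further tool calls: this is the final answer.
            return message.content ?? '';
        }
        // Record the assistant turn that requested the tools...
        messages.push(message);
        // ...then execute each tool and feed the results back to the model.
        for (const toolCall of message.tool_calls) {
            if (toolCall.type !== 'function') {
                continue;
            }
            const result = await executeTool(
                toolCall.function.name,
                toolCall.function.arguments
            );
            messages.push({
                role: 'tool',
                tool_call_id: toolCall.id,
                content: result
            });
        }
    }
}
```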
If you manage to do so, it would be great if you could contribute…