inference options #30
adampingel
started this conversation in
Ideas
Replies: 1 comment
-
One line fix (removal): 3ce0d24#diff-23ffa07c88b7852e7f2393907a0f2ef97c52e4d0ce8a5a67af62b2c075bf6615L198
-
In the initial few weeks of getting this repo to a point where it's stable enough that we can start to expand the team behind it, I have been dancing around the topic of inference providers and models. I had a discussion with @rawkintrevo earlier and will capture some of that here.
Until now, for inference I've been bouncing between Llama and Granite on Ollama, OpenAI, and a dash of Anthropic. That experience has really informed my understanding of how hard it is to get consistent behavior across these different offerings. All of it has been done via AI Suite.
That is not ideal for 1) automated testing, and 2) demos (notebook or otherwise).
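To make the provider-hopping above concrete, here is a minimal sketch of switching between those backends through AI Suite's single chat interface. The `provider:model` identifiers below are assumptions based on aisuite's naming convention, not a tested configuration; check the aisuite docs for the exact ids your installation supports.

```python
import os

# Candidate backends mentioned in this thread, in aisuite's
# "provider:model" form. The exact model names are hypothetical.
CANDIDATE_MODELS = [
    "ollama:llama3.1",                        # local Llama via Ollama
    "ollama:granite3-dense",                  # local Granite via Ollama
    "openai:gpt-4o-mini",                     # hosted OpenAI
    "anthropic:claude-3-5-sonnet-20240620",   # hosted Anthropic
]


def split_model(model: str) -> tuple[str, str]:
    """Split aisuite's 'provider:model' identifier into its parts."""
    provider, _, name = model.partition(":")
    return provider, name


def ask(model: str, prompt: str) -> str:
    """Send one chat turn to the given provider:model pair via aisuite."""
    import aisuite as ai  # pip install aisuite

    client = ai.Client()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__" and os.environ.get("RUN_LIVE"):
    # Live calls only when explicitly requested (needs provider API keys).
    for m in CANDIDATE_MODELS:
        print(m, "->", ask(m, "Say hello in five words."))
```

The appeal of this pattern is that swapping providers is a one-string change, which is exactly why inconsistencies between backends (especially around tool calling) become visible so quickly.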
A couple of weeks ago, I tried using Llama on AWS, which AI Suite supports. However:
Together.AI is an AI Alliance member that hosts the Llama models and (currently) provides $1 of free inference credit after their (very fast) signup. A brief look at that last night resulted in the second half of a tool call failing, though the first half succeeded.
I'm going to pare back that experiment and enumerate all my assumptions until I have tool calling working on Together with Llama 3.3. Then I will try to reintroduce Gofannon into the setup, and if that works, put it in the context of this project.
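As a starting point for that pared-back experiment, here is a hedged sketch of a single tool-calling turn against Together's OpenAI-compatible endpoint. The base URL and the model id (`meta-llama/Llama-3.3-70B-Instruct-Turbo`) are assumptions to verify against Together's model catalog; the tool schema follows the standard OpenAI function-calling shape.

```python
import json
import os

# One trivial tool, in the OpenAI function-calling schema that
# OpenAI-compatible endpoints (including Together's) accept.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_request(prompt: str) -> dict:
    """Assemble the request body for one tool-calling chat turn."""
    return {
        # Assumed Together model id; confirm in their catalog.
        "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",
    }


if __name__ == "__main__" and os.environ.get("TOGETHER_API_KEY"):
    # Live call, only attempted when an API key is present.
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        base_url="https://api.together.xyz/v1",
        api_key=os.environ["TOGETHER_API_KEY"],
    )
    reply = client.chat.completions.create(
        **build_request("What's the weather in Boston?")
    )
    print(json.dumps(reply.model_dump(), indent=2))
```

Starting from a request this minimal makes it easier to isolate which assumption breaks when the second half of a tool call fails, before layering Gofannon back on top.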