inference options #30
adampingel
started this conversation in
Ideas
Replies: 1 comment
-
One line fix (removal): 3ce0d24#diff-23ffa07c88b7852e7f2393907a0f2ef97c52e4d0ce8a5a67af62b2c075bf6615L198
-
In the initial few weeks of getting this repo to a point where it's stable enough that we can start to expand the team behind it, I have been dancing around the topic of inference providers and models. I had a discussion with @rawkintrevo earlier and will capture some of that here.
Until now, for inference I've been bouncing between Llama and Granite on Ollama, OpenAI, and a dash of Anthropic. That experience has really informed my understanding of how hard it is to get consistent behavior across these different offerings. All of it has been done via AI Suite.
That is not ideal for 1) automated testing, and 2) demos (notebook or otherwise).
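To make the provider-hopping above concrete, here is a minimal sketch of switching between those backends through AI Suite's single chat interface. The `provider:model` identifiers below are assumptions based on aisuite's naming convention, not a tested configuration; check the aisuite docs for the exact ids your installation supports.

```python
import os

# Candidate backends mentioned in this thread, in aisuite's
# "provider:model" form. The exact model names are hypothetical.
CANDIDATE_MODELS = [
    "ollama:llama3.1",                        # local Llama via Ollama
    "ollama:granite3-dense",                  # local Granite via Ollama
    "openai:gpt-4o-mini",                     # hosted OpenAI
    "anthropic:claude-3-5-sonnet-20240620",   # hosted Anthropic
]


def split_model(model: str) -> tuple[str, str]:
    """Split aisuite's 'provider:model' identifier into its parts."""
    provider, _, name = model.partition(":")
    return provider, name


def ask(model: str, prompt: str) -> str:
    """Send one chat turn to the given provider:model pair via aisuite."""
    import aisuite as ai  # pip install aisuite

    client = ai.Client()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__" and os.environ.get("RUN_LIVE"):
    # Live calls only when explicitly requested (needs provider API keys).
    for m in CANDIDATE_MODELS:
        print(m, "->", ask(m, "Say hello in five words."))
```

The appeal of this pattern is that swapping providers is a one-string change, which is exactly why inconsistencies between backends (especially around tool calling) become visible so quickly.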
A couple of weeks ago, I tried using Llama on AWS, which AI Suite supports. However:
Together.AI is an AI Alliance member that hosts the Llama models and (currently) provides $1 of free inference credit after their (very fast) signup. A brief look at that last night resulted in the second half of a tool call failing, though the first half succeeded.
I'm going to pare back that experiment and enumerate all my assumptions until I have tool calling working on Together with Llama 3.3. Then I will try to reintroduce Gofannon into the setup, and if that works, put it in the context of this project.
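As a starting point for that pared-back experiment, here is a hedged sketch of a single tool-calling turn against Together's OpenAI-compatible endpoint. The base URL and the model id (`meta-llama/Llama-3.3-70B-Instruct-Turbo`) are assumptions to verify against Together's model catalog; the tool schema follows the standard OpenAI function-calling shape.

```python
import json
import os

# One trivial tool, in the OpenAI function-calling schema that
# OpenAI-compatible endpoints (including Together's) accept.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_request(prompt: str) -> dict:
    """Assemble the request body for one tool-calling chat turn."""
    return {
        # Assumed Together model id; confirm in their catalog.
        "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",
    }


if __name__ == "__main__" and os.environ.get("TOGETHER_API_KEY"):
    # Live call, only attempted when an API key is present.
    from openai import OpenAI  # pip install openai

    client = OpenAI(
        base_url="https://api.together.xyz/v1",
        api_key=os.environ["TOGETHER_API_KEY"],
    )
    reply = client.chat.completions.create(
        **build_request("What's the weather in Boston?")
    )
    print(json.dumps(reply.model_dump(), indent=2))
```

Starting from a request this minimal makes it easier to isolate which assumption breaks when the second half of a tool call fails, before layering Gofannon back on top.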