On-Device Zeta #24859

chrisvander · 2025-02-14T12:39:07Z

chrisvander
Feb 14, 2025

Putting Zeta on servers and relying on data center compute, plus network transfer speed, is expensive in both latency and literal server cost. Why not, for the hardware that can reasonably support it, put Zeta on-device? This would potentially allow for more reliable latency and offline use, in addition to being better for privacy.

Either that, or an option to do completions on-device through Ollama.

It's baffling to me that companies continue to not utilize local compute. I don't want to pay yet another subscription to access tools like this. I paid for a powerful local system - use it!

baldwindavid · 2025-02-14T16:01:38Z

baldwindavid
Feb 14, 2025

I don't really disagree though this is an OS model (as is Zed itself) and Zed does interface with Ollama and LM Studio so maybe that's coming. Other than that, I would very much like to see Zed continue on its own. Presumably at a certain point they'll need to monetize somehow or the logical conclusion is getting acquired by a large company with varied motivations and priorities.

2 replies

chrisvander Feb 14, 2025
Author

Monetization via charging for collaboration features makes the most sense to me. On device completion as part of the open source solution just improves the product experience for a individual developers, who in turn will bring Zed to their teams. Maybe remote completion models are more powerful - that makes sense to charge for.

baldwindavid Feb 14, 2025

Yep, I believe they've publicly stated collaboration and AI being the focus in terms of profitability. I think they've done a good job with transparency and interoperability thus far and have no reason to think that won't continue within the constraints of, uh, survivability.

antran22 · 2025-04-03T14:41:44Z

antran22
Apr 3, 2025

I'm going to chime in here and give some of my perspective.
The current scene of AI model for editor, especially Copilot-like predictive editing is mainly revolving around propriety model and over-the-network communication. Remote LLM may make sense in the case of chat/reasoning models that are hundreds of billion parameters large, but for smaller models, it is already quite feasible to run locally efficiently. This allows me to utilize the feature in an air-gapped environment.

I respect Zed team's decision of releasing Zeta fully opensource & open-data, and I guess that on-device zeta is a direction that they will take in the near future. After all, Zed already allow fully customizable LLM source for the assistant, it would make no sense that they will paywall this feature.

I would love to get the Zed team's input on this.

1 reply

isterin Apr 12, 2025

Same, on device would be nice. I don't mind using remote for personal stuff, but lots of work restrictions about remote LLM models, makes using it not possible. I have local LM Studio and it would be great if I can use it with that.

jeapostrophe · 2025-04-29T18:19:16Z

jeapostrophe
Apr 29, 2025

Similarly, it would be nice to have the inline edit predictions with other models like a fat Qwen 32b model. I can manually make calls myself with Ollama, but I want it integrated into my awesome editor

0 replies

praktiskt · 2025-05-01T11:22:51Z

praktiskt
May 1, 2025

I've been working on a personal project called zedex to which I recently added a crude implementation of edit prediction for Zed using any OpenAI-compatible backend. Some of the bits in the implementation are just winged / didn't really look too close at the Zed implementation but it seems to work OK for the small things I use it for. At the moment is uses some serious prompt engineering (lol) as glue to work well with both Llama 3.3 70B and the newer Llama 4 models. I haven't tried the Zeta fined-tuned model.

At the moment it requires you to give up a lot of other Zed features including (and not limited to) login and collaboration as you have to override server_url in your settings, but perhaps it's useful for other people than me.

0 replies

avdept · 2025-05-07T18:46:34Z

avdept
May 7, 2025

would be nice to get it to work locally. Recently it seem like I hit a limit of free predictions on my free plan and was wondering what's next. I already run qwen2.5 coder locally, and since zeta also based on same model, I think it would be nice to run it all locally

4 replies

phromo May 8, 2025

I've been looking at running Zeta standalone as well. The training data repo of zeta gives some hints on how they prompt the completion. Unfortunately they just removed the environment variable that let you override just that URL, see #30308 -- however zedex is still viable. perhaps one can poke @praktiskt about getting some setting that just proxies the edit prediction request to some other URL (for me who dont dabble in go much). ;)

Mayeu May 22, 2025

@phromo are you able to run local prediction using ZED_PREDICT_EDITS_URL? I tried this approach with llama.cpp but I'm getting a 500 error:

got exception: {"code":500,"message":"[json.exception.out_of_range.403] key 'messages' not found","type":"server_error"}
srv  log_server_r: request: POST /v1/chat/completions 127.0.0.1 500

Can you describe how you made it work if you did? Thanks :)

Edit:
I launched Zed with: ZED_PREDICT_EDITS_URL=http://localhost:8080/v1/chat/completions zed .

phromo May 23, 2025

@Mayeu edit predictions API/URL rely on a non-standardised API (dont think there is any industry standard for wide edit predictions yet, at all, so this is not dramatic). To my best knowledge, you will have to implement your own server for now since zed-industries has not open sourced their cloudflare worker that runs the model. Have a look at https://github.com/praktiskt/zedex/blob/main/src/zed/edit_predict.go and https://github.com/zed-industries/zed/blob/main/crates/zeta/src/zeta.rs for pointers. I never finished my server but will ping and open source it if I do :)

TLDR -- ZED_PREDICT_EDITS_URL does not expect the chat/completions "openai"-style API. It needs something else since the prompt is created serverside and not clientside.

Mayeu May 23, 2025

@phromo I see, thanks for the pointer :) I found out that somebody created a server for the prediction: https://github.com/g0t4/zed-zeta-server, I tried that a bit, but I don't think my laptop is powerful enough for it to be usable anyway x)

formbook · 2025-05-20T00:53:37Z

formbook
May 20, 2025

this is a no brainer, people are not paying 20 a month to get edit predictions from qwen2.5coder 7b, an open source, small and limited model, theyre paying for api access to the latest big time models with agenetic editing

zeta is open source, the community contributed to it, and any half decent hardware can run it locally, whatever it costs to run edit predictions for zed would be practically wiped out

windsurf has separated edit prediction from the main subscription, its unlimited on free tier and its a better model/ux than zeta, although not as good as Cursor Tab, zeta is in 3rd place at the minute for edit prediction and theres just no justification for locking it behind a 20 a month subscription

the people who want chat, agents and collab are going to subscribe regardless of whether zeta can be run locally or not, and people who only use edit prediction are not going to subscribe when better is available elsewhere for free

1 reply

marcochiodo May 21, 2025

I would add that locally a developer could also use qwen2.5coder 32b and have significant advantages.

adamierymenko · 2025-05-29T21:15:11Z

adamierymenko
May 29, 2025

I would pay for Zed to get this feature, so to me it's not a $$$ question it's a flexibility and capability question. I'd like to be able to set up and configure arbitrary local or remote providers as edit predictors with a lot of flexibility.

0 replies

ooJaan · 2025-06-01T19:59:28Z

ooJaan
Jun 1, 2025

+1 would also really appreciate this feature

0 replies

dwt · 2025-06-02T05:53:44Z

dwt
Jun 2, 2025

I would really like to hack on this feature, try other models, see the difference, try their fine tuning and improve on it. Not being able to execute this locally - at least I couldn't yet find a way - makes this really difficult.

0 replies

marcochiodo · 2025-06-02T06:07:24Z

marcochiodo
Jun 2, 2025

This would be a much appreciated feature. Vscode allows code autocomplete via ollama with the continue plugin.

1 reply

raucheacho Jun 2, 2025

same opinion zed is an awesome code editor i switched from vscode but lost my local completion capability and want to go back🫢

sergiomujica · 2025-06-02T22:14:01Z

sergiomujica
Jun 2, 2025

I might pay to have Gemini do my code predictions, not being able to do it is highly motivating to refuse to pay a subscription and move away from zed.

0 replies

On-Device Zeta #24859

Uh oh!

Replies: 11 comments · 9 replies

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chrisvander Feb 14, 2025 Author

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Replies: 11 comments 9 replies

chrisvander Feb 14, 2025
Author