supports HPU double quant #1630


Conversation


@rsshaik1 commented May 9, 2025

This PR integrates support for double dequantization on Gaudi (HPU).
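
For readers unfamiliar with the term: "double quant" refers to bitsandbytes' nested quantization, where the per-block absmax scales of the 4-bit weights are themselves stored blockwise-quantized to 8 bits plus an offset. A rough sketch of the idea in plain PyTorch follows; the tensor names, blocksizes, and codebook handling are illustrative and do not mirror the library's actual layout (4-bit packing is also omitted).

```python
import torch

def dequantize_absmax(absmax_q, code2, absmax2, offset, blocksize2=256):
    # Second-level (8-bit, blockwise) dequantization of the absmax scales:
    # look up each 8-bit code in its codebook, rescale per block, add the offset.
    vals = code2[absmax_q.long()].view(-1, blocksize2) * absmax2.unsqueeze(1)
    return vals.flatten() + offset

def dequantize_4bit(codes, absmax, code4, blocksize=64, dtype=torch.bfloat16):
    # First-level dequantization: look up the (already unpacked) 4-bit codes
    # in their codebook and rescale each block of `blocksize` values.
    vals = code4[codes.long()].view(-1, blocksize) * absmax.unsqueeze(1)
    return vals.flatten().to(dtype)
```

With double quant enabled, the absmax fed into the 4-bit step is itself the output of the first helper; without it, the absmax values are stored directly in floating point.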

@rsshaik1 rsshaik1 marked this pull request as ready for review May 9, 2025 05:32

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@rsshaik1 (Author)

Hi @matthewdouglas, just checking in on the status: do we have an expected timeline for merging this PR? Could you please let me know if there are plans to merge it soon?

@matthewdouglas (Member)

Hi @rsshaik1,

I'll go ahead and merge this here. However, please see #1596 for an update on our plans with this branch.

In short, we're going to stop development on the multi-backend-refactor branch and move towards implementing additional devices on main using a newer dispatch interface based on custom operators in PyTorch.
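
For readers following along, a minimal sketch of what that dispatch interface looks like using PyTorch's public torch.library API; the namespace, op name, schema, and body below are illustrative and not bitsandbytes' actual definitions.

```python
import torch

# Illustrative operator definition; the real bitsandbytes op names and
# schemas may differ.
torch.library.define(
    "demo::dequantize_4bit",
    "(Tensor A, Tensor absmax, int blocksize, str quant_type) -> Tensor",
)

# A portable pure-PyTorch implementation registered for the "default"
# dispatch key: any device that does not register its own kernel for this
# op falls back to it.
@torch.library.impl("demo::dequantize_4bit", "default")
def _dequantize_4bit_default(A, absmax, blocksize, quant_type):
    # Placeholder body; a real implementation would unpack the 4-bit codes
    # and rescale each block by its absmax.
    return A.to(torch.float16)
```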

@matthewdouglas matthewdouglas merged commit c3eac42 into bitsandbytes-foundation:multi-backend-refactor May 20, 2025
1 of 2 checks passed
@vivekgoe

@matthewdouglas Thanks for your support in accepting the Intel Gaudi (HPU) related PRs. I have a few questions regarding the plans for additional device support on "main":

  • Do you have a timeline for stopping development on the "multi-backend-refactor" branch and moving to "main"?
  • Does it make sense for us to start adding Gaudi (HPU) changes on top of PyTorch Custom Operator Integration #1544? Or are there other tasks pending with respect to custom operators and multi-backend support that we should wait for?

@matthewdouglas (Member)

Hi @vivekgoe @rsshaik1

We're still working on timelines for Intel hardware support with @kding1, so it may be best to reach out to him directly to align on goals related to that. Otherwise what I can say is we've stopped working on the multi-backend-refactor branch for Intel CPUs and GPUs already and have new PRs for the main branch in #1628 and #1629.

We're going to keep the multi-backend-refactor wheels available for a while, as there are likely users depending on them, but in #1644 I'm now updating the documentation to indicate that it's being deprecated.

In the next few days we will be pushing a v0.46.0 release, and then start merging new device support PRs on main for a target v0.47.0 release. It's hard to give any timelines on a stable v0.47.0 release, but at that point we'll still be building preview wheels for v0.47.0.dev0, so I think it would make sense to start building toward the custom ops implementation.

From what I can tell there's actually significant overlap with the work that @jiqing-feng is doing in #1628 as many of the op implementations for HPU appear to be implemented simply in PyTorch, the same as CPU/XPU. Many of those plain PyTorch ops are being registered now to the "default" dispatch key, meaning they would be used as the implementation for any device that does not override with its own implementation. I think the main change needed would be to register the ops for HPU that wrap around torch.ops.hpu or have other specialization needs.
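
As an illustration of that last point, an HPU override for the hypothetical op sketched earlier could look roughly like this; the Gaudi-side call is a placeholder, since the exact torch.ops.hpu kernels are vendor-specific.

```python
import torch

# Hypothetical HPU-specific registration for the illustrative op above.
# A device-specific registration takes precedence over the "default" one,
# so only ops with HPU-specific needs have to be overridden like this.
@torch.library.impl("demo::dequantize_4bit", "hpu")
def _dequantize_4bit_hpu(A, absmax, blocksize, quant_type):
    # A real port would call into a Gaudi kernel exposed under torch.ops.hpu
    # (or apply other HPU-specific handling); a plain PyTorch placeholder
    # stands in here.
    return A.to(torch.float16)
```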

We're coordinating on Intel development in a Slack channel also; if you feel that's appropriate we could invite Habana stakeholders there.

cc: @Titus-von-Koeller @christoph-koehncke

@vivekgoe

@matthewdouglas Thanks for providing the detailed information. You are right: for HPU we are OK with the "default" implementation for most ops; the only op we need to register separately for HPU is dequantize_4bit. We will work on creating a PR to bring the HPU changes to the "main" branch.
If you have Slack channel admin rights, please invite me at vivek.goel@intel.com. If it is a Slack channel created by Intel, please let me know and I will work with @kding1 to get added. This will help keep the HPU-related code up to date with the code for the other Intel devices.
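
To show how that plan fits together (still using the hypothetical demo:: op from the sketches above), the call site stays device-agnostic; the dispatcher only picks the HPU kernel when the inputs live on an HPU device and uses the default implementation everywhere else.

```python
import torch

def dequantize(A, absmax, blocksize=64, quant_type="nf4"):
    # Same call on every device: PyTorch routes to the "hpu"-registered
    # kernel for tensors on an HPU device and to the "default"
    # implementation (shared with CPU/XPU) otherwise.
    return torch.ops.demo.dequantize_4bit(A, absmax, blocksize, quant_type)
```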

@Titus-von-Koeller (Collaborator)

@vivekgoe The invite to the Slack channel just went out.
