-
Notifications
You must be signed in to change notification settings - Fork 689
[ET-VK] Creating get_symmetric_quantization_config #12573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# Context Eventually dynamic quantization will be enabled in the vulkan_quantizer (particularly 8bit dyn act with 8bit weights). In order to enable this functionality we need to utilize a similar method as XNNPack with how they define their quantization config. This diff aims to align with XNNPack quantizer logic and also migrate away from utilizing the old static quantization config logic. # Changes A few noticable changes is that we migrate away from `get_linear_weight_only_qcs_xnn_qconfig`, and we now define a symmetric config that has parameters to define whether it's dynamically quantized or not. Furthermore, we also incorporate bits_to_range so that we can automatically designate the min and max quant ranges without having to set them during initialization. We also change some wording from using just static as we are now enabling dynamic quantization as well. Furthermore, we change internally other codebases that are calling our existing legacy config, and move them into the more universal symmetric config. Since this follows the same naming scheme as XNNPack, I have decided to just add aliases in cases where its being imported directly along with XNNPack. Differential Revision: [D78291249](https://our.internmc.facebook.com/intern/diff/D78291249/) [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12573
Note: Links to docs will display an error until the docs builds have been completed. ❌ 2 New FailuresAs of commit ec19b27 with merge base b6b7a16 ( NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This pull request was exported from Phabricator. Differential Revision: D78291249 |
This PR needs a
|
# Context Eventually dynamic quantization will be enabled in the vulkan_quantizer (particularly 8bit dyn act with 8bit weights). In order to enable this functionality we need to utilize a similar method as XNNPack with how they define their quantization config. This diff aims to align with XNNPack quantizer logic and also migrate away from utilizing the old static quantization config logic. # Changes A few noticable changes is that we migrate away from `get_linear_weight_only_qcs_xnn_qconfig`, and we now define a symmetric config that has parameters to define whether it's dynamically quantized or not. Furthermore, we also incorporate bits_to_range so that we can automatically designate the min and max quant ranges without having to set them during initialization. We also change some wording from using just static as we are now enabling dynamic quantization as well. Furthermore, we change internally other codebases that are calling our existing legacy config, and move them into the more universal symmetric config. Since this follows the same naming scheme as XNNPack, I have decided to just add aliases in cases where its being imported directly along with XNNPack. Differential Revision: [D78291249](https://our.internmc.facebook.com/intern/diff/D78291249/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D78291249 |
# Context Eventually dynamic quantization will be enabled in the vulkan_quantizer (particularly 8bit dyn act with 8bit weights). In order to enable this functionality we need to utilize a similar method as XNNPack with how they define their quantization config. This diff aims to align with XNNPack quantizer logic and also migrate away from utilizing the old static quantization config logic. # Changes A few noticable changes is that we migrate away from `get_linear_weight_only_qcs_xnn_qconfig`, and we now define a symmetric config that has parameters to define whether it's dynamically quantized or not. Furthermore, we also incorporate bits_to_range so that we can automatically designate the min and max quant ranges without having to set them during initialization. We also change some wording from using just static as we are now enabling dynamic quantization as well. Furthermore, we change internally other codebases that are calling our existing legacy config, and move them into the more universal symmetric config. Since this follows the same naming scheme as XNNPack, I have decided to just add aliases in cases where its being imported directly along with XNNPack. Differential Revision: [D78291249](https://our.internmc.facebook.com/intern/diff/D78291249/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D78291249 |
# Context Eventually dynamic quantization will be enabled in the vulkan_quantizer (particularly 8bit dyn act with 8bit weights). In order to enable this functionality we need to utilize a similar method as XNNPack with how they define their quantization config. This diff aims to align with XNNPack quantizer logic and also migrate away from utilizing the old static quantization config logic. # Changes A few noticable changes is that we migrate away from `get_linear_weight_only_qcs_xnn_qconfig`, and we now define a symmetric config that has parameters to define whether it's dynamically quantized or not. Furthermore, we also incorporate bits_to_range so that we can automatically designate the min and max quant ranges without having to set them during initialization. We also change some wording from using just static as we are now enabling dynamic quantization as well. Furthermore, we change internally other codebases that are calling our existing legacy config, and move them into the more universal symmetric config. Since this follows the same naming scheme as XNNPack, I have decided to just add aliases in cases where its being imported directly along with XNNPack. Differential Revision: [D78291249](https://our.internmc.facebook.com/intern/diff/D78291249/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D78291249 |
# Context Eventually dynamic quantization will be enabled in the vulkan_quantizer (particularly 8bit dyn act with 8bit weights). In order to enable this functionality we need to utilize a similar method as XNNPack with how they define their quantization config. This diff aims to align with XNNPack quantizer logic and also migrate away from utilizing the old static quantization config logic. # Changes A few noticable changes is that we migrate away from `get_linear_weight_only_qcs_xnn_qconfig`, and we now define a symmetric config that has parameters to define whether it's dynamically quantized or not. Furthermore, we also incorporate bits_to_range so that we can automatically designate the min and max quant ranges without having to set them during initialization. We also change some wording from using just static as we are now enabling dynamic quantization as well. Furthermore, we change internally other codebases that are calling our existing legacy config, and move them into the more universal symmetric config. Since this follows the same naming scheme as XNNPack, I have decided to just add aliases in cases where its being imported directly along with XNNPack. Differential Revision: [D78291249](https://our.internmc.facebook.com/intern/diff/D78291249/) [ghstack-poisoned]
This pull request was exported from Phabricator. Differential Revision: D78291249 |
59122e3
into
gh/ahmtox/41/base
This PR was created by the merge bot to help merge the original PR into the main branch. ghstack PR number: #12573 by @ahmtox ^ Please use this as the source of truth for the PR details, comments, and reviews ghstack PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/41/base ghstack PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/41/head Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/40/orig Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/41/orig @diff-train-skip-merge --------- Co-authored-by: morelos <morelos@devvm4573.ash0.facebook.com> Co-authored-by: ahmtox <69552192+ahmtox@users.noreply.github.com> Co-authored-by: Gasoonjia <gasoonjia@meta.com>
This PR was created by the merge bot to help merge the original PR into the main branch. ghstack PR number: pytorch#12573 by @ahmtox ^ Please use this as the source of truth for the PR details, comments, and reviews ghstack PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/41/base ghstack PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/41/head Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/40/orig Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/41/orig @diff-train-skip-merge --------- Co-authored-by: morelos <morelos@devvm4573.ash0.facebook.com> Co-authored-by: ahmtox <69552192+ahmtox@users.noreply.github.com> Co-authored-by: Gasoonjia <gasoonjia@meta.com>
Stack from ghstack (oldest at bottom):
Context
Eventually dynamic quantization will be enabled in the vulkan_quantizer (particularly 8bit dyn act with 8bit weights). In order to enable this functionality we need to utilize a similar method as XNNPack with how they define their quantization config. This diff aims to align with XNNPack quantizer logic and also migrate away from utilizing the old static quantization config logic.
Changes
A few noticable changes is that we migrate away from
get_linear_weight_only_qcs_xnn_qconfig
, and we now define a symmetric config that has parameters to define whether it's dynamically quantized or not. Furthermore, we also incorporate bits_to_range so that we can automatically designate the min and max quant ranges without having to set them during initialization. We also change some wording from using just static as we are now enabling dynamic quantization as well.Furthermore, we change internally other codebases that are calling our existing legacy config, and move them into the more universal symmetric config. Since this follows the same naming scheme as XNNPack, I have decided to just add aliases in cases where its being imported directly along with XNNPack.
Differential Revision: D78291249