Feature request
It would be great if the library could support quantization for more layer types, as torch.ao does; conv2d already seems to be available there. Sadly, torch.ao does not appear to support CUDA as a backend right now. Would it be possible to implement the 8-bit and 4-bit kernels in Triton or CUDA to allow quantization of convolutional layers? A similar issue has been raised here earlier.
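For reference, the kind of 8-bit quantization being requested can be sketched in plain PyTorch: per-output-channel symmetric int8 quantization of a Conv2d weight, with dequantization before the convolution. This is only an illustration of the numerics (the `quantize_per_channel` helper and the error check are made up for this example, not part of any library API); a real implementation would run the convolution directly on the int8 weights in a fused CUDA/Triton kernel.

```python
import torch
import torch.nn.functional as F

def quantize_per_channel(w, n_bits=8):
    # Symmetric per-output-channel quantization of a conv weight.
    # One scale per output channel, computed from that channel's max |w|.
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.abs().amax(dim=(1, 2, 3), keepdim=True) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax).to(torch.int8)
    return q, scale

conv = torch.nn.Conv2d(3, 16, 3, bias=False)
q, scale = quantize_per_channel(conv.weight.data)

# Dequantize and compare against the full-precision convolution.
w_deq = q.float() * scale
x = torch.randn(1, 3, 32, 32)
err = (F.conv2d(x, w_deq) - conv(x)).abs().max()
```

The int8 tensor is 4x smaller than the float32 weight; the remaining question raised above is doing the conv itself on the quantized representation on GPU.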
Motivation
Reduce the memory footprint of modules that use convolutional layers through quantization.
Your contribution
Yes, I am willing to work on implementing convolutional kernels if they can be integrated with this library.