Supporting inference with EETQ quantized model

### Feature request

EETQ-quantized models give very good quality in my case, but loading them is pretty slow. If the base model is already quantized with EETQ, LoRAX should load it directly and skip the JIT quantization. Currently, loading such a model fails because the related layers cannot be found.
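To make the request concrete, here is a minimal sketch of the decision it implies: check whether a checkpoint is already EETQ-quantized and, if so, skip JIT quantization. It assumes the checkpoint's `config.json` carries a Hugging Face-style `quantization_config` block with `quant_method` set to `"eetq"`. The function names and the returned labels are hypothetical and are not part of LoRAX.

```python
from typing import Any, Mapping


def is_prequantized_eetq(config: Mapping[str, Any]) -> bool:
    """Return True if the model config declares EETQ quantization.

    Assumption: the checkpoint was saved with a `quantization_config`
    entry whose `quant_method` is "eetq".
    """
    quant_cfg = config.get("quantization_config") or {}
    return str(quant_cfg.get("quant_method", "")).lower() == "eetq"


def choose_load_path(config: Mapping[str, Any]) -> str:
    """Pick a loading strategy (hypothetical labels, for illustration only)."""
    if is_prequantized_eetq(config):
        # Weights are already quantized: load them as-is.
        return "load_prequantized"
    # Otherwise quantize the full-precision weights at load time.
    return "jit_quantize"
```

For example, `choose_load_path({"quantization_config": {"quant_method": "eetq"}})` selects the direct-load path, while a config without a `quantization_config` entry falls back to JIT quantization.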

### Motivation

Speed up loading of EETQ-quantized models.

### Your contribution

I will prepare a PR for review. I will also need some help with parts of the implementation.

Supporting inference with EETQ quantized model #391
