Feature request
Support for DBRX Instruct model in bitsandbytes
Motivation
DBRX Instruct is reported to be one of the strongest open LLMs, but its 132B parameters make it unusable for most people without quantization. I tried this:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TextStreamer

model_id = "/home/maziyar/.cache/huggingface/hub/models--databricks--dbrx-instruct/"

nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_nf4 = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=nf4_config,
    device_map="auto",
    trust_remote_code=True,
)
But it loads the model in full precision rather than in 4-bit. (Maybe I am missing something.)
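For reference, here is a rough sanity check I would run after loading (my own sketch, not part of the original report): it counts how many linear layers were actually replaced by bitsandbytes 4-bit modules and prints the memory footprint, which should be roughly a quarter of the bf16 size if quantization took effect.

# Diagnostic sketch (assumes model_nf4 was loaded as above): check whether
# the weights were actually quantized to 4-bit by bitsandbytes.
import bitsandbytes as bnb

n_linear, n_4bit = 0, 0
for name, module in model_nf4.named_modules():
    if isinstance(module, torch.nn.Linear):
        n_linear += 1
    if isinstance(module, bnb.nn.Linear4bit):
        n_4bit += 1

print(f"Linear layers: {n_linear}, of which 4-bit quantized: {n_4bit}")
# Roughly ~65 GB would suggest 4-bit weights; ~260+ GB would mean bf16 was loaded in full.
print(f"Memory footprint: {model_nf4.get_memory_footprint() / 1e9:.1f} GB")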
Your contribution
I am willing to test any PR.