Feature request
Support for DBRX Instruct model in bitsandbytes
Motivation
DBRX Instruct is reported to be one of the strongest open LLMs, but its 132B parameters make it unusable for most people without quantization. I tried this:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TextStreamer

model_id = "/home/maziyar/.cache/huggingface/hub/models--databricks--dbrx-instruct/"

nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_nf4 = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=nf4_config,
    device_map="auto",
    trust_remote_code=True,
)
But it loads the model in full precision rather than in 4-bit. (Maybe I am missing something.)
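For reference, here is a rough sanity check I would run after loading (my own sketch, not part of the original report): it counts how many linear layers were actually replaced by bitsandbytes 4-bit modules and prints the memory footprint, which should be roughly a quarter of the bf16 size if quantization took effect.

# Diagnostic sketch (assumes model_nf4 was loaded as above): check whether
# the weights were actually quantized to 4-bit by bitsandbytes.
import bitsandbytes as bnb

n_linear, n_4bit = 0, 0
for name, module in model_nf4.named_modules():
    if isinstance(module, torch.nn.Linear):
        n_linear += 1
    if isinstance(module, bnb.nn.Linear4bit):
        n_4bit += 1

print(f"Linear layers: {n_linear}, of which 4-bit quantized: {n_4bit}")
# Roughly ~65 GB would suggest 4-bit weights; ~260+ GB would mean bf16 was loaded in full.
print(f"Memory footprint: {model_nf4.get_memory_footprint() / 1e9:.1f} GB")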
Your contribution
I am willing to test any PR.