📚 Paper (coming soon) | 🤗 HuggingFace Collection |
💬 Discussions Page | 📘 IBM Granite Docs & Granite Cookbooks
Granite 3.3 language models are lightweight, state-of-the-art, open foundation models that natively support multilinguality, coding, reasoning, and tool usage, and can run on constrained compute resources. All the models are publicly released under an Apache 2.0 license for both research and commercial use. The models' data curation and training procedure were designed for enterprise usage and customization, with a process that evaluates datasets against governance, risk, and compliance (GRC) criteria, in addition to IBM's standard data clearance process and document quality checks.
Granite 3.3 models retain key capabilities of earlier versions, such as long context support and features to control response length and originality through annotations. Additionally, the models introduce fill-in-the-middle (FIM) support for code completion, sketched below, and improve the clarity of model reasoning by separating intermediate thoughts from final answers. Granite 3.3 models are available in two different sizes and are built on a dense architecture.
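As a rough illustration of FIM prompting, the sketch below wraps the code before and after a gap in sentinel tokens. These tokens follow the convention used by earlier Granite Code models and are an assumption here; consult the Granite 3.3 model cards for the exact tokens.

# Build a fill-in-the-middle prompt. The sentinel tokens below are assumed
# (Granite Code convention); verify them in the Granite 3.3 model cards.
prefix = "def mean(xs):\n    total = "
suffix = "\n    return total / len(xs)\n"
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
# Feed fim_prompt to a Granite 3.3 base checkpoint exactly like a normal
# prompt (see the quickstart below); the generated tokens fill the middle span.
print(fim_prompt)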
Granite 3.3 was trained on synthetic data generated from a variety of open source LLMs, including but not limited to Mistral and Gemma. Gemma is provided under and subject to the Gemma Terms of Use found at https://ai.google.dev/gemma/terms. A detailed attribution of datasets can be found in the author list.
We release base model checkpoints, taken after pretraining, as well as instruct checkpoints, fine-tuned for dialogue, instruction-following, helpfulness, and safety. Comprehensive evaluation results for all model variants, as well as other relevant information, will be available in the Granite 3.3 Language Model Cards.
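As a minimal illustration of the difference, a base checkpoint is prompted with raw text and simply continues it, with no chat template involved (a sketch; the instruct quickstart follows below):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base checkpoints are plain language models: prompt them with raw text
# and they continue the text, without any chat formatting.
model_path = "ibm-granite/granite-3.3-2b-base"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
model.eval()
inputs = tokenizer("The largest ocean on Earth is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))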
To use any of our models, pick an appropriate model_path from:
ibm-granite/granite-3.3-2b-base
ibm-granite/granite-3.3-2b-instruct
ibm-granite/granite-3.3-8b-base
ibm-granite/granite-3.3-8b-instruct
This is a simple example of how to use the Granite-3.3-8B-Instruct model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ibm-granite/granite-3.3-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
model.eval()
# change input text as desired
chat = [
    { "role": "user", "content": "What is the largest ocean on Earth?" },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text and move tensors to the same device as the model
input_tokens = tokenizer(chat, return_tensors="pt").to(model.device)
# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# print output
print(output)
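The separated-reasoning behavior mentioned above is toggled through the chat template. The thinking keyword in the sketch below is an assumption (extra keyword arguments to apply_chat_template are forwarded to the template); check the Granite 3.3 documentation for the supported arguments.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-3.3-8b-instruct")
chat = [{ "role": "user", "content": "Is 97 a prime number?" }]
# `thinking=True` is an assumed template flag that asks the model to emit
# intermediate reasoning and the final answer in separate, delimited sections.
prompt = tokenizer.apply_chat_template(
    chat, tokenize=False, add_generation_prompt=True, thinking=True
)
print(prompt)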
The model of choice (ibm-granite/granite-3.3-8b-instruct in this example) can be cloned using:
git clone https://huggingface.co/ibm-granite/granite-3.3-8b-instruct
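Note that the model weights are stored with Git LFS, so you will typically need to install and initialize git-lfs before cloning:
git lfs install
git clone https://huggingface.co/ibm-granite/granite-3.3-8b-instruct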
Please check our Guidelines and Code of Conduct to contribute to our project.
The model cards for each model variant are available in their respective HuggingFace repository. Please visit our collection here.
All Granite 3.3 Language Models are distributed under the Apache 2.0 license.
Please let us know your comments about our family of language models by visiting our collection. Select the repository of the model you would like to provide feedback about. Then, go to the Community tab and click on New discussion. Alternatively, you can also post any questions or comments on our GitHub Discussions page.