Shashanka Venkataramanan*, Valentinos Pariza*, Mohammadreza Salehi, Lukas Knobel, Spyros Gidaris, Elias Ramzi, Andrei Bursuc†, Yuki M. Asano†
*: equal contribution †: equal advising
Valeo.ai, Paris; Fundamental AI Lab, UTN; VIS Lab, University of Amsterdam
Welcome to the official codebase for Franca (pronounced Fran-ka), the first fully open-source vision foundation model—including data, code, and pretrained weights.
Franca matches or surpasses the performance of leading proprietary models such as DINOv2, CLIP, and SigLIPv2. Built on a fully transparent training pipeline leveraging publicly available datasets like ImageNet-21K and LAION-600M, Franca advances the state of self-supervised learning (SSL) in vision.
Key contributions include:

- Nested Matryoshka Clustering: A parameter-efficient, multi-head clustering projector that refines feature representations into increasingly fine-grained clusters without increasing model size. This approach improves performance while reducing memory overhead for downstream applications (see the sketch after this list).
- RASA (Relative Absolute Spatial Attention): A novel positional disentanglement strategy that explicitly removes positional biases from dense representations, enhancing semantic encoding.
- CyclicMask: A simple masking strategy based on circular shifts that overcomes spatial imbalance in masking augmentations.
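To make the nested-clustering idea concrete, here is a minimal, illustrative sketch, not Franca's actual projector: the class name, prefix dimensions, and cluster counts are made up for the example. The point it shows is that each clustering head only sees a prefix of the same embedding, so coarse-to-fine granularities reuse the same nested features rather than requiring full-width projections per level.

```python
import torch
import torch.nn as nn

class NestedClusteringHeads(nn.Module):
    """Illustrative sketch of nested (Matryoshka-style) clustering heads.

    NOT Franca's actual projector: the name, prefix dimensions, and cluster
    counts are invented for this example. Each head only sees the first `d`
    dimensions of the same embedding, so finer-grained heads reuse the nested
    prefix features instead of needing full-width projections.
    """

    def __init__(self, prefix_dims=(192, 384, 768), num_clusters=(1024, 4096, 16384)):
        super().__init__()
        assert len(prefix_dims) == len(num_clusters)
        self.prefix_dims = prefix_dims
        self.heads = nn.ModuleList(
            nn.Linear(d, k, bias=False) for d, k in zip(prefix_dims, num_clusters)
        )

    def forward(self, x):
        # x: (batch, embed_dim) with embed_dim >= max(prefix_dims)
        return [head(x[:, :d]) for d, head in zip(self.prefix_dims, self.heads)]

feats = torch.randn(8, 768)  # dummy embeddings
for level, logits in enumerate(NestedClusteringHeads()(feats)):
    print(f"level {level}: cluster logits {tuple(logits.shape)}")
```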
Despite training on large-scale, uncurated, open-source internet data, Franca generalizes well across model scales and excels on a wide range of downstream tasks, including in-context learning (Hummingbird benchmark), out-of-distribution detection, 3D understanding, and various image classification benchmarks.
Franca is released as a research project to promote transparency, reproducibility, and broad accessibility in vision foundation models. We aim to establish a new benchmark for open and generalizable AI models that empower the scientific community.
- [July 2025] Official code and pretrained models released! 🔥
| model | # of params | Dataset | Resolution (px) | ImageNet k-NN | ImageNet linear | Hummingbird VOC | ADE20K linear segm. | download (backbone) | download (RASA head) |
|---|---|---|---|---|---|---|---|---|---|
| ViT-B/14 | 86 M | In21K | 518 | 77.4% | 82.0% | 75.7% | 39.1% | backbone only | RASA head |
| ViT-L/14 | 300 M | In21K | 224 | 82.2% | 84.5% | 73.5% | 41.3% | backbone only | RASA head |
| ViT-L/14 | 300 M | LAION-600M | 224 | 81.9% | 84.4% | 73.5% | 41.4% | backbone only | RASA head |
| ViT-g/14 | 1,100 M | In21K | 224 | 83.0% | 85.9% | 71.7% | 40.2% | part 1, part 2, part 3 | RASA head |
| ViT-g/14 | 1,100 M | LAION-600M | 224 | 81.2% | 85.0% | 76.7% | 42.4% | part 1, part 2, part 3 | RASA head |
Please follow the instructions here to install PyTorch (the only required dependency for loading the model). Installing PyTorch with CUDA support is strongly recommended.
```python
import torch

# Franca -- In21K
franca_vitb14 = torch.hub.load('valeoai/Franca', 'franca_vitb14', use_rasa_head=True)
franca_vitl14 = torch.hub.load('valeoai/Franca', 'franca_vitl14', use_rasa_head=True)
franca_vitg14 = torch.hub.load('valeoai/Franca', 'franca_vitg14', use_rasa_head=True)

# Franca -- LAION-600M
franca_vitl14_laion = torch.hub.load('valeoai/Franca', 'franca_vitl14', weights='LAION', use_rasa_head=True)
franca_vitg14_laion = torch.hub.load('valeoai/Franca', 'franca_vitg14', weights='LAION', use_rasa_head=True)

# DINOv2 baseline -- In21K
dinov2_vitb14 = torch.hub.load('valeoai/Franca', 'franca_vitb14', weights='DINOV2_IN21K', use_rasa_head=False)
dinov2_vitl14 = torch.hub.load('valeoai/Franca', 'franca_vitl14', weights='DINOV2_IN21K', use_rasa_head=False)
```
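For a quick sanity check of a hub-loaded backbone, the sketch below runs a dummy forward pass. It assumes the hub models expose the same `forward_features` interface and output keys used in the checkpoint-loading example further down; the shapes in the comments are for ViT-B/14 at 224x224 input.

```python
import torch

# Quick sanity check of a hub-loaded backbone (assumes the forward_features
# interface shown in the checkpoint-loading example below).
model = torch.hub.load('valeoai/Franca', 'franca_vitb14', use_rasa_head=True)
model.eval()

x = torch.randn(1, 3, 224, 224)  # dummy image batch
with torch.no_grad():
    feats = model.forward_features(x, use_rasa_head=True)

print(feats["x_norm_clstoken"].shape)     # CLS token, e.g. (1, 768) for ViT-B/14
print(feats["x_norm_patchtokens"].shape)  # patch tokens, e.g. (1, 256, 768) at 224x224 with 14x14 patches
```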
```bash
git clone https://github.com/valeoai/Franca.git
cd Franca

# To install Franca and RASA, use the following command
pip install -e ".[franca]"

# To install RASA separately, use the following command
pip install -e .
```
We recommend installing torch separately to match your specific configuration. Similarly, Franca relies on xFormers / cuML and RASA relies on faiss-gpu, which we also recommend installing yourself. Otherwise, you can use the following commands:
```bash
# Install franca with additional dependencies
pip install -e ".[franca,torch,cuml,xformers]"

# Install rasa with additional dependencies
pip install -e ".[torch,faiss]"
```
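To verify the editable install, a minimal check like the one below can be used; the top-level module names `franca` and `rasa` are assumed from the import paths used in the loading example further down.

```python
# Optional sanity check of the editable install (module names assumed from
# the import paths used in the checkpoint-loading example below).
import franca
import rasa

print("franca and rasa imported OK")
```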
To load a Franca model directly from a local checkpoint (download link coming soon), use the example below:
```python
import torch
from PIL import Image
from torchvision import transforms

from franca.hub.backbones import _make_franca_model
from rasa.src.rasa_head import RASAHead

# --- Step 1: Choose model config ---
arch_name = "vit_large"
img_size = 224
ckpt_path = "<your path>/franca_vitl14_In21K.pth"
rasa_ckpt_path = "<your path>/franca_vitl14_In21K_rasa.pth"

# Define the image transformation
transform = transforms.Compose([
    transforms.Resize(256, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

# --- Step 2: Build and load model ---
model = _make_franca_model(
    arch_name=arch_name,
    img_size=img_size,
    pretrained=True,
    local_state_dict=ckpt_path,
    RASA_local_state_dict=rasa_ckpt_path,
    use_rasa_head=True,
)

# --- Step 3: Forward pass ---
model.cuda()
model.eval()

image = Image.open("assets/dog.jpg").convert("RGB")
x = transform(image).unsqueeze(0).cuda()

with torch.no_grad():
    feats = model.forward_features(x, use_rasa_head=True)

cls_token = feats["x_norm_clstoken"]
patch_tokens = feats["x_norm_patchtokens"]
patch_tokens_debiased = feats["patch_token_rasa"]

print("CLS token shape:", cls_token.shape)
print("Patch token shape:", patch_tokens.shape)
print("Patch token RASA shape:", patch_tokens_debiased.shape)
```
- Dataset preparation: Dataset prep
- Training: Training Details
- Model card: Models
- RASA usage: RASA README
If you use Franca in your research, please cite:
@article{venkataramanan2025franca,
title={Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning},
author={Venkataramanan, Shashanka and Pariza, Valentinos and Salehi, Mohammadreza and Knobel, Lukas and Gidaris, Spyros and Ramzi, Elias and Bursuc, Andrei and Asano, Yuki M.},
journal={arXiv preprint arXiv:2507.14137},
year={2025}
}
We thank the DINOv2 team for their excellent codebase. We also gratefully acknowledge the authors of OpenOOD, Open-hummingbird, Probe3D, and Feat2GS for open-sourcing their code, which we used for our downstream evaluations.