
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning

Shashanka Venkataramanan*, Valentinos Pariza*, Mohammadreza Salehi, Lukas Knobel, Spyros Gidaris, Elias Ramzi, Andrei Bursuc†, Yuki M. Asano†

*: equal contribution †: equal advising

Valeo.ai, Paris; Fundamental AI Lab, UTN; VIS Lab, University of Amsterdam


Franca overview

Welcome to the official codebase for Franca (pronounced Fran-ka), the first fully open-source vision foundation model—including data, code, and pretrained weights.

Franca matches or surpasses the performance of leading proprietary models such as DINOv2, CLIP, and SigLIPv2. Built on a fully transparent training pipeline leveraging publicly available datasets like ImageNet-21K and LAION-600M, Franca advances the state of self-supervised learning (SSL) in vision.

Key contributions include:

  • Nested Matryoshka Clustering: A parameter-efficient, multi-head clustering projector that refines feature representations into increasingly fine-grained clusters without increasing model size. This approach improves performance while reducing memory overhead for downstream applications (a minimal sketch follows this list).

  • RASA (Relative Absolute Spatial Attention): A novel positional disentanglement strategy that explicitly removes positional biases from dense representations, enhancing semantic encoding.

  • CyclicMask: A simple masking strategy that uses circular shifts to overcome spatial imbalance in masking augmentations (also sketched below).
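To make the first idea concrete, here is a minimal sketch of nested clustering heads: each head scores a nested prefix of the feature vector against its own set of prototypes, so finer clusterings cost only small extra linear heads rather than full-width projectors. The dimensions, cluster counts, and slice-to-granularity pairing below are our own illustrative assumptions, not the released Franca projector.

import torch
import torch.nn as nn

class NestedClusteringHeads(nn.Module):
    """Illustrative Matryoshka-style nested clustering heads (sketch only)."""

    def __init__(self, dim=768, cluster_counts=(1024, 2048, 4096)):
        super().__init__()
        n = len(cluster_counts)
        # Nested prefixes of the feature vector, e.g. dim=768 -> 192, 384, 768.
        self.slices = [dim // 2 ** (n - 1 - i) for i in range(n)]
        # One small linear scoring head per granularity level.
        self.heads = nn.ModuleList(
            nn.Linear(d, k, bias=False)
            for d, k in zip(self.slices, cluster_counts)
        )

    def forward(self, feats):
        # feats: (batch, dim); head i only sees the first slices[i] channels.
        return [head(feats[:, :d]) for head, d in zip(self.heads, self.slices)]

heads = NestedClusteringHeads()
logits = heads(torch.randn(8, 768))
print([tuple(l.shape) for l in logits])  # [(8, 1024), (8, 2048), (8, 4096)]

CyclicMask can be pictured just as compactly. A hedged sketch, assuming the mask lives on the patch grid: rolling one mask by a random circular offset spreads the masked region uniformly over the image instead of favoring any fixed location.

import torch

def cyclic_shift_mask(mask):
    # mask: (H, W) boolean grid of masked patch positions.
    h, w = mask.shape
    dy = int(torch.randint(0, h, (1,)))
    dx = int(torch.randint(0, w, (1,)))
    # torch.roll wraps values around, so the mask ratio is preserved exactly.
    return torch.roll(mask, shifts=(dy, dx), dims=(0, 1))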

Despite training on large-scale, uncurated, open-source internet data, Franca generalizes strongly across model scales and excels on a wide range of downstream tasks, including in-context learning (HummingBird benchmark), out-of-distribution detection, 3D understanding, and various image classification benchmarks.

Franca is released as a research project to promote transparency, reproducibility, and broad accessibility in vision foundation models. We aim to establish a new benchmark for open and generalizable AI models that empower the scientific community.


News

  • [July 2025] Official code and pretrained models released! 🔥

Pretrained models

| model | # of params | Dataset | Resolution | ImageNet k-NN | ImageNet linear | HummingBird VOC | Linear Segm. ADE20K | download (backbone) | download (RASA head) |
|---|---|---|---|---|---|---|---|---|---|
| ViT-B/14 | 86 M | In21K | 518 | 77.4% | 82.0% | 75.7% | 39.1% | backbone only | RASA head |
| ViT-L/14 | 300 M | In21K | 224 | 82.2% | 84.5% | 73.5% | 41.3% | backbone only | RASA head |
| ViT-L/14 | 300 M | LAION-600M | 224 | 81.9% | 84.4% | 73.5% | 41.4% | backbone only | RASA head |
| ViT-g/14 | 1,100 M | In21K | 224 | 83.0% | 85.9% | 71.7% | 40.2% | part 1, part 2, part 3 | RASA head |
| ViT-g/14 | 1,100 M | LAION-600M | 224 | 81.2% | 85.0% | 76.7% | 42.4% | part 1, part 2, part 3 | RASA head |

Pretrained backbones (via PyTorch Hub)

Please follow the instructions here to install PyTorch (the only required dependency for loading the model). Installing PyTorch with CUDA support is strongly recommended.

import torch

# Franca -- In21K
franca_vitb14 = torch.hub.load('valeoai/Franca', 'franca_vitb14', use_rasa_head=True)
franca_vitl14 = torch.hub.load('valeoai/Franca', 'franca_vitl14', use_rasa_head=True)
franca_vitg14 = torch.hub.load('valeoai/Franca', 'franca_vitg14', use_rasa_head=True)

# Franca -- LAION-600M
franca_vitl14_laion = torch.hub.load('valeoai/Franca', 'franca_vitl14', weights='LAION', use_rasa_head=True)
franca_vitg14_laion = torch.hub.load('valeoai/Franca', 'franca_vitg14', weights='LAION', use_rasa_head=True)

# DINOv2 baseline -- In21K
dinov2_vitb14 = torch.hub.load('valeoai/Franca', 'franca_vitb14', weights='DINOV2_IN21K', use_rasa_head=False)
dinov2_vitl14 = torch.hub.load('valeoai/Franca', 'franca_vitl14', weights='DINOV2_IN21K', use_rasa_head=False)
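As a quick sanity check after loading, you can run a dummy forward pass. This is a sketch: it assumes the hub models expose the forward_features interface used in the inference example below, and the shapes shown are for ViT-L/14 at 224x224 (a 16x16 grid of 14-pixel patches, embedding dimension 1024).

x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    feats = franca_vitl14.forward_features(x)
print(feats["x_norm_clstoken"].shape)     # torch.Size([1, 1024])
print(feats["x_norm_patchtokens"].shape)  # torch.Size([1, 256, 1024])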

Installation

git clone https://github.com/valeoai/Franca.git
cd Franca
# To install both Franca and RASA, use:
pip install -e ".[franca]"
# To install RASA separately, use:
pip install -e .

We recommend installing torch separately to match your specific configuration. Similarly, Franca relies on xFormers / cuML and RASA relies on faiss-gpu, which we also recommend installing yourself.

Otherwise, you can use the following commands:

# Install Franca with additional dependencies
pip install -e ".[franca,torch,cuml,xformers]"
# Install RASA with additional dependencies
pip install -e ".[torch,faiss]"

Inference code

To load a Franca model directly from a local checkpoint (download links: coming soon), use the example below:

import torch
from PIL import Image
from torchvision import transforms
from franca.hub.backbones import _make_franca_model
from rasa.src.rasa_head import RASAHead

# --- Step 1: Choose model config ---
arch_name = "vit_large"
img_size = 224
ckpt_path = "<your path>/franca_vitl14_In21K.pth"
rasa_ckpt_path = "<your path>/franca_vitl14_In21K_rasa.pth"

# Define image transformation
transform = transforms.Compose([
    transforms.Resize(256, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
])

# --- Step 2: Build and load model ---
model = _make_franca_model(
    arch_name=arch_name,
    img_size=img_size,
    pretrained=True,
    local_state_dict=ckpt_path,
    RASA_local_state_dict=rasa_ckpt_path,
    use_rasa_head=True
)


# --- Step 3: Forward pass ---
model.cuda()
model.eval()

image = Image.open("assets/dog.jpg").convert("RGB")  # force 3 channels to match the normalization
x = transform(image).unsqueeze(0).cuda()


with torch.no_grad():
    feats = model.forward_features(x, use_rasa_head=True)
    cls_token = feats["x_norm_clstoken"]
    patch_tokens = feats["x_norm_patchtokens"]
    patch_tokens_debiased = feats["patch_token_rasa"]

print("CLS token shape:", cls_token.shape)
print("Patch token shape:", patch_tokens.shape)
print("Patch token RASA shape:", patch_tokens_debiased.shape)


Citation

If you use Franca in your research, please cite:

@article{venkataramanan2025franca,
  title={Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning},
  author={Venkataramanan, Shashanka and Pariza, Valentinos and Salehi, Mohammadreza and Knobel, Lukas and Gidaris, Spyros and Ramzi, Elias and Bursuc, Andrei and Asano, Yuki M.},
  journal={arXiv preprint arXiv:2507.14137},
  year={2025}
}

Acknowledgments

We thank the DINOv2 team for their excellent codebase. We also gratefully acknowledge the authors of OpenOOD, Open-hummingbird, Probe3D, and Feat2GS for their open-sourced code, which we used for downstream evaluations.