The data-hungry nature of the Vision Transformer (ViT) hinders its widespread application in data-scarce scenarios.
We collect a list of papers on data-efficient training methods below.
Contributions of data-efficient ViT papers are welcome.
- Data-Efficient Multi-Scale Fusion Vision Transformer
  TLDR: incorporate multi-scale vision tokens to improve data efficiency when training Vision Transformers (Sketch 1 below).
  Data-efficient training on the CIFAR10, CIFAR100, EMNIST, Fashion-MNIST, and Caltech101 datasets.
- Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning
  TLDR: construct multi-grained correspondences between positive views for contrastive learning, capturing representations at different semantic granularities (Sketch 2 below).
  Data-efficient training on the COCO, PASCAL VOC, and ADE20K datasets.
- Inter-Instance Similarity Modeling for Contrastive Learning
  TLDR: construct inter-instance similarity between different image instances via a patch-mix strategy, encouraging the model to capture the similarity between natural images (Sketch 3 below).
  Data-efficient training on the CIFAR10 and CIFAR100 datasets.
- Asymmetric Patch Sampling for Contrastive Learning
  [link], [arxiv], [code], [2025]
  TLDR: construct hard positive pairs to encourage more appearance-invariant representations (Sketch 4 below).
  Data-efficient training on the CIFAR10 and CIFAR100 datasets.
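
Sketch 1, for the multi-scale fusion entry: a minimal, hypothetical tokenizer that patch-embeds an image at two scales and concatenates the token sequences so a transformer encoder can attend across scales. The `MultiScaleTokenizer` name, dimensions, and patch sizes are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MultiScaleTokenizer(nn.Module):
    """Hypothetical sketch: embed an image at two patch scales and
    concatenate the resulting token sequences."""
    def __init__(self, dim=192, patch_sizes=(4, 8), in_ch=3):
        super().__init__()
        # One convolutional "patchify" stem per scale; stride = patch size.
        self.stems = nn.ModuleList(
            nn.Conv2d(in_ch, dim, kernel_size=p, stride=p) for p in patch_sizes
        )

    def forward(self, x):
        tokens = []
        for stem in self.stems:
            t = stem(x)                                  # (B, dim, H/p, W/p)
            tokens.append(t.flatten(2).transpose(1, 2))  # (B, N_p, dim)
        # Concatenate along the token axis; a standard transformer encoder
        # then attends across both scales jointly.
        return torch.cat(tokens, dim=1)

x = torch.randn(2, 3, 32, 32)             # CIFAR-sized input
print(MultiScaleTokenizer()(x).shape)     # torch.Size([2, 80, 192]): 64 + 16 tokens
```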
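Sketch 2, for the multi-grained contrast entry: a toy two-granularity InfoNCE loss over patch tokens from two views. The grouping of patch tokens, the equal loss weights, and the assumption that the views are spatially aligned are all illustrative guesses, not the paper's method.

```python
import torch
import torch.nn.functional as F

def info_nce(q, k, tau=0.2):
    """Standard InfoNCE: row i of q and row i of k are positives."""
    q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
    logits = q @ k.t() / tau
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

def multi_grained_loss(tokens_a, tokens_b, group=4):
    """tokens_*: (B, N, D) patch tokens from two views of the same images.
    Assumes the views are spatially aligned, so patch i matches patch i."""
    B, N, D = tokens_a.shape
    # Fine grain: average small groups of patch tokens, contrast group-wise.
    fine_a = tokens_a.reshape(B * (N // group), group, D).mean(1)
    fine_b = tokens_b.reshape(B * (N // group), group, D).mean(1)
    # Coarse grain: global average pooling, contrast image-wise.
    coarse_a, coarse_b = tokens_a.mean(1), tokens_b.mean(1)
    return info_nce(coarse_a, coarse_b) + info_nce(fine_a, fine_b)

loss = multi_grained_loss(torch.randn(8, 64, 192), torch.randn(8, 64, 192))
```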
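Sketch 3, for the inter-instance similarity entry: a toy patch-mix augmentation that swaps a random subset of patch tokens between paired images in a batch, plus one assumed way of turning the mix ratio into soft similarity targets. The `patch_mix` function and the target construction are hypothetical.

```python
import torch

def patch_mix(tokens, mix_ratio=0.3):
    """tokens: (B, N, D). Replace a random subset of each image's patch
    tokens with the corresponding tokens of a partner image in the batch."""
    B, N, D = tokens.shape
    partner = torch.randperm(B)                       # pair images within the batch
    n_mix = int(N * mix_ratio)
    idx = torch.rand(B, N).argsort(dim=1)[:, :n_mix]  # random patches per image
    batch = torch.arange(B).unsqueeze(1)
    mixed = tokens.clone()
    mixed[batch, idx] = tokens[partner][batch, idx]
    return mixed, partner

mixed, partner = patch_mix(torch.randn(8, 64, 192), mix_ratio=0.3)
# Assumed soft targets: each mixed image matches itself with weight 1 - r
# and its mixing partner with weight r, instead of a hard one-hot label.
r, B = 0.3, mixed.size(0)
targets = (1 - r) * torch.eye(B) + r * torch.eye(B)[partner]
```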
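Sketch 4, for the asymmetric patch sampling entry: two positive views built from disjoint, differently sized patch subsets of the same image. The sampling ratios and the disjointness construction are assumptions for illustration only.

```python
import torch

def asymmetric_views(tokens, ratio_a=0.25, ratio_b=0.75):
    """tokens: (B, N, D). Return two views drawn from disjoint patch subsets,
    one sparse and one dense (requires ratio_a + ratio_b <= 1)."""
    B, N, D = tokens.shape
    order = torch.rand(B, N).argsort(dim=1)  # random patch permutation per image
    n_a, n_b = int(N * ratio_a), int(N * ratio_b)
    batch = torch.arange(B).unsqueeze(1)
    view_a = tokens[batch, order[:, :n_a]]            # sparse view
    view_b = tokens[batch, order[:, n_a:n_a + n_b]]   # denser view, zero overlap
    return view_a, view_b

va, vb = asymmetric_views(torch.randn(8, 64, 192))
print(va.shape, vb.shape)   # torch.Size([8, 16, 192]) torch.Size([8, 48, 192])
```

Because the two views share no patches, agreement between them cannot rely on low-level appearance, which is one way to read "hard positive pairs".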