This repo contains practice code on vision transformers, taken from different sources but with some customizations to aid understanding. It primarily contains Google Colab notebooks, so it can be used by anyone who wants to quickly fine-tune a pre-trained ViT model.
For more information on vision transformers, please refer to the original paper: "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" (https://arxiv.org/abs/2010.11929). The repo currently has a single .ipynb notebook, which shows how to use a custom patch embedding layer with a pre-trained ViT on the CIFAR-10 dataset.
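For illustration, here is a minimal sketch of what swapping in a custom patch embedding layer might look like, assuming PyTorch and the timm library; the class name `CustomPatchEmbed` and all parameter values are illustrative and may differ from the notebook's actual code.

```python
import torch
import torch.nn as nn
import timm

class CustomPatchEmbed(nn.Module):
    """Splits an image into non-overlapping patches and linearly embeds each one."""
    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to slicing out patches
        # and applying a shared linear projection to each.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                      # (B, embed_dim, H/P, W/P)
        return x.flatten(2).transpose(1, 2)   # (B, num_patches, embed_dim)

# Load a pre-trained ViT with a fresh 10-class head for CIFAR-10,
# then replace its patch embedding with the custom layer.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=10)
model.patch_embed = CustomPatchEmbed()

# CIFAR-10 images (32x32) would be resized to 224x224 before the forward pass.
dummy = torch.randn(1, 3, 224, 224)
print(model(dummy).shape)  # torch.Size([1, 10])
```

Note that the swapped-in embedding layer is randomly initialized, so it would typically be fine-tuned along with (or instead of) the rest of the pre-trained backbone.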
Feel free to ask me questions or point out errors.