This project is still experimental and the models have not been fine-tuned yet : it should work, but it has not been extensively tested. Furthermore, due to the weights conversion (from pytorch to tensorflow), performance may be slightly degraded*. This is why I will try to fine-tune the models a bit in the next updates.
\* The conversion has been tested by comparing the outputs of both models. The difference is really small (less than 0.001), but even this small difference can make the generation differ.
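To give a concrete idea of this check, here is a minimal sketch of such an output comparison. The `pt_mapper` / `tf_mapper` arguments and the 512-dimensional embedding are placeholders for the example, not the project's actual conversion code :

```python
import numpy as np

def compare_mappers(pt_mapper, tf_mapper, embedding_dim = 512, threshold = 1e-3):
    """ Compares a PyTorch mapper with its TensorFlow conversion on a random CLIP embedding """
    import torch    # only required for the original PyTorch model

    clip_embedding = np.random.normal(size = (1, embedding_dim)).astype(np.float32)

    with torch.no_grad():
        pt_out = pt_mapper(torch.from_numpy(clip_embedding)).numpy()
    tf_out = np.asarray(tf_mapper(clip_embedding))

    max_diff = np.abs(pt_out - tf_out).max()
    print('Max absolute difference : {:.6f}'.format(max_diff))
    # A difference below ~1e-3 means the conversion is numerically close,
    # but it can still change which token is selected during generation
    return max_diff < threshold
```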
Check the CHANGELOG file to have a global overview of the latest modifications ! 😋
├── custom_architectures
│   ├── transformers_arch
│   │   └── clip_cap_arch.py : ClipCap architecture (Mapper + main architecture)
│   └── clip_arch.py : CLIP architecture (used as image encoder)
├── custom_layers
├── custom_train_objects
├── datasets
├── hparams
├── loggers
├── models
│   ├── image_captioning
│   │   └── clip_cap.py : ClipCap main class
│   └── siamese : the CLIP model is used as image encoder
├── pretrained_models
├── unitest
├── utils
└── image_captioning.ipynb
Check the main project for more information about the unextended modules / structure / main classes.
* Check my Siamese Networks project for more information about the models/siamese module
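As a reminder of how the mapper defined in clip_cap_arch.py works conceptually (following the ClipCap paper) : the CLIP image embedding is projected into a fixed-length sequence of prefix embeddings that is prepended to the GPT-2 inputs. The snippet below is only an illustrative sketch of the MLP variant, with assumed dimensions (512-d CLIP embedding, prefix length 10, 768-d GPT-2 embeddings), and is not the actual implementation :

```python
import tensorflow as tf

def build_mlp_mapper(clip_dim = 512, prefix_length = 10, gpt2_dim = 768):
    """ Illustrative MLP mapper : (batch, clip_dim) -> (batch, prefix_length, gpt2_dim) """
    return tf.keras.Sequential([
        tf.keras.layers.Dense(
            (prefix_length * gpt2_dim) // 2, activation = 'tanh', input_shape = (clip_dim, )
        ),
        tf.keras.layers.Dense(prefix_length * gpt2_dim),
        # Each image embedding becomes a `prefix_length`-long sequence of GPT-2 embeddings
        tf.keras.layers.Reshape((prefix_length, gpt2_dim))
    ], name = 'clip_cap_mlp_mapper')

mapper = build_mlp_mapper()
prefix = mapper(tf.random.normal((1, 512)))     # shape (1, 10, 768), prepended to the GPT-2 inputs
```

The Transformer variant of the mapper described in the paper keeps the same input / output interface but uses attention layers instead of this MLP.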
You can check the image_captioning notebook for a concrete demonstration.
Available architectures :
I have not fine-tuned any model yet, so you can get pretrained weights from the official project (currently, only the transformer_weights is supported). The CLIP encoders are automatically downloaded when running the example notebook (this requires pytorch to be installed).
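As an illustration of how a CLIP encoder can be obtained with pytorch, the official openai/clip package can be used as follows. The ViT-B/32 variant and the image path are assumptions for the example, not necessarily what the notebook uses :

```python
import clip
import torch
from PIL import Image

# Downloads the CLIP checkpoint on the first call (pytorch is required)
model, preprocess = clip.load('ViT-B/32', device = 'cpu')

image = preprocess(Image.open('example.jpg')).unsqueeze(0)  # hypothetical image file
with torch.no_grad():
    embedding = model.encode_image(image)   # image embedding used as input of the ClipCap mapper
```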
You can find some illustrations in the official project
- Clone this repository :
git clone https://github.com/yui-mhcp/image_captioning.git
- Go to the root of this repository :
cd image_captioning
- Install requirements :
pip install -r requirements.txt
- Open the image_captioning notebook and follow the instructions !
- Make the TO-DO list
- Comment the code
- Implement ClipCap Transformers mapper
- Implement ClipCap MLP mapper
- Add pretrained weights for French
- Fine-tune GPT-2 with Transformers mapper
You can contact me at yui-mhcp@tutanota.com or on discord at yui#0732
The objective of these projects is to facilitate the development and deployment of useful applications using Deep Learning for solving real-world problems and helping people. For this purpose, all the code is under the Affero GPL (AGPL) v3 licence
All my projects are "free software", meaning that you can use, modify, deploy and distribute them freely, in compliance with the Licence. They are not in the public domain and are copyrighted : there are some conditions on distribution, but their objective is to make sure that everyone is able to use and share any modified version of these projects.
Furthermore, if you want to use any project in a closed-source project, or in a commercial project, you will need to obtain another Licence. Please contact me for more information.
For my protection, it is important to note that all projects are available on an "As Is" basis, without any warranties or conditions of any kind, either explicit or implied. However, do not hesitate to report issues on the repository's project or make a Pull Request to solve it 😄
If you use this project in your work, please add this citation to give it more visibility ! 😋
@misc{yui-mhcp,
author = {yui},
title = {A Deep Learning projects centralization},
year = {2021},
publisher = {GitHub},
howpublished = {\url{https://github.com/yui-mhcp}}
}
Github projects :
- ClipCap project : official ClipCap implementation
- CLIP project : official CLIP implementation
- My SiameseNetwork project : more information about the CLIP architecture and how it works

Papers :
- [1] ClipCap: CLIP Prefix for Image Captioning : official ClipCap paper
- [2] CLIP blog : OpenAI's official blog post about CLIP