This repo generates multi-language captions for images. We provide training code and two versions of the demo:
- DTU version demo: This version runs on DTU.
- GUI version demo: This version has a GUI and can run on GPU or CPU with a pt or onnx model. Note that it cannot run on DTU, because the existing DTU environment does not support a GUI.
- Training code: Code for training the image captioning model.
Please make sure that you have installed the following packages in your environment:
pip install --upgrade torch torchvision
pip install onnxruntime
pip install transformers
pip install datasets
pip install sacrebleu
pip install sentencepiece
pip install scikit-image
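To confirm that the environment is ready before running the demo, a small check like the one below can help. It is a minimal sketch, not part of this repo: the helper name `find_missing` and the `REQUIRED` mapping are assumptions made for illustration, and the import names (for example `skimage` for scikit-image) are the usual ones for these packages.

```python
import importlib.util

# Map of pip package name -> import name (the two differ for some packages).
# This mapping is an illustrative assumption, not a file shipped with the repo.
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "onnxruntime": "onnxruntime",
    "transformers": "transformers",
    "datasets": "datasets",
    "sacrebleu": "sacrebleu",
    "sentencepiece": "sentencepiece",
    "scikit-image": "skimage",
}


def find_missing(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be located."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

The check only looks up whether each module can be found; it does not import the packages or verify their versions.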
You can run this demo with the following command; the result appears on the last line of the log:
python -m inference_DTU_total --language zh --type DTU
Note that you can change the following parameters:
- language: Supported languages are en, zh, de, fr, and ro.
- type: Change this to torch or onnx to run inference with the PyTorch or ONNX model instead of on DTU.
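The two parameters could be validated with `argparse` roughly as follows. This is a hedged sketch only: the actual argument handling inside `inference_DTU_total` is not shown in this README, so the function name `build_parser`, the defaults, and the help strings are assumptions. Only the flag names and the allowed values come from the text above.

```python
import argparse

# Allowed values taken from the parameter descriptions in this README.
LANGUAGES = ["en", "zh", "de", "fr", "ro"]
MODEL_TYPES = ["DTU", "torch", "onnx"]


def build_parser():
    """Build a parser for the demo flags (hypothetical helper, not the repo's own code)."""
    parser = argparse.ArgumentParser(
        description="Multi-language image captioning demo (sketch)"
    )
    parser.add_argument(
        "--language",
        choices=LANGUAGES,
        default="zh",  # assumed default, mirroring the example command
        help="language of the generated caption",
    )
    parser.add_argument(
        "--type",
        choices=MODEL_TYPES,
        default="DTU",  # assumed default, mirroring the example command
        help="backend to run: DTU, or torch / onnx when running without DTU",
    )
    return parser


# Usage example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--language", "fr", "--type", "onnx"])
print(args.language, args.type)
```

Using `choices` makes the parser reject unsupported values such as `--language ja` with a clear error message instead of failing later during inference.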
TODO: Zewei
See the PyTorch-Tutorial-to-Image-Captioning folder for details.