
🐳 Add Dockerfile and documentation for using Docker image #134


Merged
merged 16 commits into from
Mar 7, 2025

Conversation

tonyzamyatin
Contributor

I don't have a GPU, so I need to rent a cloud GPU for experiments. Setting up the environment each time is tedious and time-consuming, especially when you terminate the cloud VM after use to save costs. This is why I created this Dockerfile and a pre-built image to streamline the process. 🚀

This PR adds:

  • A Dockerfile for running experiments in a consistent environment
  • Documentation on how to use the pre-built Docker image

The repo authors should feel free to push the image to their organizational registry and update the docs on the pre-built image accordingly to pull from their own registry.

@alafage alafage added the enhancement New feature or request label Mar 4, 2025
@o-laurent o-laurent changed the base branch from main to dev March 4, 2025 15:57
@o-laurent
Contributor

Thank you, @tonyzamyatin, for your PR! Having a Dockerfile would indeed be useful.

Here are a few comments that @alafage and I had:

  • Could we perhaps simplify it by installing the latest version of the package via pip instead of from source? As soon as we finish the new version of the documentation, we will be able to push to main more often.
  • If you think it should be done from source, we should avoid installing the [all] option, which includes the packages for the documentation and development.
  • Could we put the Dockerfile and the corresponding Markdown file in a folder named docker?
  • We should also add information on using the Dockerfile to the Sphinx documentation. We'll have to finish it in the coming days.


codecov bot commented Mar 4, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.16%. Comparing base (7d0d345) to head (eda886b).
Report is 15 commits behind head on dev.

Additional details and impacted files
@@           Coverage Diff           @@
##              dev     #134   +/-   ##
=======================================
  Coverage   99.15%   99.16%           
=======================================
  Files         143      143           
  Lines        7136     7149   +13     
  Branches      910      912    +2     
=======================================
+ Hits         7076     7089   +13     
  Misses         27       27           
  Partials       33       33           
Flag     Coverage          Δ
cpu      99.16% <ø>   (+<0.01%) ⬆️
pytest   99.16% <ø>   (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.


@o-laurent o-laurent deleted the branch ENSTA-U2IS-AI:dev March 4, 2025 17:53
@o-laurent o-laurent closed this Mar 4, 2025
@o-laurent o-laurent reopened this Mar 4, 2025
@tonyzamyatin
Contributor Author

Thank you for the feedback!

I originally structured the Dockerfile with a developer workflow in mind, where the package is installed from source to allow real-time modifications. This setup is useful for contributors working on a fork, as they can experiment and commit changes while ensuring updates apply immediately (I ran into this myself when I forgot to install in editable mode). In this case, installing [all] makes sense for a fully equipped development environment.

That said, I completely agree that a lightweight Docker image for end users would be beneficial. Installing from PyPI would streamline the build and ensure stability.

Why not do both? Two Dockerfiles: one for development (from source) and one for end users (from PyPI).

I also agree that moving everything Docker-related into a docker folder makes sense. Let me know your thoughts!
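
For illustration, the development variant might install the checkout in editable mode so that source edits take effect immediately. This is only a sketch: the base image, paths, and tag below are assumptions, not the PR's actual contents.

```dockerfile
# Hypothetical development Dockerfile (base image and paths are assumptions)
FROM pytorch/pytorch:latest
WORKDIR /workspace/torch-uncertainty
COPY . .
# Editable install so changes to the source tree apply without reinstalling
RUN pip install -e ".[all]"
```

An end-user variant would instead `RUN pip install torch-uncertainty` and skip copying the source tree.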

@tonyzamyatin
Contributor Author

tonyzamyatin commented Mar 5, 2025

I also tried moving the Dockerfile to the docker/ folder, but then the build fails because the dependency file (pyproject.toml) is outside the build context.

I have tested building the Dockerfile, pulling the Docker image from the registry, and spinning up the container through my cloud provider. I believe I caught all the bugs now 🤠
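
For reference, the build-context issue mentioned above is commonly worked around by running the build from the repository root and pointing docker at the file with -f, so pyproject.toml stays inside the context (the image tag here is only an example):

```shell
# Run from the repository root so pyproject.toml is inside the build context
docker build -f docker/Dockerfile -t torch-uncertainty-dev .
```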

@alafage alafage deleted the branch ENSTA-U2IS-AI:dev March 6, 2025 22:59
@alafage alafage closed this Mar 6, 2025
@o-laurent
Contributor

Sorry, we had some issues with the dev branch. Reopening.

@o-laurent o-laurent reopened this Mar 7, 2025
@o-laurent o-laurent self-requested a review March 7, 2025 09:46
Contributor

@o-laurent o-laurent left a comment


Thank you so much for the PR. I understand the need for a Docker image for development purposes, and your scripts seem pretty neat. For now, I don't feel that we need a production-ready Dockerfile, but that may change if we get an issue on this point or if you think it would be helpful for you.

I've got two small remarks:

  • We will have to add the contents of docker.md to the new doc
  • We might create a docker account to store the updated images that we could generate automatically with GitHub actions, I guess

You may want to switch to torchuncertainty[all] again if you think it makes it easier! I'll merge this tonight in dev.
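
The automatic image builds mentioned above could be wired up with GitHub Actions. A minimal sketch, in which the registry account, secret names, and image tag are all assumptions, not anything that exists yet:

```yaml
# Hypothetical workflow: build and push the Docker image on pushes to dev
name: docker-image
on:
  push:
    branches: [dev]
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          file: docker/Dockerfile
          push: true
          tags: example-org/torch-uncertainty:dev  # placeholder registry/tag
```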

@tonyzamyatin
Contributor Author

I strongly recommend providing a ready-to-pull Docker image for developers and contributors, as it would significantly lower the entry barrier and accelerate adoption of the library. The library itself is already very well-structured, with neat CLI interfaces that make it easy to use and extend. Personally, it took me 2–3 days to set up a VM workflow for running experiments on remote accelerators, and I believe this Docker setup would save future contributors a lot of time. Given the effort I put into designing it, I'd love to see it leveraged to help others contribute more efficiently.

By the way, do you have any specific ideas or preferences for how the Docker workflow should be structured moving forward? I'd be happy to iterate on it if there's anything you'd like to see differently!

@o-laurent
Contributor

o-laurent commented Mar 7, 2025

Hi @tonyzamyatin,

Sorry I wasn't clear. I of course see the interest of your Dockerfile, which I will happily merge into the library, and I thank you again for your contribution. I was writing about a potential second Dockerfile made only for production purposes, which would install TU directly from PyPI. We could discuss that in a second PR if you feel it would be beneficial?

In the end, do you want to roll back to installing all the dependencies instead of only the necessary & dev ones?

@tonyzamyatin
Contributor Author

No worries, just a little misunderstanding on my side. Thanks for clarifying ^^

  1. Do you think you need a dedicated Dockerfile for production if you already have TU on PyPI?
  2. I'll have another look at what TU installation is best for contributor/developer purposes. For one, I noticed that tensorboard is not included in any installation.

@o-laurent
Contributor

o-laurent commented Mar 7, 2025

1. Do you think you need a dedicated Dockerfile for production if you already have TU on PyPI?

This is the point: I'm not sure! I never use docker, and I'm completely unable to determine how useful it could be for potential users, given that TU is still mainly aimed at preparatory works.

2. I'll have another look at what TU installation is best for contributor/developer purposes. For one, I noticed that tensorboard is not included in any installation.

For now, tensorboard is only included in the [all] group, indeed. I'm completely open to restructuring the different optional dependency groups. Until now, we were adding other dependencies manually when we needed them.

It is clear that tensorboard will be more useful than glest, for instance. The [dev] category does not make much sense either, in that it includes huggingface-hub and safetensors, which are there to help on the experiment side of things and not for testing and linting. One solution would be to add an [experiment] group with tensorboard, huggingface-hub, and safetensors, but this is a bit out of the scope of the current PR. Maybe @alafage, do you have an opinion on this?
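
As a sketch, such an [experiment] group might look like this in pyproject.toml; the group name and membership follow the suggestion above and are not an agreed change:

```toml
[project.optional-dependencies]
# Hypothetical group for experiment-side tooling
experiment = [
    "tensorboard",
    "huggingface-hub",
    "safetensors",
]
```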

@tonyzamyatin
Contributor Author

tonyzamyatin commented Mar 7, 2025

Hi @o-laurent,

After thinking about it more, I see the PyPI package as primarily for use in other projects, while the Dockerfile is for those cloning/forking the repo, which are likely users looking to experiment and contribute. The primary use case of the Docker image is to make it seamless to get the repo onto a VM and run experiments out-of-the-box without setup.

We could create separate Docker images for running experiments vs. editing torch_uncertainty, but those running experiments from the repo are likely implementing their own ideas anyway. This makes a second image unnecessary. Keeping a single Dockerfile for both use cases would reduce confusion and encourage contributions.

What do you think?

@o-laurent
Contributor

o-laurent commented Mar 7, 2025

What do you think?

I agree; we can merge as is, maybe with [all] instead of [dev].

@tonyzamyatin
Contributor Author

Just to have it in this PR as well:
In the Docker container at least, the pre-commit hook doesn't run automatically; only the pre-commit checks do. 🧐

@o-laurent
Contributor

As long as one cannot push failing code, it's fine by me. Then, if we find a way to automatically re-commit after fixing, why not? Let's keep this for later. I let you push the installation of the pre-commits and then we can merge.

@tonyzamyatin
Contributor Author

I let you push the installation of the pre-commits and then we can merge.

They are already being installed as a part of [dev]:

dev = [
    "scikit-learn",
    "huggingface-hub",
    "torch_uncertainty[image]",
    "ruff==0.7.4",
    "pytest-cov",
    "pre-commit",
    "pre-commit-hooks",
]

@o-laurent
Contributor

I let you push the installation of the pre-commits and then we can merge.

They are already being installed as a part of [dev]:

...

This is the installation of the package, I mean the actual setup of the hooks with

pre-commit install
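
One way to run this at container start is a small entrypoint script. This is only a sketch; the script name and the assumption that the repository checkout is the working directory are mine, not the PR's actual setup:

```shell
#!/bin/sh
# Hypothetical docker/entrypoint.sh: set up the git hooks once, then
# hand control to whatever command the container was started with.
pre-commit install --install-hooks
exec "$@"
```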

@tonyzamyatin
Contributor Author

Before merging I would like to hear your comment on this related issue @o-laurent. Thanks! 🤠

@tonyzamyatin
Contributor Author

tonyzamyatin commented Mar 7, 2025

I don't get why the style check fails lol

The pre-commit hook didn't flag anything:

(torch-uncertainty) torch-uncertainty$ git commit -m ":whale: Install pre-commit hooks on container start"
end-of-file-fixer........................................................Passed
check-toml...........................................(no files to check)Skipped
check-yaml...........................................(no files to check)Skipped
check-merge-conflict.....................................................Passed
check-added-large-files..................................................Passed
ruff-check...........................................(no files to check)Skipped
ruff-format..........................................(no files to check)Skipped
[add-dockerfile 2242805] :whale: Install pre-commit hooks on container start
 2 files changed, 4 insertions(+), 1 deletion(-)

Neither did ruff:

(torch-uncertainty) torch-uncertainty$ python3 -m ruff format torch_uncertainty --check
192 files already formatted
(torch-uncertainty) torch-uncertainty$ 

@o-laurent
Contributor

o-laurent commented Mar 7, 2025

This is due to @alafage not updating his ruff 😄 fixing is in progress

@o-laurent
Contributor

Well, I can't test with the new code version without pushing, but we can proceed and merge! (and hope!)

Thanks for your help, @tonyzamyatin!

@o-laurent o-laurent merged commit 0f2dca0 into ENSTA-U2IS-AI:dev Mar 7, 2025
0 of 2 checks passed
@tonyzamyatin
Contributor Author

tonyzamyatin commented Mar 7, 2025

From the related issue:

This means that we can proceed with merging #134

Wait, I will make it part of the Dockerfile installation then, until motion blur is implemented in this repo!

@o-laurent
Contributor

No, I don't think we need to bother with this; we'll implement motion blur quickly; it doesn't seem very complicated.

Labels
enhancement New feature or request