[Jobs] Add huggingface-cli jobs commands #3211

Open: wants to merge 32 commits into base: main

lhoestq
Member

@lhoestq lhoestq commented Jul 10, 2025

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@lhoestq lhoestq marked this pull request as ready for review July 11, 2025 14:04
@hanouticelina hanouticelina self-assigned this Jul 11, 2025
@lhoestq
Member Author

lhoestq commented Jul 11, 2025

This is ready for review ahead of the launch in the coming days! It would be cool to do a release right after we merge.

Btw I integrated your addition @davanstrien from lhoestq/hfjobs#8 and added some useful uv options: --with and --python (we could add more later if needed)

Member

It's fine as an experiment, but I'm not a huge fan of uploading the local file to a remote repo.

Is there any way to either:

  • pass the file content as an argument (string) to uv (and thus to the Jobs creation API)
  • ask the infra team to add a new feature to the Jobs creation API where you can pass a dict of file name to file contents, and they are exposed to the docker command? (not sure if it's feasible @christophe-rannou)

Member

@davanstrien davanstrien Jul 14, 2025

pass the file content as an argument (string) to uv (and thus to the Jobs creation API)

I don't think this is directly possible in uv at the moment.

ask the infra team to add a new feature to the Jobs creation API where you can pass a dict of file name to file contents, and they are exposed to the docker command? (not sure if it's feasible @christophe-rannou)

I think this would be nice if it were possible. @christophe-rannou, would this be difficult to implement?

Member

@davanstrien davanstrien Jul 14, 2025

Part of the logic of using a repo as a backend was to open up options to explore approaches where you could do something like:

huggingface-cli jobs uv run --from-repo davanstrien/nice-data-generation-pipeline

I think for that to fully make sense, it would probably also be better to have a "generic" or "code" repo type rather than using a dataset as the storage repo.

Co-authored-by: Julien Chaumond <julien@huggingface.co>
@Wauplin
Contributor

Wauplin commented Jul 15, 2025

As a high-level comment, it'd be good to have all the API logic added to HfApi (and therefore callable from a Python script) and the CLI logic (e.g. args, result formatting, etc.) kept in the current ./src/huggingface_hub/commands/jobs folder. @lhoestq Let us know if you have bandwidth to work on this or if you want some help.

@lhoestq
Member Author

lhoestq commented Jul 15, 2025

As a high-level comment, it'd be good to have all the API logic added to HfApi (and therefore callable from a Python script) and the CLI logic (e.g. args, result formatting, etc.) kept in the current ./src/huggingface_hub/commands/jobs folder. @lhoestq Let us know if you have bandwidth to work on this or if you want some help.

I can take care of this for tomorrow

Contributor

@hanouticelina hanouticelina left a comment

Very nice UX, I like it! 🔥 I left some initial comments.
As mentioned by @Wauplin, let's centralize the logic into HfApi:

HfApi.run_job(...)
HfApi.list_jobs(...)
HfApi.inspect_job(...)
HfApi.cancel_job(...)
HfApi.fetch_job_logs(...)

That way, the CLI subcommands become lightweight wrappers, and maybe we can put all the subparsers (run, ps, inspect, logs, cancel, uv) inside one src/huggingface_hub/commands/jobs.py file.
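
A minimal sketch of that split, assuming a hypothetical `FakeJobsApi` class standing in for `HfApi`. The method bodies, return shapes, and parser layout are illustrative only and are not the PR's implementation; the point is that each subcommand only parses arguments, calls one API method, and formats the result.

```python
import argparse


class FakeJobsApi:
    """Hypothetical stand-in for the API layer; NOT the real HfApi."""

    def run_job(self, image: str, command: list[str]) -> dict:
        return {"id": "job-1", "image": image, "command": command, "status": "RUNNING"}

    def list_jobs(self) -> list[dict]:
        return [{"id": "job-1", "status": "RUNNING"}]

    def cancel_job(self, job_id: str) -> dict:
        return {"id": job_id, "status": "CANCELED"}


def build_parser() -> argparse.ArgumentParser:
    # All subparsers live in one place, as suggested for a single jobs.py file.
    parser = argparse.ArgumentParser(prog="jobs")
    sub = parser.add_subparsers(dest="subcommand", required=True)

    run = sub.add_parser("run")
    run.add_argument("image")
    run.add_argument("command", nargs=argparse.REMAINDER)

    sub.add_parser("ps")

    cancel = sub.add_parser("cancel")
    cancel.add_argument("job_id")
    return parser


def main(argv: list[str], api: FakeJobsApi) -> str:
    # The CLI layer only parses args, delegates to the API, and formats output.
    args = build_parser().parse_args(argv)
    if args.subcommand == "run":
        job = api.run_job(args.image, args.command)
        return f"Job started: {job['id']}"
    if args.subcommand == "ps":
        return "\n".join(f"{j['id']}\t{j['status']}" for j in api.list_jobs())
    job = api.cancel_job(args.job_id)
    return f"{job['id']}: {job['status']}"
```

Because the API object is passed in, the same methods stay callable from a plain Python script without going through the CLI.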

@lhoestq
Member Author

lhoestq commented Jul 16, 2025

Just moved everything to HfApi :)

Btw @davanstrien I couldn't make the uv command run a local script because it uploads the script to a private repository that uv run can't access. Should we use public repos for now (with a warning, or maybe waiting for user input to accept this) until we have a better solution? Or we can ignore the local-script case and simply support script URLs for now.

@davanstrien
Member

davanstrien commented Jul 17, 2025

Just moved everything to HfApi :)

Btw @davanstrien I couldn't make the uv command run a local script because it uploads the script to a private repository that uv run can't access. Should we use public repos for now (with a warning, or maybe waiting for user input to accept this) until we have a better solution? Or we can ignore the local-script case and simply support script URLs for now.

Oh good point, I had removed the private repo logic for V1. I think just focusing on the URLs could make sense for now, and we could think a bit more about the UX for uploading and managing repos. Also means we can showcase some nice open examples for launch.

Happy to update the PR if you'd like

@julien-c
Member

@davanstrien @lhoestq the infra team is working on the "mount this file in the container" API feature btw

For the current prototype feature is there a way to tell uv to pass a bearer token when downloading the remote file? or maybe it's overkill?

I'm also ok with the big red warning that their file is going to be uploaded publicly.

@davanstrien
Member

davanstrien commented Jul 17, 2025

@julien-c

For the current prototype feature is there a way to tell uv to pass a bearer token when downloading the remote file? or maybe it's overkill?

It's not currently possible to do this directly in uv, but we hope they will add this as a general feature, allowing uv run URL to work out of the box with private HF URLs if an HF_TOKEN is available.

Perhaps for a V1, we go with making the repository public and adding a warning?

Contributor

@Wauplin Wauplin left a comment

Made a first pass on the implementation, thanks for having moved everything in HfApi :)

@lhoestq
Member Author

lhoestq commented Jul 17, 2025

I fixed huggingface-cli jobs uv run local_script.py :)
It uploads to a private repository, and the job first downloads the file in python with auth before running the uv run command.

before:

uv run https://huggingface.co/.../script.py

after:

bash -c "python -c <download_file> && uv run downloaded_script.py "

(the docker image doesn't have wget/curl so I use python)
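
A rough sketch of how such a download-then-run command could be assembled, assuming the token is exposed inside the job as an `HF_TOKEN` environment variable. The helper name `build_job_command` and the exact download snippet are hypothetical and not taken from the PR:

```python
import shlex


def build_job_command(script_url: str, local_name: str = "downloaded_script.py") -> list[str]:
    # Python one-liner run inside the container: fetch the script with a bearer
    # token read from HF_TOKEN (an assumption), then write it to disk.
    download_snippet = (
        "import os, urllib.request; "
        f"req = urllib.request.Request({script_url!r}, "
        "headers={'Authorization': 'Bearer ' + os.environ['HF_TOKEN']}); "
        f"open({local_name!r}, 'wb').write(urllib.request.urlopen(req).read())"
    )
    # Quote the snippet so it survives as a single argument to `python -c`.
    shell_line = f"python -c {shlex.quote(download_snippet)} && uv run {shlex.quote(local_name)}"
    return ["bash", "-c", shell_line]


# Placeholder URL for illustration only.
command = build_job_command("https://example.com/script.py")
```

Only the Python standard library is used for the download, which matches the constraint that the image has no wget or curl.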

I simplified the logic a bit to always upload to the same repository since it's a bit of a pain to remove the repos afterwards.

Taking a look at your comments now @Wauplin, thanks for the review!

Contributor

@hanouticelina hanouticelina left a comment

Thanks @lhoestq for the iteration! I left some minor comments.

@Wauplin
Contributor

Wauplin commented Jul 18, 2025

Note that we're working on switching from huggingface-cli ... to hf ... :) (long awaited feature). See #3229. Nothing to do in this PR, I will make sure to adapt in a future PR once both are merged. We will release the hf CLI and the Jobs API at the same time.

lhoestq and others added 3 commits July 21, 2025 18:20
Co-authored-by: célina <hanouticelina@gmail.com>
Co-authored-by: Lucain <lucain@huggingface.co>
@lhoestq
Member Author

lhoestq commented Jul 21, 2025

I took your comments into account, let me know if anything is missing :)

I am adding a dedicated docs page showcasing the HfApi features for jobs. But feel free to merge already if you'd like and I can do it in another PR

Contributor

@Wauplin Wauplin left a comment

I pushed 99b538a to remove JobUrl in favor of returning JobInfo in all cases (in which I added self.endpoint). This makes the API more consistent with other hfh methods.
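
To illustrate the idea (not the actual class), here is a minimal sketch in which a single info object carries the endpoint and derives its URL, so no separate URL type is needed. The class name `JobInfoSketch`, its fields, and the URL layout are all hypothetical:

```python
from dataclasses import dataclass


@dataclass
class JobInfoSketch:
    # Hypothetical fields; the real JobInfo may differ.
    id: str
    namespace: str
    status: str
    endpoint: str  # base URL of the Hub instance that served the request

    @property
    def url(self) -> str:
        # Assumed URL layout, for illustration only.
        return f"{self.endpoint}/jobs/{self.namespace}/{self.id}"


job = JobInfoSketch(id="abc123", namespace="some-user", status="RUNNING", endpoint="https://example.com")
```

Returning one type from every method means callers can read both the status and the URL from the same object.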

@lhoestq
Member Author

lhoestq commented Jul 22, 2025

I took your comments into account and added namespace in the HfApi methods and in the CLI :)

I also added the documentation page

6 participants