@ebsmothers (Contributor) commented Aug 29, 2024

TLDR: We have to choose between our ability to consistently provide stable, well-tested nightly packages and a clean install experience for all of our users. This PR reluctantly proposes to sacrifice slightly on the latter so that we can support the former in any capacity.

The goal

torchtune should be usable on either the latest stable release of PyTorch or the PyTorch nightlies. There are three main ways that we support people using torchtune:

  1. Our stable releases. A given stable release of torchtune should be runnable on a particular stable release of PyTorch, as well as on PyTorch nightlies up to the one current at the time we released our package. The same torchtune stable release should also keep working on more recent PyTorch versions, but because we use some APIs without BC guarantees, in practice this is definitely not guaranteed.
  2. Our nightly releases. These should also be runnable on the latest PyTorch stable release, but in practice the point of installing our nightlies is to use integrations with the latest PyTorch features, which may only be available on PyTorch nightlies. So here we should recommend that people also install the PyTorch nightlies.
  3. Installs from source. When running on main, it should be possible to install either the PyTorch stable release or a PyTorch nightly and have torchtune run out of the box at all times.

This sounds simple, so what's the problem?

The problem

We don't only depend on PyTorch. We also depend on torchvision (less of a concern for this particular discussion) and torchao.

Why is this a problem?

torchao also does nightly releases and stable releases. Also, unlike torchvision, torchao's stable releases are not in sync with PyTorch stable releases.

torchao's goals are similar to ours: their stable releases will work with a particular PyTorch stable release, and their nightly releases will work with the corresponding PyTorch nightlies. But outside of that, there is no guaranteed support for either:

a) torchao nightly release using a previous PyTorch stable release, or
b) torchao stable release using a more recent PyTorch nightly release.

It may not immediately seem like it, but (b) is actually pretty important here: for one, it's the reason our nightly builds have been broken for three weeks.

What can we do?

Option 1: --force-reinstall for nightly support

We could just say "most of our users will be happy on stable ao". That's a pretty reasonable statement and would lend itself to keeping ao in our pyproject.toml file. The problem is that this forces them to be on stable PyTorch as well (or risk breakages; see our currently-broken nightly builds). We could remind users who want to be on nightly PyTorch that they will also need to --force-reinstall ao after they install torchtune (since torchao stable would be installed as part of our package in this world), but imo that is extremely confusing and unintuitive.

Option 2: this PR

Remove torchao from our pyproject.toml file. This is in line with what we do with torch and torchvision already (which we do for the same reason) -- we will require users to first manually install those libraries either via the stable channel or the nightly channel.

The main drawback here is that while pip install torch torchvision is relatively idiomatic as a prereq (it's a subset of the command right on the main PyTorch install page), pip install torch torchvision torchao is not. A lot of people will likely forget to do this and things will break.

However, as of today this is the best way for us to support both nightly and stable PyTorch installs properly. The behavior in option 1 is unintuitive and has caused numerous breakages in our CI over the past few weeks that even people on our team did not know how to fix.

Option 3: ???

We have bandied about ideas of "dynamically populating index URLs in pyproject.toml" to delineate between stable and nightly installs. I am no packaging expert but I have not found a feasible way to do this yet. We could also consider moving off a pure-pyproject.toml approach for our packaging. But given the implications for uploading wheels discussed by @joecummings in #656 this also does not seem like a great option.

Like I said, I am no expert, so maybe there is a better way here. But in lieu of that, this PR proposes going with Option 2.

Implementation and testing

By going with Option 2, we need to raise errors if users do not have ao installed. Rather than check in a bunch of different places, I've opted to do this more globally: raising errors in torchtune/__init__.py and tests/__init__.py if the ao import is unsuccessful. Open to discussion on whether this is too blunt an approach, but the alternative is to add try/excepts around every import (and there are quite a lot).
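The global guard described above can be sketched roughly as follows. This is a minimal illustration, not the exact code in this PR; the helper name, its call site, and the error message are all hypothetical:

```python
import importlib.util


def require_package(name: str, hint: str) -> None:
    """Raise a helpful ImportError if `name` is not importable.

    A guard like this, invoked once from a package __init__.py,
    replaces scattering try/excepts around every individual import.
    (Illustrative sketch; not the actual torchtune implementation.)
    """
    if importlib.util.find_spec(name) is None:
        raise ImportError(f"{name} not installed. {hint}")


# Hypothetical usage in torchtune/__init__.py:
# require_package(
#     "torchao",
#     "Please install torchao before installing torchtune; "
#     "see the torchtune install instructions.",
# )
```

Using importlib.util.find_spec rather than a bare import keeps the check cheap and lets the error message point users at the install instructions instead of a raw ModuleNotFoundError.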

Tested by creating a fresh env with (a) PyTorch nightlies and (b) PyTorch stable without ao to confirm I get the expected ImportError. After installing ao, things work as expected.

@pytorch-bot (bot) commented Aug 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1452

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d4bd32d with merge base 71be8ad:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 29, 2024
@ebsmothers ebsmothers marked this pull request as ready for review September 2, 2024 04:26
@ebsmothers ebsmothers changed the title [wip] removing ao from pyproject.toml Removing ao from pyproject.toml Sep 2, 2024
@NicolasHug (Contributor) commented:
Just sharing my 2 cents after chatting a bit with Evan offline. From what I understand the UX of the different options are:

To install stable:

# Option 1
pip install torch torchvision torchtune
# Option 2
pip install torch torchvision torchao torchtune

To install nightly:

# Option 1
pip install torch torchvision torchtune --pre ...  # pulls in torchao stable, bad
pip install torchao --pre ... --force-reinstall

# Option 2
pip install torch torchvision torchao torchtune --pre ...

This seems like a clear win for Option 2 at first sight, but in practice I suspect installing the stable torchtune will look more like an isolated pip install torchtune (where torch and torchvision have already been installed previously), and such a scenario would break with Option 2 because torchao would be missing.

Overall I think Option 1 provides a better experience when installing the stable packages, and a slightly worse experience for installing the nightlies. So I guess the choice depends on which one of these experiences you want to prioritize?

@joecummings (Member) commented:
From @NicolasHug:

Overall I think Option 1 provides a better experience when installing the stable packages, and a slightly worse experience for installing the nightlies.

This is a great TL;DR for the whole issue. To add my thoughts, I think this is dynamic. ATM, we implicitly prefer nightly builds b/c we rarely do stable releases and incorporate a lot of new features from AO and Core. Plus, the "LLM community" moves fast and expects us to, as well.

However, I see a future in which we have more frequent releases where we would prefer the better UX for stable.

@ebsmothers (Contributor, Author) commented:
Also @msaroufim just FYI

@joecummings (Member) left a review comment:

One comment, but I won't block

torchao not installed.
If you are using stable PyTorch please install via pip install torchao.
If you are using nightly PyTorch please install via
pip install --pre torchao --index-url https://download.pytorch.org/whl/nightly/cu121
Follow-up comment (Member):

Maybe just point to the install page so we only have to maintain one source of truth for how to install torchao.

Follow-up comment (Member):
Sorry - even more annoying: can you link specifically to the section of the install docs about downloading torchao, torch, etc.?

You may need to add a tag there in RST.

@ebsmothers ebsmothers merged commit ca0a6de into meta-pytorch:main Sep 3, 2024
20 checks passed
@ebsmothers ebsmothers deleted the drop-ao branch September 3, 2024 18:45