Add named path bases to cargo (v2) #3529
Merged
4bd6953
Add named path bases to cargo (v2)
dpaoliello 5ccbe52
Update RFC number
dpaoliello e9f7566
Rewrote motiviation
dpaoliello 4af121b
Address PR feedback
dpaoliello 247ff65
Address PR feedback
dpaoliello b18416d
Apply suggestions from code review
dpaoliello 707870d
Remove support for manifest, rename to path-base
dpaoliello 50a3735
Merge branch 'basepath' of https://github.com/dpaoliello/rfcs into ba…
dpaoliello c04201c
Address PR feedback
dpaoliello 87672d5
Add link to tracking issue
dpaoliello
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,392 @@ | ||
- Feature Name: `path_bases` | ||
- Start Date: 2023-11-13 | ||
- RFC PR: [rust-lang/rfcs#3529](https://github.com/rust-lang/rfcs/pull/3529) | ||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Introduce shared base directories in Cargo configuration files that in | ||
turn enable base-relative `path` dependencies. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
As a project grows in size, it becomes necessary to split it into smaller | ||
sub-projects, architected into layers with well-defined boundaries. | ||
|
||
One way to enforce these boundaries is to use different Git repos (aka | ||
"multi-repo"). Cargo has good support for multi-repo projects using either `git` | ||
dependencies, or developers can use private registries if they want to | ||
explicitly publish code or need to preprocess their sub-projects (e.g., | ||
generating code) before they can be consumed. | ||
|
||
If all of the code is kept in a single Git repo (aka "mono-repo"), then these | ||
boundaries must be enforced a different way: either leveraging tooling during | ||
the build to check layering, or requiring that sub-projects explicitly publish | ||
and consume from some intermediate directory. Cargo has poor support for | ||
mono-repos: the only viable mechanism is `path` dependencies, but these require | ||
relative paths (which makes refactoring and moving sub-projects very difficult) | ||
and don't work at all if the mono-repo requires publishing and consuming from an | ||
intermediate directory (as this may vary per host, or per target being built). | |
|
||
This RFC proposes a mechanism to specify `base` directories in `config.toml` or | |
`Cargo.toml` files which can be prepended to the paths of `path` dependencies. This allows | |
mono-repos to specify dependencies relative to their root directory, which | ||
allows the consuming project to be moved freely (no relative paths to update) | ||
and a simple find-and-replace to handle a producing project being moved. | ||
Additionally, a host-specific or target-specific intermediate directory may be | ||
specified as a `base`, allowing code to be consumed from there using `path` | ||
dependencies. | ||
|
||
### Example | ||
|
||
If we had a sub-project that depends on three others: | ||
|
||
* `foo` which is in a different layer of the mono-repo. | ||
* `bar_with_generated` that must be consumed from an intermediate directory | ||
because it contains target-specific generated code. | ||
* `baz` which is in the current layer. | ||
|
||
We may have a `Cargo.toml` snippet that looks like this: | ||
|
||
```toml | ||
[dependencies] | ||
foo = { path = "../../../other_layer/foo" } | ||
bar_with_generated = { path = "../../../../intermediates/x86_64/Debug/third_layer/bar_with_generated" } | ||
baz = { path = "../baz" } | ||
``` | ||
|
||
This has many issues: | ||
|
||
* Moving the current sub-project may require changing all of these relative | ||
paths. | ||
* `bar_with_generated` will only work if we're building x86_64 Debug. | ||
* `bar_with_generated` assumes that the `intermediates` directory is a sibling | ||
to our source directory, and not somewhere else completely (e.g., a different | ||
drive for performance reasons). | ||
* Moving `foo` or `baz` requires searching the code for each possible relative | |
path (e.g., `../../../other_layer/foo` and `../foo`) and may be error-prone if | |
there is some other sub-project in a directory with the same name. | |
|
||
Instead, if we could specify these `base` directories in a `config.toml` (which | |
may be generated by an external build system which in turn invokes Cargo): | |
|
||
```toml | ||
[base-paths] | ||
sources = "/home/user/dev/src" | ||
intermediates = "/home/user/dev/intermediates/x86_64/Debug" | ||
``` | ||
|
||
Then the `Cargo.toml` can use those `base` directories and avoid relative paths: | ||
|
||
```toml | ||
[dependencies] | ||
foo = { path = "other_layer/foo", base = "sources" } | ||
bar_with_generated = { path = "third_layer/bar_with_generated", base = "intermediates" } | ||
baz = { path = "this_layer/baz", base = "sources" } | ||
``` | ||
|
||
Which resolves the issues we previously had: | ||
|
||
* The current project can be moved without modifying the `Cargo.toml` at all. | ||
* `bar_with_generated` works for all targets (assuming the `config.toml` is | |
generated). | |
* The `intermediates` directory can be placed anywhere. | ||
* Moving `foo` or `baz` only requires searching for the canonical form relative | ||
to the `base` directory. | ||
|
||
## Other uses | ||
|
||
The ability to use `base` directories for `path` dependencies is convenient for | ||
developers who are using a large number of `path` dependencies within the same | ||
root directory. Instead of repeating the same path fragment many times in their | ||
`Cargo.toml`, they can instead specify it once in a `config.toml` as a `base` | |
directory, then use that `base` directory in each of their `path` dependencies. | |
|
||
Cargo can also provide built-in base paths, for example `workspace` to point to | ||
the root directory of the workspace. This allows workspace members to reference | ||
each other without first needing to `../` their way back to the workspace root. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
If you often use path dependencies that live in a particular location, | ||
or if you want to avoid putting long paths in your `Cargo.toml`, you can | ||
define path _base directories_ in your Cargo [manifest](https://doc.rust-lang.org/cargo/reference/manifest.html) | ||
or [configuration](https://doc.rust-lang.org/cargo/reference/config.html). | ||
Your path dependencies can then be specified relative to those | ||
directories. | ||
|
||
For example, say you have a number of projects checked out in | ||
`/home/user/dev/rust/libraries/`. Rather than use that path in your | ||
`Cargo.toml` files, you can define it as a "base" path in | ||
`~/.cargo/config.toml`: | ||
|
||
```toml | ||
[base-paths] | ||
dev = "/home/user/dev/rust/libraries/" | ||
``` | ||
|
||
Now, you can specify a path dependency on a library `foo` in that | ||
directory in your `Cargo.toml` using | ||
|
||
```toml | ||
[dependencies] | ||
foo = { path = "foo", base = "dev" } | ||
``` | ||
|
||
As with other path dependencies, keep in mind that both the base _and_ | |
the path must exist on any other host where you want to use the same | ||
`Cargo.toml` to build your project. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
## Specifying Dependencies | ||
|
||
### Base Paths | ||
|
||
A `path` dependency may optionally specify a base path by setting the `base` key | ||
to the name of a base path from the `[base-paths]` table in either the | ||
[manifest](https://doc.rust-lang.org/cargo/reference/manifest.html) or | ||
[configuration](https://doc.rust-lang.org/cargo/reference/config.html#base-paths) | ||
or one of the [built-in base paths](#built-in-base-paths). The value of that | ||
base path is prepended to the `path` value to produce the actual location where | ||
Cargo will look for the dependency. | ||
|
||
For example, if the `Cargo.toml` contains: | |
|
||
```toml | ||
[dependencies] | ||
foo = { path = "foo", base = "dev" } | ||
``` | ||
|
||
Given a `[base-paths]` table in the configuration that contains: | ||
|
||
```toml | ||
[base-paths] | ||
dev = "/home/user/dev/rust/libraries/" | ||
``` | ||
|
||
This will produce a `path` dependency `foo` located at | ||
`/home/user/dev/rust/libraries/foo`. | ||
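
The prepending step in this example can be sketched in Rust as a plain path join. This is a minimal illustration, not Cargo's implementation; `resolve_with_base` is a hypothetical helper name, and the RFC text does not say how an absolute `path` value combined with a `base` should be treated.

```rust
use std::path::{Path, PathBuf};

/// Joins a base directory with a dependency's `path` value.
/// `resolve_with_base` is a hypothetical helper, not part of Cargo.
fn resolve_with_base(base_dir: &Path, dep_path: &str) -> PathBuf {
    // `Path::join` adds a separator only when one is needed, so a trailing
    // slash on the base does not produce a doubled separator.
    // Assumption: `dep_path` is relative. For an absolute `dep_path`,
    // `join` would return `dep_path` and ignore the base entirely.
    base_dir.join(dep_path)
}
```

Calling `resolve_with_base(Path::new("/home/user/dev/rust/libraries/"), "foo")` gives the location from the example above.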
|
||
If the base path is not found in any `[base-paths]` table or one of the built-in | ||
base paths then Cargo will generate an error. | ||
|
||
If the name of a base path is specified in both the manifest and configuration, | ||
then the value in the manifest is preferred. | ||
|
||
The name of a base path must use only [alphanumeric](https://doc.rust-lang.org/std/primitive.char.html#method.is_alphanumeric) | ||
characters or `-` or `_`, and cannot be empty. | ||
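
The naming rule can be expressed as a short predicate. The following is a sketch under the rule as stated here; `is_valid_base_name` is a hypothetical name and not a Cargo API.

```rust
/// Returns true if `name` satisfies the rule above: it is non-empty and
/// every character is alphanumeric (per `char::is_alphanumeric`), `-`, or `_`.
/// `is_valid_base_name` is a hypothetical helper, not a Cargo API.
fn is_valid_base_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_alphanumeric() || c == '-' || c == '_')
}
```

Because `char::is_alphanumeric` is Unicode-aware, this check accepts non-ASCII letters and digits as well as ASCII ones.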
|
||
#### Built-in base paths | ||
|
||
Cargo provides implicit base paths that can be used without the need to specify | ||
them in a `[base-paths]` table. | ||
|
||
* `workspace` - If a project is [a workspace or workspace member](https://doc.rust-lang.org/cargo/reference/workspaces.html) | |
then this base path is defined as the path to the directory containing the root | |
`Cargo.toml` of the workspace. | |
|
||
If one of these built-in base paths is also specified in the manifest or | ||
configuration, then that value is preferred over the built-in value. | ||
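
Taken together, the lookup rules in this section (manifest first, then configuration, then built-in, otherwise an error) might be sketched as follows. The function name, the map-based inputs, and the error text are assumptions made for illustration; this is not Cargo's implementation.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Looks up a base path by name, preferring the manifest, then the
/// configuration, then the built-in `workspace` base.
/// All names here are hypothetical; this is only a sketch of the rules.
fn lookup_base(
    name: &str,
    manifest: &HashMap<String, PathBuf>,
    config: &HashMap<String, PathBuf>,
    workspace_root: Option<&Path>,
) -> Result<PathBuf, String> {
    // A value in the manifest wins over one in the configuration.
    if let Some(p) = manifest.get(name) {
        return Ok(p.clone());
    }
    if let Some(p) = config.get(name) {
        return Ok(p.clone());
    }
    // Built-in bases are consulted last, so user-defined values override them.
    if name == "workspace" {
        if let Some(root) = workspace_root {
            return Ok(root.to_path_buf());
        }
    }
    // An unknown base name is an error rather than a silent fallback.
    Err(format!("base path `{}` is not defined", name))
}
```

With this ordering, a `workspace` entry in either table shadows the built-in value, and a name missing from all three sources produces an error.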
|
||
## The Manifest Format | ||
|
||
[`[base-paths]`](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#base-paths) - Base paths for path dependencies. | ||
|
||
## Configuration | ||
|
||
`[base-paths]` | ||
|
||
* Type: string | ||
* Default: see below | ||
* Environment: `CARGO_BASE_PATHS_<name>` | ||
|
||
The `[base-paths]` table defines a set of path prefixes that can be prepended | |
to the locations of `path` dependencies. See the [specifying dependencies](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#base-paths) | |
documentation for more information. | |
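
As a rough illustration of the `CARGO_BASE_PATHS_<name>` form, the sketch below derives the variable name for a given base. It assumes Cargo's general convention for configuration keys (uppercase, with `-` and `.` replaced by `_`) applies unchanged to this table; `env_var_for_base` is a hypothetical helper.

```rust
/// Builds the environment variable name for a `[base-paths]` entry.
/// Assumes Cargo's usual config-key convention: uppercase the key and
/// replace `-` and `.` with `_`. `env_var_for_base` is a hypothetical helper.
fn env_var_for_base(name: &str) -> String {
    let key = format!("base-paths.{}", name);
    let suffix: String = key
        .chars()
        .map(|c| match c {
            '-' | '.' => '_',
            other => other.to_ascii_uppercase(),
        })
        .collect();
    format!("CARGO_{}", suffix)
}
```

Under that assumption, the `dev` base from the earlier examples corresponds to `CARGO_BASE_PATHS_DEV`. Note that names differing only in `-` versus `_` would map to the same variable.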
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
1. There is now an additional way to specify a dependency in | ||
`Cargo.toml` that may not be accessible when others try to build the | ||
same project. Specifically, it may now be that the other host has a | ||
`path` dependency available at the same relative path to `Cargo.toml` | ||
as the author of the `Cargo.toml` entry, but does not have the `base` | ||
defined (or has it defined as some other value). | ||
|
||
At the same time, this might make path dependencies _more_ re-usable | ||
across hosts, since developers can dictate only which _bases_ need to | ||
exist, rather than which _paths_ need to exist. This would allow | ||
different developers to host their path dependencies in different | ||
locations from the original author. | ||
2. Developers still need to know the path _within_ each path base. We | ||
could instead define path "aliases", though at that point the whole | ||
thing looks more like a special kind of "local path registry". | ||
3. This introduces yet another mechanism for grouping local | ||
dependencies. We already have [local registries, directory | ||
registries](https://doc.rust-lang.org/cargo/reference/source-replacement.html), | ||
and the [`[paths]` | ||
override](https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html#paths-overrides). | ||
However, those are all intended for immutable local copies of | ||
dependencies where versioning is enforced, rather than as mutable | ||
path dependencies. | ||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
This design was primarily chosen for its simplicity — it adds very | ||
little to what we have today both in terms of API surface and mechanism. | ||
But, other approaches exist. | ||
|
||
Developers could have their `path` dependencies point to symlinks in the | ||
current directory, which other developers would then be told to set up | ||
to point to the appropriate place on their system. This approach has two | ||
main drawbacks: they are harder to use on Windows as they [require | ||
special privileges](https://docs.microsoft.com/en-us/windows/security/threat-protection/security-policy-settings/create-symbolic-links), | ||
and they pollute the user's project directory. | ||
|
||
For the build-system case, the build system could place vendored | ||
dependencies directly into the source directory at well-known locations, | ||
though this would mean that if the source of those dependencies were to | ||
change, the user would have to re-run the build system (rather than just | ||
run `cargo`) to refresh the vendored dependency. And this approach too | ||
would end up polluting the user's source directory. | ||
|
||
An earlier iteration of the design avoided adding a new field to | ||
dependencies, and instead inlined the base name into the path using | ||
`path = "base::relative/path"`. This has the advantage of not | ||
introducing another special keyword in `Cargo.toml`, but comes at the | ||
cost of making `::` illegal in paths, which was deemed too great. | ||
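
To make that cost concrete, here is a minimal sketch of how such an inline form might have been parsed. `split_inline_base` is a hypothetical name, and the earlier design's actual parsing rules are not given in this text.

```rust
/// Splits an inline `base::relative/path` value into `(base, path)`.
/// Returns `None` when the value contains no `::` separator.
/// `split_inline_base` is a hypothetical helper, not a Cargo API.
fn split_inline_base(value: &str) -> Option<(&str, &str)> {
    // Splits at the first `::`, so any later `::` stays in the path part.
    value.split_once("::")
}
```

A directory that is literally named `odd::dir` would be read as base `odd` with path `dir`, which is why `::` would have to be forbidden in ordinary paths.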
|
||
Alternatively, we could add support for interpolating environment | |
variables (or arbitrary configuration values?) in `Cargo.toml` values. | |
That way, the path could be given as `path = | |
"${base.name}/relative/path"`. While that works, it's not trivially | |
backwards compatible, may be confusing when users try to interpolate | |
random other configuration variables in their paths, and _seems_ like a | |
possible Pandora's box of corner-cases. | |
|
||
The [`[paths]` | ||
feature](https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html#paths-overrides) | ||
could be updated to lift its current limitations around adding | ||
dependencies and requiring that the dependencies be available on | ||
crates.io. This would allow users to avoid `path` dependencies in more | ||
cases, but makes the replacement more implicit than explicit. That | ||
change is also more likely to break existing users, and to involve | ||
significant refactoring of the existing mechanism. | ||
|
||
We could add another type of local registry that is explicitly declared | ||
in `Cargo.toml`, and from which local dependencies could then be drawn. | ||
Something like: | ||
|
||
```toml | ||
[registry.local] | ||
path = "/path/to/path/registry" | ||
``` | ||
|
||
This would make specifying the dependencies somewhat nicer (`version = | ||
"1", registry = "local"`), and would ensure a standard layout for the | ||
locations of the local dependencies. However, using local dependencies | ||
in this manner would require more set-up to arrange for the right | ||
registry layout, and we would be introducing what is effectively a | ||
mutable registry, which Cargo has avoided thus far. | ||
|
||
Even with such an approach, there are benefits to being able to not put | ||
complex paths into `Cargo.toml` as they may differ on other build hosts. | ||
So, a mechanism for indirecting through a path name may still be | ||
desirable. | ||
|
||
Ultimately, by not having a mechanism to name paths that lives outside | |
of `Cargo.toml`, we are forcing developers to coordinate their file | |
system layouts without giving them a mechanism for doing so, or to work | |
around the lack of a mechanism by adding symlinks | |
in strategic locations, cluttering their directories. The proposed | |
mechanism is simple to understand and to use, and still covers a wide | |
variety of use-cases. | |
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
Python searches for dependencies by walking `sys.path` in definition | ||
order, which [is pulled | ||
from](https://docs.python.org/3/tutorial/modules.html#the-module-search-path) | ||
the current directory, `PYTHONPATH`, and a list of system-wide library | ||
directories. All imports are thus "relative" to every directory in | ||
`sys.path`. This makes it easy to inject local development dependencies | ||
simply by injecting a path early in `sys.path`. The path dependency is | ||
never made explicit anywhere in Python. We _could_ adopt a similar | ||
approach by declaring an environment variable `CARGO_PATHS`, where every | ||
`path` is considered relative to each path in `CARGO_PATHS` until a path | ||
that exists is found. However, this introduces additional possibilities | ||
for user confusion if, say, `foo` exists in multiple paths in | ||
`CARGO_PATHS` and the first one is picked (though maybe that could be a | ||
warning?). | ||
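
The `CARGO_PATHS` idea and its ambiguity can be sketched as a search over candidate directories. `CARGO_PATHS` is only the hypothetical variable floated above, `find_candidates` is an invented name, and the existence check is passed in as a closure so the sketch does not depend on a real file system.

```rust
use std::path::{Path, PathBuf};

/// Returns every location where `dep_path` exists under `search_paths`,
/// in search order. The first entry is the one a `sys.path`-style lookup
/// would pick; more than one entry means the dependency is ambiguous.
/// `find_candidates` is a hypothetical helper, not a Cargo API.
fn find_candidates<F>(search_paths: &[PathBuf], dep_path: &str, exists: F) -> Vec<PathBuf>
where
    F: Fn(&Path) -> bool,
{
    search_paths
        .iter()
        .map(|base| base.join(dep_path))
        .filter(|candidate| exists(candidate.as_path()))
        .collect()
}
```

Passing `|p| p.exists()` as the closure would check the real file system, and a result with more than one entry is the case where a warning might be issued.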
|
||
NodeJS (with npm) is very similar to Python, except that dependencies | ||
can also be | ||
[specified](https://nodejs.org/api/modules.html#modules_all_together) | ||
using relative paths like Cargo's `path` dependencies. For non-path | ||
dependencies, it searches in [`node_modules/` in every parent | ||
directory](https://nodejs.org/api/modules.html#modules_loading_from_node_modules_folders), | ||
as well as in the [`NODE_PATH` search | ||
path](https://nodejs.org/api/modules.html#modules_loading_from_the_global_folders). | ||
There does not exist a standard mechanism to specify a path dependency | ||
relative to a path named elsewhere. With CommonJS modules, JavaScript | |
developers are able to interpolate variables directly into their | |
`require` arguments, and can thus implement custom schemes for getting | |
customizable paths. | |
|
||
Ruby's `Gemfile` [path | ||
dependencies](https://bundler.io/man/gemfile.5.html#PATH) are only ever | ||
absolute paths or paths relative to the `Gemfile`'s location, and so are | ||
similar to Rust's current `path` dependencies. | ||
|
||
The same is the case for Go's `go.mod` [replacement | ||
dependencies](https://golang.org/doc/modules/managing-dependencies#tmp_10), | ||
which only allow absolute or relative paths. | ||
|
||
From this, it's clear that other major languages do not have a feature | ||
quite like this. This is likely because path dependencies are assumed | ||
to be short-lived and local, and thus having them be host-specific is | ||
often good enough. However, as the motivation section of this RFC | ||
outlines, there are still use-cases where a simple name-indirection | ||
could help. | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
- What should the Cargo configuration table and dependency key be called? This | |
RFC calls the configuration table `base-paths` to be explicit that it is | |
dealing with paths (as `base` would be ambiguous) but calls the key `base` to | |
keep it concise. | |
- Is there other reasonable behavior we could fall back to if a `base` | ||
is specified for a dependency, but no base by that name exists in the | ||
current Cargo configuration? This RFC suggests that this should be an | ||
error, but perhaps there is a reasonable thing to try _first_ prior to | ||
yielding an error. | ||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
It seems reasonable to extend `base` to `git` dependencies, with | ||
something like: | ||
|
||
```toml | ||
[base-paths] | |
gh = "https://github.com/jonhoo" | |
``` | ||
|
||
```toml | ||
[dependencies] | |
foo = { git = "foo.git", base = "gh" } | ||
``` | ||
|
||
However, this may get complicated if someone specifies `git`, `path`, | ||
_and_ `base`. | ||
|
||
It may also be useful to be able to use `base` for `patch` and `path`. |