Proposal: Extension-like Subpackages #58051

Draft
wants to merge 5 commits into
base: master
from

Conversation

MasonProtter
Contributor

@MasonProtter MasonProtter commented Apr 8, 2025

This is a WIP proposal for trying to solve #55516. @KristofferC mentioned another design here that is potentially simpler, but I was curious about exploring this idea and seeing where it goes.

The high-level idea here is that [subpackages] are very similar to [extensions] (and hence most of the code is shamelessly stolen and adapted from Kristoffer's extensions PR). They are sub-packages that live inside of a parent package, and are not loaded by default. They differ from extensions in that:

  1. Instead of being implicitly loaded when their dependencies are loaded, they can be explicitly using-ed.
  2. Instead of providing 'glue code' and typically performing some sort of piracy to stitch two packages together, the point of a subpackage is to expose extra features from a package that not all users of a package would need, and wouldn't want to pay the loading cost for.
  3. (not yet implemented) Downstream users of a package that has subpackages should be able to explicitly depend on a subpackage from that package, and when they do so, any weakdeps that the subpackage depends on should be converted into normal deps in the downstream package's Manifest.toml.

A Project.toml with [subpackages] might look something like

name = "SomePackage"
uuid = "0237f425-1828-4bb9-8eeb-a4f4cdb55107"
version = "0.1.0"

[weakdeps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"

[subpackages]
Subpackage1 = ["SomePackage", "Example"]
Subpackage2 = "SomePackage"
Subpackage3 = ""

This says that SomePackage.Subpackage1 depends on SomePackage.jl and Example.jl, and both must be loadable in order for SomePackage.Subpackage1 to be loaded. Subpackage2 depends only on SomePackage.jl, whereas Subpackage3 has no dependencies and can always be loaded.
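As a rough illustration of how the three entry shapes above could be read, here is a minimal sketch using the TOML standard library. The helper normalize_deps and the subpackage_deps table are hypothetical names made up for this example; this is not the code in the PR's loading.jl changes.

```julia
using TOML

# The Project.toml from above, parsed from a string for illustration.
project = TOML.parse("""
name = "SomePackage"
uuid = "0237f425-1828-4bb9-8eeb-a4f4cdb55107"
version = "0.1.0"

[weakdeps]
Example = "7876af07-990d-54b4-ab0e-23690620f79a"

[subpackages]
Subpackage1 = ["SomePackage", "Example"]
Subpackage2 = "SomePackage"
Subpackage3 = ""
""")

# Hypothetical helper: turn an entry into a vector of dependency names.
# An empty string means "no dependencies", a plain string means one dependency.
normalize_deps(entry::AbstractString) = isempty(entry) ? String[] : String[entry]
normalize_deps(entry::AbstractVector) = String[String(e) for e in entry]

# Map each subpackage name to the list of packages it needs.
subpackage_deps = Dict(name => normalize_deps(entry)
                       for (name, entry) in project["subpackages"])
```

With this normalization, Subpackage3 maps to an empty list, which matches the statement that it can always be loaded.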

There should then exist files at SomePackage/sub/Subpackage1.jl, SomePackage/sub/Subpackage2.jl, SomePackage/sub/Subpackage3.jl which define modules with those names, and they're allowed to depend on anything from SomePackage's deps, and whatever weakdeps they declare. They are allowed to export names they define.

The Manifest.toml would look something like

[[deps.SomePackage]]
deps = []
path = "."
uuid = "0237f425-1828-4bb9-8eeb-a4f4cdb55107"
version = "0.1.0"

    [deps.SomePackage.subpackages]
    Subpackage1 = ["SomePackage", "Example"]
    Subpackage2 = "SomePackage"
    Subpackage3 = ""

    [deps.SomePackage.weakdeps]
    Example = "7876af07-990d-54b4-ab0e-23690620f79a"

I added a stopgap macro @subpackage_using SomePackage.Subpackage2 that allows Subpackage2 to be loaded, but this should be replaced with dedicated syntax.

See the tests I added for an example of this in action.


Edits:

  • Renamed subpackages to submodules
  • Made it so subpackages don't need to depend on the parent module

@MasonProtter MasonProtter marked this pull request as draft April 8, 2025 21:21
@giordano
Member

giordano commented Apr 8, 2025

Instead of being implicitly loaded when their dependencies are loaded, they can be explicitly using-ed.

Does loading a submodule automatically load the main module? I think that's hinted at by

Whereas Submodule2 has no dependencies (other than an implicit dependency on SomePackage), and can always be loaded.

but I'm asking for clarification. My understanding is that submodules in Python don't load the main package and I quite like it (if my interpretation of the behavior is correct).

@MasonProtter
Contributor Author

MasonProtter commented Apr 8, 2025

So yes, my thinking was that it should automatically load the main module.

But now that you mention it, there's no real reason it has to be that way. If desired, these probably could be independently loadable. The submodule could just choose whether or not it loads the main module by using it or not using it.

@nsajko
Contributor

nsajko commented Apr 9, 2025

I really dislike the naming here. We already have submodules: any module within another module might be called a submodule in relation to its parent or some other ancestor. Introducing something else called "submodule" would be confusing.

@MasonProtter
Contributor Author

MasonProtter commented Apr 9, 2025

Sure, they can be called whatever. I'm not married to any particular name. Suggestions welcome.

The reason I went with this name is that I wanted them to behave as closely as possible to the behaviour of using regular modules inside of a package, except that they're optionally loaded rather than eagerly loaded. But I agree it's not a perfect name.

@KristofferC
Member

KristofferC commented Apr 9, 2025

I wrote some stuff about this feature a long time ago in https://hackmd.io/jpsiIxdsSRe4ANi55_F7oA (where I called it subpackages). I'll just copy and paste the pertinent questions I had in there, with some comments after in parentheses, so they can be discussed here:

  • Should each subpackage have a unique Project.toml file? (According to this PR, no, it is all in the main Project file)
  • Should it contain the exact structure of a normal package without a version field? (According to this, no, it's just a file like an extension?)
  • Can you start Julia with a subpackage as the active environment? (I think according to this PR no, but it would be pretty useful)
  • Can a subpackage depend on another subpackage (whether from the same main package or a different package)? (TODO?)
  • Can different subpackages of the same main package have different compatibility constraints for shared dependencies? (I think this is a no)
  • Can subpackages have their own Artifacts.toml file? (Seems important if e.g. some DiffEq solver depends on a binary artifact)
  • Does using MainPackage::SubPackage require that MainPackage is already loaded, or can SubPackage function independently? (Should probably be independent?)
  • Relatedly, does SubPackage always depend on MainPackage? (No reason for this, or?)
  • How should CI typically be set up for a package containing subpackages? You probably want an environment in which you can load the subpackages.
  • How do we inform the resolver that it can only select the same version of the main package and subpackages when they are input as individual packages?

@MasonProtter
Contributor Author

MasonProtter commented Apr 9, 2025

Ah thanks for sharing that hackmd, I must have missed it in the previous conversations, though I saw people discussing the questions you posed.

So I think most of these questions basically boil down to "how separable should a subpackage be from the parent package?" I think if one answers "yes" to most of these questions, then the mechanism I'm looking at in this PR should not be used, and we should instead base this off [workspaces], and just have them be whole packages living inside of another package.

Should each subpackage have a unique Project.toml file? (According to this PR, no, it is all in the main Project file)
Should it contain the exact structure of a normal package without a version field? (According to this, no, it's just a file like an extension?)

Yeah, I think if we want those things, then maybe this feature should be built in a different way.

Can you start Julia with a subpackage as the active environment? (I think according to this PR no, but it would be pretty useful)

Hm, interesting I hadn't really thought of needing to do that. I think we could do this with the design I have here, but it'd probably be more awkward than it having its own Project.toml.

Can a subpackage depend on another subpackage (whether from the same main package or a different package)? (TODO?)

Yes, this should absolutely be supported.

Can different subpackages of the same main package have different compatibility constraints for shared dependencies? (I think this is a no)

Not in this design.

Can subpackages have their own Artifacts.toml file? (Seems important if e.g. some DiffEq solver depends on a binary artifact)

Hm, I hadn't thought of this. I'd ideally like to share an Artifacts.toml and then maybe just have the subpackage declare which artifacts it uses? I guess if we needed to avoid installing all artifacts we'd have to have a concept of an artifact-weak-dependency...

Does using MainPackage::SubPackage require that MainPackage is already loaded, or can SubPackage function independently? (Should probably be independent?)
Relatedly, does SubPackage always depend on MainPackage? (No reason for this, or?)

Yeah, Mose brought this up above. I think this can/should be done, I just didn't think to support it, but it shouldn't be too hard.

How should CI typically be set up for a package containing subpackages? You probably want an environment in which you can load the subpackages.

I'd say you should just add the subpackage deps as test deps, and then you should be able to just load the subpackages as needed.

How do we inform the resolver that it can only select the same version of the main package and subpackages when they are input as individual packages?

In this proposed design, you don't have to, as they are not input as individual packages.


My motivation for doing the PR this way, where I'm answering 'no' to a lot of the above questions, was:

  1. I want to think of these things like a module inside of the main package that's lazily loaded, so I want it to share as much of the organizational cruft with the parent package as possible, and I'd ideally like to avoid having a bunch of separate config files for each subpackage.
  2. I was curious to learn about how loading.jl works and this is one way to learn a bit about it.

If people feel we really do need the flexibility / clarity that comes with each subpackage having its own Project.toml, and its own entire directory structure and all that stuff, then this PR probably isn't the right direction, but I wanted to at least try out a 'lighter', more unified structure, and see if it can be made to work.

@MasonProtter
Contributor Author

I've switched the PR to call them subpackages instead of submodules, and I've made it so that the parent package does not need to be loaded for the subpackage to be loaded.

I've also edited the original post to reflect that.

@MasonProtter MasonProtter changed the title Proposal: Submodules Proposal: Subpackages Apr 9, 2025
@MasonProtter MasonProtter changed the title Proposal: Subpackages Proposal: Extension-like Subpackages Apr 9, 2025
@vchuravy
Member

vchuravy commented Apr 9, 2025

One thing that I have been missing with package extensions is the notion of them being able to have additional dependencies (I kinda understand why that is the case, but I am still sad about it)

There would be a nice symmetry if a package extension is a subpackage that is automatically loaded when a trigger or trigger package is loaded. Maybe that would also solve the issue with folks wanting to export names from package extensions, since you could use "using Oceananigans::Plotting" to load the plot functions, but Oceananigans::Plotting could also be auto-loaded by using Makie and add data conversion extensions.

@MasonProtter
Contributor Author

MasonProtter commented Apr 9, 2025

I guess one thing that could be done there is just put all the plotting stuff in the Oceananigans::Plotting subpackage, and then add a stub extension that just does

module OceananigansPlottingExt

using Oceananigans::Plotting

end # module

The Project.toml would look something like

name = "Oceananigans"
uuid = ".."

[weakdeps]
Makie = "..."

[extensions]
OceananigansPlottingExt = "Makie"

[subpackages]
Plotting = ["Oceananigans", "Makie"]

@vchuravy
Member

vchuravy commented Apr 9, 2025

Yes, but you would somehow need to add a dependency arrow from the extension to the subpackage. And that is currently not possible.

@MasonProtter
Contributor Author

Ah I see. Yeah, we'd need to generalize the trigger mechanism a bit for that.

@MasonProtter
Contributor Author

MasonProtter commented Apr 9, 2025

So if we might want more complex trigger mechanisms for extensions anyways, then this actually brings me back to wondering if the right solution here is to just make a way to list an extension as a dependency, and make extensions using-able rather than create a separate subpackages concept.


Alternatively, you could instead just put the glue methods in the OceananigansPlottingExt extension, and then the plotting API exports in the Oceananigans::Plotting subpackage, and then using Oceananigans::Plotting would automatically trigger the extension.

@Roger-luo
Contributor

Roger-luo commented Apr 9, 2025

FYI, I believe at some point the discussion regarding local module loading involved an idea similar to these subpackages, except that it was combined with a syntax convention as well. See also #49155.

I personally feel that relying heavily on config files is overkill when the information can be brought into the source code via syntax. It is already quite hard for users who are not familiar with all these mechanisms to figure out where a module is coming from. We already suffer from the using-pollution problem in codebases, and I believe putting more into a config file would make this worse.

@MasonProtter
Contributor Author

MasonProtter commented Apr 9, 2025

That discussion seems to be about something else. The core issue here is optional loading. i.e. the module Foo should only be loaded if requested, and it should be separately precompiled in its own pkgimage in parallel.

We need these subpackages to have their own sets of dependencies that can be wider than the dependencies of the parent package, so that's why the config files need to be a part of it.


Put another way: the goal here isn't to replace include, the goal is for e.g. DifferentialEquations.jl to become just one package, rather than hundreds of packages.

This might end up being why this PR's direction gets rejected. It tries its best to minimize the amount of mucking about in config files and extra nested directories, but we might end up needing even more of them, and have each subpackage be a full-on environment with its own Project.toml.

@Roger-luo
Contributor

Roger-luo commented Apr 9, 2025

Put another way: the goal here isn't to replace include, the goal is for e.g. DifferentialEquations.jl to become just one package, rather than hundreds of packages.

This is exactly why I proposed it. I am not proposing to replace include. Lacking automated lazy and local loading has been a nightmare in Bloqade, and it is pretty much the whole reason why we have to abandon the package and adopt Python in the pipeline. In interpreted languages this is "just" local module loading: the interpreter loads a module when it sees a request for it, which makes the loading lazy as a result. The reason to have it instead of include is that you can indicate this in the source code without breaking include or creating another option in the config file. IMO, this is much more readable than messing with config files.

@Roger-luo
Contributor

Roger-luo commented Apr 9, 2025

Maybe let me elaborate a bit more on what I was trying to say here. I'm not against this RFC. I'm proposing to use the following syntax instead of a config file to solve the exact same problem. Tbh you already solve most of the issue here; all I'm saying is that we can combine it with local module loading. The feature implemented in this PR is pretty much what local module loading means in an interpreted language, except for the syntax:

module Foo # incomplete module declaration triggers the compiler to look for a subpackage or pkgimage

The benefit is you will be able to use using Moo.Foo naturally which shows people this is a submodule semantically (while optionally/lazily loading the package is an implementation detail), e.g

module Moo

module Foo # optionally loads the subpackage Foo when you type `Moo.Foo`

end

Here Foo has the same structure as a standard Julia package with a Project.toml (without UUIDs maybe?) and is thus a subpackage. The project structure of Foo is fairly independent and thus can be moved around.

And when you change your mind and move this out of the package Moo to be an independent package Foo, you don't need to change anything and can keep the existing package structure. E.g. when Foo is an independent package and Moo depends on it, the only change inside Moo is from module Foo to using Foo, and none of the downstream packages would be broken. In reverse, when you move a package into a subpackage, in this case moving DiffEqPlots into DiffEq.Plots, you just change using DiffEqPlots to module Plots.

In the case of using a config file, by contrast, if you change your mind you have to dig into the config file, update it, and update that strange :: syntax.


I realize the syntax is probably not the best because module Foo would cause ambiguity when parsing. But you get the idea, we could change it to package Foo, extern Foo etc.

@MasonProtter
Contributor Author

I guess I'm confused then. If you don't want config files involved, how would your preferred syntax for this handle stuff like optional dependencies? And how would a downstream package depend on some of the subpackages, but not all of them? This is important so that the environment knows what packages to precompile, and what deps to install, rather than unnecessarily precompiling all the subpackages and all the deps.

@Roger-luo
Contributor

Roger-luo commented Apr 9, 2025

If you don't want config files involved, how would your preferred syntax for this handle stuff like optional dependencies?

In my opinion there are two things here:

  • a. for the compiler, which folder should be compiled into an individual pkgimage and treated as a subpackage/compile unit. This is indicated via a new syntax. Because syntax is what most programmers interact with, this lets them know that there is a subpackage that can be optionally loaded into the program just by reading the source code.
  • b. for the package manager, which dependencies to download is defined by the config file.

For a. this is addressed by the module Foo and using Moo.Foo syntax. For b. this could use a config file solution similar to Rust features (or the one you proposed above, though I dislike the restricted folder structure), or allow a Project.toml located inside the subproject of Foo (I would vote for this or a Rust features-like spec).

You can see a similar separation in Rust or Python (well, not a compiler though). I will use Rust as an example here:

mod <name> is syntax indicating that there is something structured as a subpackage that the compiler should treat as an independent compile unit. The features option in the config file indicates which optional dependencies need to be downloaded. Python's extras are pretty much the same.

The reason for the separation is to make compilation less coupled to package loading. For example, when I have local packages or scripts, this still works well without a package manager. It is also more generic, because using Moo.Foo.Goo would work if Goo is a subpackage of Foo and Foo is a subpackage of Moo.

And how would a downstream package depend on some of the subpackages, but not all of them?

That said, I'm not against the RFC; I'm just saying you can solve the first problem via a syntax declaration. Yes, necessary external dependencies need to be declared in a config file; there is no other option. But for local dependencies you don't have to use a config file, and can use syntax instead.

@clarkevans
Member

clarkevans commented May 15, 2025

We're writing some opinionated extensions to the FunSQL ecosystem, for example, an IDE using Pluto, let's call it "PlutoFunSQL".

  1. I'd like a monorepo organization, to have a single release and single version and make sure everything is tested together.
  2. I'd like a single package that is registered, not N separate ones. I guess equivalently, I'd like "using FunSQL.Pluto" to work rather than "using PlutoFunSQL".
  3. If a user doesn't use the Pluto IDE, they shouldn't have to install Pluto; but, Julia should know that if "FunSQL.Pluto" is used, then "Pluto" is needed.
  4. There could also be "FunSQL.PlutoStatic" that also needs Pluto. That is, two different sub-packages may share a dependency.
  5. If Pluto and FunSQL are used, we shouldn't assume FunSQL.Pluto is wanted.
  6. It isn't clear if I'd need dependencies among subpackages, e.g. FunSQL.Pluto depends upon FunSQL.PlutoStatic.
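Points 3 and 4 of this list can be sketched under the [weakdeps]/[subpackages] layout from the PR description. Everything below is hypothetical: the tables are hand-written rather than read from a real FunSQL project, and required_weakdeps is a made-up helper, not something Pkg or this PR provides.

```julia
# Hypothetical tables for a FunSQL project, shaped like the [weakdeps] and
# [subpackages] sections in the PR description. Names are illustrative only.
weakdeps = Set(["Pluto"])
subpackages = Dict(
    "Pluto"       => ["FunSQL", "Pluto"],   # would be loaded as FunSQL.Pluto
    "PlutoStatic" => ["FunSQL", "Pluto"],   # shares the Pluto dependency
)

# Collect the weak dependencies that must be installed for the requested subpackages.
function required_weakdeps(subpackages, weakdeps, requested)
    needed = Set{String}()
    for name in requested, dep in subpackages[name]
        # Only weak dependencies need promoting; regular deps are installed anyway.
        dep in weakdeps && push!(needed, dep)
    end
    return needed
end

# Nothing requested: Pluto does not need to be installed.
none_needed = required_weakdeps(subpackages, weakdeps, String[])

# Both subpackages requested: the shared dependency is collected once.
both_needed = required_weakdeps(subpackages, weakdeps, ["Pluto", "PlutoStatic"])
```

A user who never asks for either subpackage ends up with an empty set, which is the "shouldn't have to install Pluto" case from point 3.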

For now, I think we're going to have a separate mega-package, PlutoFunSQL, which is opinionated and has lots of dependencies, such as Makie. The core FunSQL keeps almost no dependencies. Once something like this is released, we could refactor, deprecate PlutoFunSQL and break this mega-package into N subpackages.

Note: we will use the extension feature for database drivers, but this is a separate user experience that seems quite different than this.

Labels
dependencies design Design of APIs or of the language itself feature Indicates new feature / enhancement requests modules package extensions packages Package management and loading
7 participants