Skip to content

PoC: Getting Any, Text & numbers closure down to 35 modules #12819

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: develop
Choose a base branch
from

Conversation

JaroslavTulach
Copy link
Member

@JaroslavTulach JaroslavTulach commented Apr 10, 2025

Pull Request Description

from Standard.Base.Prelude import all

main n=10 = [Any, Integer, Float, Text, n*6]
  • previous script contains from Standard.Base.Prelude import all however that could be the default
  • however previously such a code would load 190 modules - e.g. .enso files
  • with the changes provided in this issue we are down to 35 modules
  • however we need a way to keep compatibility (via extension methods or something more?)

Important Notes

Save the above snippet as b.enso and then run:

enso$ ./built-distribution/enso-engine-*/enso-*/bin/enso \
  --no-ir-caches --run b.enso --log-level debug 2>&1 | grep Parsing

or execute tests:

sbt:enso> runtime-integration-tests/testOnly *AnyClosureTest 

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Apr 10, 2025

This is the list of 35 modules that currently gets loaded when executing the simple script provided in the description:

sbt:enso> runEngineDistribution --no-ir-caches --run b.enso
Standard.Base.Prelude
Standard.Base.Warning
Standard.Base.Function
Standard.Base.Nothing
Standard.Base.Data.Boolean
Standard.Base.Any
Standard.Base.Internal.Ordering_Helpers
Standard.Base.Data.Ordering
Standard.Base.Data.Ordering.Natural_Order
Standard.Base.Data.Text
Standard.Base.Panic
Standard.Base.Error
Standard.Base.Internal.Error_Builtins
Standard.Base.Errors.Wrapped_Error
Standard.Base.Errors.Common
Standard.Base.Internal.Meta_Helpers
Standard.Base.Data.Numbers
Standard.Base.Internal.Number_Builtins
Standard.Base.Internal.Rounding_Helpers
Standard.Base.Errors.Illegal_Argument
Standard.Base.Data.Vector
Standard.Base.Internal.Array_Like_Helpers
Standard.Base.Errors.Unimplemented
Standard.Base.Errors.Problem_Behavior
Standard.Base.Data.Pair
Standard.Base.Data.Maybe
Standard.Base.Data.Array_Proxy
Standard.Base.Data.Array
Standard.Base.Errors.Empty_Error
Standard.Base.Data.Sort_Direction
Standard.Base.Data.Text.Normalization
Standard.Base.Data.Ordering.Vector_Lexicographic_Order
Standard.Base.Errors.Illegal_State
Standard.Base.Errors

Getting down to #12819 (comment) out of original 190 wasn't easy. The transitive graph is pretty dense. One "escape" and the closure is back to 190. I was quite happy when I finally got to this 35 modules. But:

  • can we reduce the transitive closure of core modules (Any, Text, numbers) even more?
  • can we remove (some) Standard.*Error.* modules?
  • shall we remove sorting and ordering?

@hubertp hubertp added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Apr 10, 2025
Comment on lines 150 to 151
- range: The section of the this array to return.
If an `Index_Sub_Range`, then the selection is interpreted following
the rules of that type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this change the documentation is malformed (and in many other places)

Copy link
Member Author

@JaroslavTulach JaroslavTulach Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting down to 35 modules out of original 190 wasn't easy. The transitive graph is pretty dense. One "escape" and the closure is back to 190 (just like happened here https://github.com/enso-org/enso/pull/12641/files#r2038889470). One has to search hard to find out where to "make the cut".

In this particular case I decided that Range (and Index_Sub_Range) shall not be in the transitive closure of base modules like Any, Float, Integer. To verify whether such "cut" makes sense I erased all references to Range from those 35 modules. Malformed documentation is a temporary (and not really important for purposes of this PoC) by product of such erasure.

The important questions to ask are:

  • shall Range & co. be outside of transitive scope of core modules (like Any)?
  • if not, where else shall one "make the cut" and how much additional modules get into the transitive closure?
  • if Range & co. stays out - how do we keep compatible (enough) behavior of existing methods?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we remove the dependency of Range & co from core modules like Any and Text, I would think that many regular programs (tests, utilities) will wind up depending on it anyway. And we don't want to discourage the use of 'extras' like Index_Sub_Range by having the growing transitive closure be a constant worry.

On the other hand, it is a shame that even small programs might depend 190 modules. Your suggestion to replace things like Vector | Array | Range | Index_Sub_Range | Etc | Etc with a single type like Traversible and add conversions to Traversible I think is a very good idea -- then the set of allowed parameters can grow arbitrarily large without affecting the core transitive graph at all. This is good when the new class is something generally useful, like Traversible, and not as good when it's just a special type made for one or two cases where we want to avoid a dependency.

@range (self-> Index_Sub_Range.default_widget self.length)
take : (Index_Sub_Range | Range | Integer) -> Vector Any
take self range:(Index_Sub_Range | Range | Integer)=..First =
take self range:(Any | Integer)=..First =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you intending to keep these changes, or is it just a work in progress?

Copy link
Member Author

@JaroslavTulach JaroslavTulach Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exercise in this PoC was to reduce the transitive closure of core modules (Any, Text, numbers) down from current enormous 190. While doing that radical removals had to be made - only then we could execute a program with only 35 modules loaded.

Now, when we know a transitive closure of 35 modules is achievable, we can look back and:

  • identify the types of "cuts" that had to be made
  • decide whether they are the correct "cuts" or whether a "cut" should be made elsewhere in the transitive graph
  • propose how to mitigate each "type of cut" negative effects

In this particular case I see two problems:

Removal of Range & co.

Follow this link for a discussion on that topic.

Removal of Annotation

The @range annotation line has been removed. In general these annotation lines open "completely new segment" of the transitive closure. They use the Standard.Metadata API and its hierarchy of types, then manipulate with text and even convert to JSON. None of that shall be in the transitive closure of core modules because none of such metadata is needed for execution.

Questions to ask:

  • Can we have an annotation on a method in core modules transitive closure at all?
  • Could the annotation body resolution be delayed?
  • Could we have a method without annotation and add the annotation by other means?

@JaroslavTulach JaroslavTulach changed the title PoC: Getting the Standard.Base.Any.Any closure to 35 modules PoC: Getting Any, Text & numbers closure down to 35 modules Apr 11, 2025
@@ -1,17 +1,11 @@
import project.Data.Pair.Pair
Copy link
Member Author

@JaroslavTulach JaroslavTulach Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"check = [Any, Integer, Float, Text]\n"
);

val check = module.invokeMember(MethodNames.Module.EVAL_EXPRESSION, "check")
Copy link
Member Author

@JaroslavTulach JaroslavTulach Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was an error related to LambdaTest. Executing just that one test used to fail. However there was never a failure in the test when it was executed in a batch with other tests. Just try:

sbt:runtime-integration-tests> testOnly *LambdaTest
- allow oversaturation and execute until completion *** FAILED *** (25 milliseconds)
org.enso.interpreter.test.InterpreterException: No_Such_Method.Error
 at <enso>.Test.main(Test:2)
 at org.enso.polyglot.api/org.enso.polyglot.Function.execute(Function.scala:20)
 at org.enso.runtime/org.enso..test.InterpreterRunner.$anonfun$eval$1(InterpreterTest.scala:265)

and the execution fails. However there was never a failure in the CI! The code in question is missing imports, but using * (just like #8852 describes):

main =
    f = x -> y -> w -> z -> x * y + w + z
    f 3 3 10 1

The behavior triggered my attention. Does it mean that by executing some code in advance we can fix these missing imports? Yes, that's how it works! Executing the from Standard.Base.Prelude import all in advance fixes the problem and testOnly *LambdaTest now works.

This observation motivated me to seek for a set of core modules and their transitive closure to import by default.

import java.util.Collection;
import java.util.List;

public final class AnyClosureTest extends TransitiveInfra {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • example of a test to control transitive closure of imports
  • specify a code() - usually few import statements and a main method to return moduleName() type
  • specify minimum and maximum number of transitive dependencies
  • explicitly list disallowModules()

import project.Nothing.Nothing
import project.Panic.Panic
from project.Data.Boolean import Boolean, False, True
from project.Data.Range.Extensions import all
from project.Metadata import Display, Widget
Copy link
Member Author

@JaroslavTulach JaroslavTulach Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These import statements trigger whole new transitive closure and yet they are not needed during CLI runtime execution at all. It would be great to have a way to defer resolution of such symbols until really needed. What about following?

from project.Metadata import ~Display, ~Widget

Usage of ~ is a standard Enso way to express that something is deferred. In this case it would mean to defer the resolution of the Display and Widget symbol. One could use these symbols in the code freely, but their resolution to Standard.Base.Metadata.Display & co. would only be done during runtime, when the code accesses the Display or Widget symbol. The import would not be resolved during IR processing optimistically assuming the desired symbol is going to be there.

  • This is meant as an expert feature for people who know what they are doing.
  • It achieves the goal of minimizing "closure"
  • While making as little change to the source code as possible
  • The change follows the spirit of lazy evaluation as already used in Enso

Possible drawback is: accessing the symbol could trigger compilation (which could even fail). In case of failure the symbol would turn into DataflowError (just like polyglot java import symbols do even these days).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also very much like the concept of the lazy import -- but as @radeusgd noted to me earlier, this might move some warnings/errors from 'compile time' to 'runtime', so it's not a semantically neutral change. If it could be something you turn off in development mode, and on for regular production use, and that only affected when the warning/error appears, that could mitigate that problem. But I don't know what it takes to implement a lazy import, or how general it can be -- that is, could every import be lazy?

Since widgets add extra dependencies that are not needed for CLI usage (and for tests), perhaps they could be configured in a different place. It's nice having them right above the function, but perhaps that's not necessary.

Copy link
Member Author

@JaroslavTulach JaroslavTulach May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • To make this (hardcore) feature even less prominent
  • let's consider using project.FQN
  • so far project.FQN yields an error - e.g. it is going to be fully compatible
  • unlike lazy import with ~ we don't need any syntax changes
  • just change the runtime to properly resolve these project.FQNs
  • regular humans are unlikely to use the project. syntax much, so this is mostly for experts anyway

Copy link

github-actions bot commented May 21, 2025

✨ GUI Checks Results

Summary

  • 🧹 Prettier: ✅ Passed
  • 🧰 GUI Checks: ❌ Failed
  • 📚 Storybook: ❌ Failed

⚠️ Action Required: Please review the failed checks and fix them before merging.

See individual check results for more details.

@@ -71,7 +69,7 @@ type Pair
- index: The location in the pair to get the element from. The index is
also allowed be negative, then the elements are indexed from the back
of the pair, i.e. -1 will correspond to the last element.
@index (t-> Widget.Numeric_Input minimum=0 maximum=t.length-1 display=Display.Always)
@index (t-> project.Metadata.Widget.Numeric_Input minimum=0 maximum=t.length-1 display=project.Metadata.Display.Always)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The c8ff072 shows a possible way to implement "lazy imports" inside of the same library/project.
  • All we need to do is to resolve FQNs that start with project. in a "lazy fashion"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, project keyword is semantically allowed in imports. There is a compiler pass that desugars the project alias. In other words, project keyword is currently reserved for imports only. It could theoretically be used for other purposes outside of import statements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Clean build required CI runners will be cleaned before and after this PR is built.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants