-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Lazy load legacy imports in skimage
top module
#6892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
These functions are advertised in skimage's docstring. I don't really see a reason why `data_dir` must be public in the root namespace. So keep it hidden.
d1c8afe
to
1987eee
Compare
Hrm, I think this is the opposite of what I thought it would be like; we don't want those functions visible in |
Oh, then I misunderstood you? To avoid further confusion: Do you want to That might be more tricky to achieve. I think most code completion tools pick up whats imported in |
Curiously, this still doesn't include __version__ in skimage.__all__. Maybe, this is a bug or behavior in lazy_loader that should be updated?
I'm not sure whether I discovered a bug here in lazy_loader: even though Edit: E.g. with the current state |
This comment was marked as outdated.
This comment was marked as outdated.
Probably because this is defined in the |
This is pretty close to being ready; @lagru want to push it over the line? |
Okay import skimage
import sys
for x in sys.modules.keys():
if "skimage" in x:
print(x) only returns
now. And |
Hmm, one of the changes doesn't seem to play well with |
I'm not sure why the order changed. Likely changing the import behavior to use lazy loading fully at the top-level of `skimage`, changes the order in which candidates are found by `lookfor` as well. Try if this addresses all test failures.
6239648
to
b4068da
Compare
Hmm, this actually reveals new aspects. Currently, functions such as I'm still considering how to address this. So let's remove this from the 0.25 for now. |
attach_stub will include every import it finds in the PYI file. However, we no longer want to advertise utility functions in the top-level namespace. This seems to be a workaround, that keeps things like `skimage.img_as_float` usable at runtime, but doesn't expose them via `__all__` or `__dir__`. Additionally, this avoids listing these utility functions twice in different places of our HTML documentation with Sphinx.
If .py and .pyi file were to define different __all__ (by using the one returned by `attach_stub`), different tools might behave inconsistent, depending on which file they prefer. Let's avoid that by being explicit. Let's raise this issue upstream with lazy_loader too.
skimage/__init__.py
Outdated
# `attach_stub` currently ignores __all__ inside the stub file and simply | ||
# returns every lazy-imported object, so we need to define `__all__` again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See scientific-python/lazy-loader#133 for a proposal to address this duplication.
@jarrodmillman, I added it to the 0.25 milestone because it should be ready to merge now. But feel very welcome to bump it back to 0.26 again, if you'd rather just prioritize the release. 😊 |
skimage/__init__.py
Outdated
'segmentation', | ||
'transform', | ||
'util', | ||
'__version__', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is __doc__
going away? I recently read something about __doc__
changing soon, but maybe it was just its parsing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see where __doc__
was previously included? Am I misunderstanding?
Not sure, but I don't think we need to advertise __doc__
in __all__
. It's an implementation detail of Python and it's available at runtime for all intents and purposes, e.g., help(skimage)
.
I just put __version__
in here because it might be useful to advertise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wasn't referring to anything 'previously included' or not and, in the context of this PR, I should have!
I was thinking 'in the absolute,' i.e., when I
import skimage as ski
and I run
ski.[tab]
which objects do I expect to be available?
Co-authored-by: Marianne Corvellec <marianne.corvellec@ens-lyon.org>
The objects available via both functions should be consistent. Approach inspired by https://discuss.python.org/t/4202/.
Not sure if the previous approach would ever terminate the opened process. I feel like using a dedicated function and `check_output` simplifies things a lot and also keeps the global namespace cleaner.
It seems easier to import PytestTester as a private symbol from the start instead of deleting it manually.
Okay, I fixed the inconsistency between |
skimage/__init__.py
Outdated
"""Add lazy-loaded attributes to keep consistent with `__getattr__`.""" | ||
patched_dir = {*globals().keys(), *__lazy_dir__()} | ||
return sorted(patched_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is fixed by this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lazy loading expands the module attributes that are available via __getattr__
. These new attributes are missing from the default __dir__()
/ dir(skimage)
. If we use __dir__
returned from attach_stub
, then a whole bunch of pre-existing attributes like __file__
, __doc__
, etc. are missing. Both are inconsistent and it seemed like a good opportunity to preempt weird or confusing behavior by code inspection tools, etc. that don't expect messing with __dir__
.
Spelling this out, I realize that this is also a problem with our subpackages that just use __dir__
from attach_stub
as is. Not sure if it's worth it to go through all these. Neither am I sure how to conveniently address this from lazy_loader's site. globals()
returns the scope from where the function is defined, so lazy_loader would need a different way to access the correct global scope...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we want to expose things like __file__
etc.? __dir__
looks exactly like what I would expect atm:
In [1]: dir(ski)
Out[1]:
['__version__',
'color',
'data',
'draw',
'exposure',
'feature',
'filters',
'future',
'graph',
'io',
'measure',
'metrics',
'morphology',
'registration',
'restoration',
'segmentation',
'transform',
'util']
You can use various hacks to get hold of other attributes, but these are the ones we want our users to see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you really need to expose more (and I don't think we should), then you could:
def __dir__():
return [some stuff] + __lazy_dir__()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we want to expose things like file etc.?
For the reasons stated above. Hiding builtin attributes feels very magical and unexpected to me, and like something that might lead confusing bugs. I'd like tab-completion in ipython to show me everything that's available and not a curated list.
Everyone is aware that _
means private and is used to __builtin_stuff__
. Why would we need to hide these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like I do recognize all your arguments, I just don't find them as convincing as you do.
This is a common problem I run into 🙃
Since you'd like to see a change in behavior, I think you should make a convincing case for (a) why this is useful (and not confusing) to the user and (b) that it does not introduce unnecessary complexity (we could improve the code to make that the case, so I'm not too worried about this part).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe more importantly, we are way overdue on the release, so a friendly request that we merge this PR without the change to __dir__
and then do additional follow-up once we come to an agreement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added it as a talking point to today's meeting notes.
Anyway, I'm happy to iterate on this particular issue in another PR. And I'm not seeing this issue as a blocker for the release in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted to status quo in 0a275cd with regards to hiding stuff in __dir__
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python's docs seem to be compatible with your stance @stefanv:
The resulting list is not necessarily complete and may be inaccurate when the object has a custom
__getattr__()
.
[...] it attempts to produce the most relevant, rather than complete, information: [...]
Because
dir()
is supplied primarily as a convenience for use at an interactive prompt, it tries to supply an interesting set of names more than it tries to supply a rigorously or consistently defined set of names, and its detailed behavior may change across releases.
Co-authored-by: Stefan van der Walt <stefanv@berkeley.edu> Co-authored-by: Marianne Corvellec <marianne.corvellec@ens-lyon.org>
This flag used to be set in setup.py, which is no longer the case. So it should be safe to remove. Co-authored-by: Stefan van der Walt <stefanv@berkeley.edu>
There are no other good ways to get a list of subpackags and their descriptions locally. Also make sure that the newly documented attribute `__version__` isn't confused up by the documentation tooling with the actual line where `__version__` is defined.
8ea157d
to
6d7a770
Compare
There's no consensus on whether builtin private attributes like `__file__`, etc. should be returned by __dir__. So keep the status quo in that regard but still hide former utility functions we no longer want to advertise.
Thanks Lars! |
Description
Closes #6891
Closes #6904
Let me know if I should enable lazy loading for other submodules as well.
Checklist
./doc/examples
(new features only)./benchmarks
, if your changes aren't covered by anexisting benchmark
For reviewers
later.
__init__.py
.doc/release/release_dev.rst
.example, to backport to v0.19.x after merging, add the following in a PR
comment:
@meeseeksdev backport to v0.19.x
run-benchmark
label. To rerun, the labelcan be removed and then added again. The benchmark output can be checked in
the "Actions" tab.