Skip to content

Eliminate the Variable class #262

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 76 commits into
base: master
Choose a base branch
from

Conversation

jagerber48
Copy link
Contributor

  • Closes # (insert issue number)
  • Executed pre-commit run --all-files with no errors
  • The change is fully covered by automated unit tests
  • Documented in docs/ as appropriate
  • Added an entry to the CHANGES file

This PR:

  • Eliminates the Variable class and eliminates the AffineScalarFunc.derivatives property.
  • The AffineScalarFunc class is renamed to UFloat with AffineScalarFunc left as a legacy name.
  • The LinearCombo class is refactored into the UCombo class but the architectural role is similar: track the uncertainties of UFloat objects as a lazily expanded combination of objects with uncertainty
  • The UAtom object is introduced as the fundamental element which captures uncertainty. Each UAtom models a random variable with zero mean and unity standard deviation/variance. Each UAtom has a uuid and all UAtom are mutually statistically independent. The expanded version of the UCombo maps dict[UAtom, float].

For more detail/discussion see #251 (comment)

Note that as of the time opening this PR the PR is still a work in progress. Not all elements in the bullet list above are implemented and a good handful of tests are still failing.

@jagerber48 jagerber48 changed the title Feature/linear combo refactor Eliminate the Variable class Aug 12, 2024
@@ -104,9 +105,6 @@ def test_ufloat_fromstr():
"3.4(nan)e10": (3.4e10, float("nan")),
# NaN value:
"nan+/-3.14e2": (float("nan"), 314),
# "Double-floats"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These floats are too large to work with. Prior to this PR the large float value could be saved into the AffineScalarFunc object and displayed, but any arithmetic with such large values would result in an OverflowError. In the new framework even displaying a UFloat the first time requires a calculation, even if it's as simple as sqrt(x**2). So in the new framework the OverflowError appears immediately.

@jagerber48 jagerber48 mentioned this pull request Dec 7, 2024
@jagerber48
Copy link
Contributor Author

All the tests are passing except some having to do with the uarray inverse and pseudo inverse functions. The code for this is pretty complicated. It might make sense to just rewrite the code to convert array-compatible functions to ufloat array compatible functions.

Copy link

codecov bot commented Dec 12, 2024

Codecov Report

Attention: Patch coverage is 69.27176% with 173 lines in your changes missing coverage. Please review.

Project coverage is 78.68%. Comparing base (fead532) to head (8b14818).
Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
uncertainties/unumpy/core.py 14.28% 120 Missing ⚠️
uncertainties/ucombo.py 78.35% 21 Missing ⚠️
tests/test_unumpy.py 0.00% 20 Missing ⚠️
tests/test_ulinalg.py 0.00% 4 Missing ⚠️
uncertainties/core.py 92.59% 4 Missing ⚠️
tests/test_umath.py 89.47% 2 Missing ⚠️
tests/helpers.py 95.00% 1 Missing ⚠️
uncertainties/ops.py 91.66% 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (fead532) and HEAD (8b14818). Click for more details.

HEAD has 15 uploads less than BASE
Flag BASE (fead532) HEAD (8b14818)
macos-latest-3.11 1 0
macos-latest-3.12 1 0
ubuntu-latest-3.11 1 0
macos-latest-3.10 1 0
ubuntu-latest-3.12 1 0
ubuntu-latest-3.10 1 0
ubuntu-latest-3.8 1 0
ubuntu-latest-3.9 1 0
macos-latest-3.9 1 0
macos-latest-3.8 1 0
windows-latest-3.10 1 0
windows-latest-3.12 1 0
windows-latest-3.9 1 0
windows-latest-3.11 1 0
windows-latest-3.8 1 0
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #262      +/-   ##
==========================================
- Coverage   86.54%   78.68%   -7.86%     
==========================================
  Files          18       19       +1     
  Lines        1903     1905       +2     
==========================================
- Hits         1647     1499     -148     
- Misses        256      406     +150     
Flag Coverage Δ
macos-latest-3.10 ?
macos-latest-3.11 ?
macos-latest-3.12 ?
macos-latest-3.8 ?
macos-latest-3.9 ?
no-numpy 78.68% <69.27%> (-0.04%) ⬇️
ubuntu-latest-3.10 ?
ubuntu-latest-3.11 ?
ubuntu-latest-3.12 ?
ubuntu-latest-3.8 ?
ubuntu-latest-3.9 ?
windows-latest-3.10 ?
windows-latest-3.11 ?
windows-latest-3.12 ?
windows-latest-3.8 ?
windows-latest-3.9 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jagerber48
Copy link
Contributor Author

jagerber48 commented Dec 13, 2024

Wow all tests are passing now. That's a big milestone for this PR. If folks are interested I'd be curious to hear general feedback on the source code changes in this PR. I tried to do as little wholesale rewriting as possible but did chose to do some of that in unumpy/core.py.

Here's a summary of the changes so far. See #251 for a more thorough discussion of these changes and their motivations.

  • Remove the Variable class.
  • Remove the LinearCombination class.
  • Add the UAtom class. A UAtom models a standard normally distributed random variable (zero mean, unity variance).
  • Add the UCombo class. This class essentially maps UAtoms to float. This class models a linear combination of UAtoms. Exposes a method to extract its own standard deviation and its covariance with other UCombos. This class supports multiplication by floats/ints and additions with other UCombo's. This simple arithmetic support makes uncertainty propagation calculations nice.
  • Swap the names UFloat and AffineScalarFunc. Now UFloat defines the actual class and AffineScalarFunc is an alias (perhaps it could be deprecated later).
  • The UFloat class now stores uncertainty as a UCombo rather than a LinearCombination. Previously uncertainty was roughly recorded as the derivatives of an AffineScalarFunc with respect to Variables. This information was stored and accessed in the derivatives attribute. Now there is no longer a derivatives attribute. There is only error_components which now returns a mapping from UAtom to float indicating the weighting of each UAtom within a UFloat. This is the biggest user-facing conceptual change in this PR.
  • It was relatively straightforward to give UFloat a simple hash. This will resolve Pandas sort_values with multiple columns does not work for AffineScalarFunc #186 and Should `Variable.__eq__` be based on `id()` or a separate identifier? #218 and Implement hash for AffineScalarFunc #219. I'll point out that these conversations were major concrete motivators for this refactor (together with a less concrete desire to simplify the codebase)
  • Previously AffineScalarFunc and Variable could be tagged with a user-readable tag. Now UFloats cannot be tagged, but UAtoms can.
  • Because of the previous architecture there were some confusing behaviors with respect to copying/pickling. In the new framework I think copying and pickling are much more straightforward. The behaviors are different than before, but I argue they are better. You can look at how the tests changed for more details onthis. See also Strange behavior on copying #260.
  • The code in unumpy/core.py responsible for extending np.linalg.inv and np.linalg.pinv to operate sensibly on umatrix or arrays of UFloat were rewritten. The old code (1) relied centrally on AffineScalarFunc.derivatives and (2) was quite confusing to understand. In this case I found it easier to rewrite the code than to understand and modify the old code. The old code could only uncertainty-ify functions that took one array as an argument. The new code detect UFloats or arrays containing UFloat's in any argument. By default it uses numerical differentiation, but the function can be passed functions to calculate analytical derivatives.

Some todo items

  • Check that performance isn't impacted
  • Multiple passes of clean up on the code
  • Some passes on the source code for the tests. Many tests were changed, some dramatically. Some tests were added. I think the tests as they are now are probably ok. I think some checks that this PR can go through would be good, but a more thorough addressing of tests should be done in support of resolving Clean up tests #273 in later PRs.
  • Correct and clean up comments within the source code. The legacy uncertainties code has a lot of comments and documentation. This PR is a part of a thrust to help simplify the source code so that it is a bit more self-documenting. With this, in the places touched by this PR, I want to slim down any comments, or at the very least make sure they are accurate. Especially with the removal of the derivatives attribute, I think there may be comments in lots of places that are no longer accurate. There are probably places I could add some comments.
  • Update the documentation. I haven't touched the docs documentation at all. I think there will need to be a lot of updating there. Off the top of my head I know the docs make reference to the "derivatives of variables" philosophy that is being replaced with the "linear combination of random variables" philosophy. I also know there will be some docs about copying/pickling that need to be updated, and of course docs on derivatives and error_components will need to be updated.
  • Bikeshed the naming of the UFloat, UCombo, and UAtom classes.
  • Write the changelog entries

Other notes:

  • COMMENT FOR REVIEWERS: Obviously I would appreciate wholistic reviews, But I'm especially curious for reviews on changes to the tests that start to illustrate how the public API is changing.
  • See this change. The to_uarray_func conversion function assumes inputs are numpy arrays and doesn't "respect" numpy matrices. I made this choice because of the comment on numpy.matrix documentation discouraging their use. See 4.x Release plans #244 (comment). It seems like the maintainers are in some agreement that we can drop matrix support, so maybe consider this a step in that direction.
  • I'll update this comment or later ones as I think of more changes or todos. I'll likely use this comment as a start for the changelog.

- [ ] Closes # (insert issue number)
- [ ] Executed `pre-commit run --all-files` with no errors
- [ ] The change is fully covered by automated unit tests
- [ ] Documented in docs/ as appropriate
- [ ] Added an entry to the CHANGES file
- [ ] Closes # (insert issue number)
- [ ] Executed `pre-commit run --all-files` with no errors
- [ ] The change is fully covered by automated unit tests
- [ ] Documented in docs/ as appropriate
- [ ] Added an entry to the CHANGES file

NOTE!!!! This PR targets `main`, not `master`. This PR will be released
in version `4.x` (likely `4.0.0`).

This PR eliminates functions, methods, and their corresponding tests
that were deprecated in version 3.2.3:
https://github.com/lmfit/uncertainties/releases/tag/3.2.3.
- [ ] Closes # (insert issue number)
- [ ] Executed `pre-commit run --all-files` with no errors
- [ ] The change is fully covered by automated unit tests
- [ ] Documented in docs/ as appropriate
- [ ] Added an entry to the CHANGES file

This PR targets `main`, not `master`.
- [ ] Closes # (insert issue number)
- [ ] Executed `pre-commit run --all-files` with no errors
- [ ] The change is fully covered by automated unit tests
- [ ] Documented in docs/ as appropriate
- [ ] Added an entry to the CHANGES file
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants