-
Notifications
You must be signed in to change notification settings - Fork 32
Bnb/disc training fix #259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 18 commits
Commits
Show all changes
22 commits
Select commit
Hold shift + click to select a range
76efd8b
fix: epoch always started with disc_loss = 0, resulting in disc never…
bnb32 96b151a
only use output features in content loss
bnb32 a722236
save / load check for disc training fix
bnb32 2808f1f
use n_obs from previous epoch for loss running mean
bnb32 55774ee
content loss edit to account for undefined hr_out_features
bnb32 939897f
adding gen loss from previous epoch for running means
bnb32 58bf727
Using running dataframe records of training and validation batch loss…
bnb32 b49456f
Initialize records from history for loaded models
bnb32 6777b97
use index to append loss details record
bnb32 6069637
changed trained_frac naming - doesn't make sense to prefix these with…
bnb32 cdd1edf
missed new method `get_hr_exo_and_loss` from cherry picks
bnb32 acb04b2
added `loss_mean_window` arg, material derivative loss with extremes,…
bnb32 b03a539
fix: wasn't updating running mean `loss_details`
bnb32 41ce39b
material derivative loss test fix
bnb32 f26dc86
typo
bnb32 cee57bd
use epoch mean in history but window for running mean
bnb32 9d6a51d
history indexing catch
bnb32 71c6ef1
floating point error adjustment
bnb32 2cb320b
renaming `_get_batch_loss_details`. moved weight into to start of `tr…
bnb32 7029740
Moving weight initialization to start of training. Use batch_handler.…
bnb32 cb57ef3
Using queue_shape doesn't work since some queues only include high re…
bnb32 48cbd5c
Removed `_get_hr_exo_and_loss`. Renamed `_run_gradient_descent` and `…
bnb32 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could go either way on this, but knee jerk reaction is that breaking these lines out into a separate function in a different file just makes the stack trace deeper for little benefit. This function is only called in one place in a different file in a relatively short parent function. Seems like we could leave it as-is for less nesting functions? My gut feeling is that three direct function calls without any logic is portable enough to not be packaged into a separate function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of these extractions are motivated by the work on models with observations. I could delay this until that PR if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess i would have to see that work too, but i'd be shocked if a 14 line function really helps reduce the burden of 3 function calls? I really think we should just call the 3 functions directly. More nested functions reduces docstring quality and makes it way harder to trace args/kwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This removes ~50 lines of duplication in the obs branch but we can decide if it's worth doing in that PR.