An Enhancement Proposal Workflow #1674
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

Coverage Diff:

|          | main   | #1674  | +/-    |
|----------|--------|--------|--------|
| Coverage | 87.77% | 84.59% | -3.19% |
| Files    | 134    | 137    | +3     |
| Lines    | 11126  | 11417  | +291   |
| Hits     | 9766   | 9658   | -108   |
| Misses   | 1360   | 1759   | +399   |

Flags with carried forward coverage won't be shown.
Thanks!
The general idea of EPs is great, I really like it! We should also advertise this in `contributing.md` (or even require it for "external" PRs).
I do have doubts about the new abstractions, though. For example, we currently have `EarlyStopping`. In the future, we would probably also have `LRScheduler`, `ClippingMethod`, and possibly more. Should we really aim to build the ultimate training loop? I think this has a few downsides:
- all of these features have to be well-documented (otherwise they will forever go unused)
- they create more abstractions, which will make the package harder for new maintainers to learn
- more testing...
The alternative would be to simply point people to the flexible training loop. Everybody can use an LLM to get, for example, a learning rate scheduler. Of course, this requires a bit more work from users, but getting to know yet another `sbi` abstraction (which includes reading and understanding the documentation, testing the feature, making sure it works, ...) is also non-zero work for users.
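For illustration, a rough sketch of that alternative: wiring a standard PyTorch scheduler into a hand-rolled flexible loop instead of learning a new `sbi` abstraction. All names here (the toy network, data, and loss) are placeholders, not the actual `sbi` API.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins; in practice these come from the user's own sbi setup.
network = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
theta, x = torch.randn(256, 1), torch.randn(256, 1)
train_loader = DataLoader(TensorDataset(theta, x), batch_size=32)

optimizer = torch.optim.Adam(network.parameters(), lr=5e-4)
# Plain PyTorch scheduler: reduces the LR when the monitored loss plateaus.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=10)

for epoch in range(100):
    for theta_b, x_b in train_loader:
        optimizer.zero_grad()
        loss = ((network(torch.cat([theta_b, x_b], dim=1)) - theta_b) ** 2).mean()
        loss.backward()
        optimizer.step()
    # In a real loop this would be a proper validation loss; the last
    # training-batch loss stands in as a placeholder here.
    scheduler.step(loss.item())
```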
### For Maintainers

- **Reduced code duplication**: Extract shared training logic into reusable components
I am fine with doing this for this PR, but in general, I don't think that code duplication is always a bad thing. Introducing "reusable components" adds cognitive load and will make the package harder to maintain in the future.
- **Seamless experiment tracking**: Support for TensorBoard, WandB, MLflow, and stdout without changing existing code
- **Multiple early stopping strategies**: Patience-based, plateau detection, and custom
Is there any evidence that patience-based early stopping is not working well? I would only add more methods if they are really low-effort to implement and maintain, or if they are really important for performance.
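For context, the patience-based baseline is only a handful of lines; a generic sketch (not the actual `sbi` implementation):

```python
def train_with_patience(step, validate, patience=20, min_delta=1e-4, max_epochs=1000):
    """Run `step()` once per epoch; stop when the validation loss returned
    by `validate()` has not improved by `min_delta` for `patience` epochs."""
    best, stale_epochs = float("inf"), 0
    for _ in range(max_epochs):
        step()
        val_loss = validate()
        if val_loss < best - min_delta:
            best, stale_epochs = val_loss, 0
        else:
            stale_epochs += 1
        if stale_epochs >= patience:
            break  # no improvement for `patience` epochs
    return best
```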
learning_rate=5e-4,
max_epochs=1000,
device="cuda"
)
I don't really see the point of this. All of our train config args are ints, floats, or bools. Why do we need this additional abstraction?
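To make that concrete, a plain flat structure would already cover these values; a hypothetical sketch of the simpler option (names are illustrative, not the proposed API):

```python
from dataclasses import dataclass

# Hypothetical flat config: since the values are just ints, floats, bools,
# and device strings, a plain dataclass needs no extra abstraction layer.
@dataclass
class TrainConfig:
    learning_rate: float = 5e-4
    max_epochs: int = 1000
    device: str = "cuda"

config = TrainConfig(max_epochs=500)  # override only what differs
```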
)

# Method-specific loss configuration
loss_args = LossArgsNPE(exclude_invalid_x=True)
I would not consider `exclude_invalid_x` to belong in `LossArgsNPE`.
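One way to read this: excluding invalid simulations is a data-level operation, independent of any particular loss, so it could happen before training entirely. A generic sketch (the filtering logic is illustrative, not the `sbi` implementation):

```python
import torch

theta = torch.randn(100, 3)
x = torch.randn(100, 2)
x[::7] = float("nan")  # pretend some simulations failed

# Dropping invalid rows is plain data preprocessing, not a loss argument.
valid = torch.isfinite(x).all(dim=1)
theta, x = theta[valid], x[valid]
```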
early_stop = EarlyStopping.validation_loss(patience=20, min_delta=1e-4)

# Stop when learning rate drops too low
early_stop = EarlyStopping.lr_threshold(min_lr=1e-6)
Why is this not part of `train_config`?
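I.e., the threshold could be just another scalar field on the flat training config instead of a dedicated factory; a hypothetical sketch:

```python
# Hypothetical: fold the LR threshold into the flat training config rather
# than a separate EarlyStopping.lr_threshold(...) constructor.
train_config = {"learning_rate": 5e-4, "max_epochs": 1000, "min_lr": 1e-6}

def should_stop(optimizer, config) -> bool:
    # Stop once every parameter group's learning rate falls below min_lr.
    return all(g["lr"] < config["min_lr"] for g in optimizer.param_groups)
```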
I suggest we adopt an Enhancement Proposal (EP) workflow to better organize and discuss larger refactorings and API changes. EPs should live on the docs website and have a corresponding GitHub Discussions entry for public discussion.
This will help us plan and reach consensus on major changes (e.g., the roadmap toward `sbi` 1.0, deprecations, architectural refactors), while keeping the process transparent and easy to follow. I created a corresponding entry on the website, an explanation of the workflow, and an example EP-01 for the planned training infrastructure refactoring (numfocus/small-development-grant-proposals#60).
In general, I see this as a first step towards a more transparent and professional governance model.
This is all open for discussion; let's discuss it in next week's org meeting.