- Adds a property `variable.optimizer` that defaults to `None`
- Adds a `DispatchOptimizer` that scans the list of trainable variables during build,
collects all unique per-variable optimizers, then dispatches the `apply`/`stateless_apply`
function to the correct optimizer if applicable.
- Modifies `trainer` so that, during the optimizer build stage, it checks whether any
variables have a custom optimizer attached and, if so, inserts a `DispatchOptimizer` to
handle them. This keeps the dispatching hidden from the user.
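The dispatch behavior described in the list above can be illustrated with a minimal, framework-free sketch. This is not the code from this change: `Variable`, `SGD`, and `wrap_if_needed` are hypothetical stand-ins, the `DispatchOptimizer` here is a simplified toy, and only an eager `apply` path is shown (no `stateless_apply`).

```python
class Variable:
    """Hypothetical stand-in for a trainable variable."""

    def __init__(self, name, value, optimizer=None):
        self.name = name
        self.value = value
        self.optimizer = optimizer  # defaults to None, as described above


class SGD:
    """Toy optimizer (assumption): value -= learning_rate * grad."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate
        self.applied = []  # names of the variables this optimizer updated

    def apply(self, grads, variables):
        for grad, var in zip(grads, variables):
            var.value -= self.learning_rate * grad
            self.applied.append(var.name)


class DispatchOptimizer:
    """Simplified sketch: routes each variable to its own optimizer."""

    def __init__(self, default_optimizer):
        self.default_optimizer = default_optimizer
        self.optimizers = [default_optimizer]

    def build(self, variables):
        # Collect the unique per-variable optimizers, default first.
        self.optimizers = [self.default_optimizer]
        for var in variables:
            opt = var.optimizer
            if opt is not None and not any(opt is o for o in self.optimizers):
                self.optimizers.append(opt)

    def apply(self, grads, variables):
        # Group (grad, variable) pairs by the optimizer that owns them.
        groups = {id(opt): ([], []) for opt in self.optimizers}
        for grad, var in zip(grads, variables):
            opt = var.optimizer if var.optimizer is not None else self.default_optimizer
            group_grads, group_vars = groups[id(opt)]
            group_grads.append(grad)
            group_vars.append(var)
        # Dispatch each non-empty group to its optimizer.
        for opt in self.optimizers:
            group_grads, group_vars = groups[id(opt)]
            if group_vars:
                opt.apply(group_grads, group_vars)


def wrap_if_needed(optimizer, variables):
    """Hypothetical trainer-side check: wrap only if a custom optimizer exists."""
    if any(var.optimizer is not None for var in variables):
        return DispatchOptimizer(optimizer)
    return optimizer


# Usage: one ordinary variable and one "embedding table" with its own optimizer.
default = SGD(learning_rate=0.1)
table_opt = SGD(learning_rate=1.0)
dense = Variable("dense", 1.0)
table = Variable("table", 1.0, optimizer=table_opt)
variables = [dense, table]

opt = wrap_if_needed(default, variables)
opt.build(variables)
opt.apply([1.0, 1.0], variables)
```

In this sketch the wrapper is only inserted when at least one variable carries a custom optimizer, so models without such variables keep using their original optimizer unchanged.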
Context: large embedding tables need special optimizers that update the tables
in place rather than returning large gradients. The layer sets these custom
optimizers, but the trainer must be aware of them and dispatch the embedding
tables to the appropriate optimizers.