
Add customizable dropout layer with compile-time rate specification #3000


Merged: 4 commits merged into davisking:master on Aug 29, 2024

Conversation

@Cydral (Contributor) commented Aug 26, 2024

This PR introduces a new customizable dropout layer, dropout_custom_, which allows specifying the dropout rate at compile-time. This enhancement is particularly beneficial for deep neural networks with numerous layers, where manually setting different dropout rates for each layer can be cumbersome.

Key features and benefits:

  1. Compile-time dropout rate specification: Allows for clearer and more concise network definitions.
  2. Inherits from the existing dropout_ class: Maintains all functionality of the original dropout layer.
  3. Template-based implementation: Provides type-safety and potential performance benefits.
  4. Includes a pre-defined dropout_10 alias: Offers a convenient 10% dropout option for common use cases.
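For illustration, a minimal sketch of the idea (the exact merged code may differ, and the layer is renamed later in this thread): a thin subclass of dropout_ that bakes an integer percentage into the type, plus the dropout_10 alias from point 4.

#include <dlib/dnn.h>
using namespace dlib;

// Sketch: a dropout layer whose rate is fixed at compile time as an
// integer percentage, inheriting all runtime behavior from dropout_.
template <int DROP_RATE_PERCENT>
class dropout_custom_ : public dropout_
{
    static_assert(DROP_RATE_PERCENT >= 0 && DROP_RATE_PERCENT <= 100,
                  "DROP_RATE_PERCENT must be in [0, 100]");
public:
    dropout_custom_() : dropout_(DROP_RATE_PERCENT / 100.0f) {}
};

// Pre-defined 10% alias, usable directly in a network definition:
template <typename SUBNET>
using dropout_10 = add_layer<dropout_custom_<10>, SUBNET>;

// The rate is now visible in the network's type itself:
using net_type = loss_multiclass_log<fc<10, dropout_10<relu<fc<84, input<matrix<float>>>>>>>;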

@arrufat (Contributor) commented Aug 26, 2024

Isn't this good enough?

visit_computational_layers(net, [](dropout_& l){ l = dropout_(0.1); });

@Cydral (Contributor, Author) commented Aug 27, 2024

No, because it doesn't allow you to precisely identify which layer to modify. In an LLM-type network (and this would also be true for a convolution-based network for specific image processing), it may be necessary to add layers whose outputs are filtered by dropout at different rates.
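To make the contrast concrete: the visit-based approach sets every dropout_ layer in the network to the same rate, so giving one layer a different rate means picking it out by its position (the index 3 below is hypothetical):

// Every dropout_ layer in the network receives the same 0.1 rate...
visit_computational_layers(net, [](dropout_& l){ l = dropout_(0.1f); });
// ...and a layer needing a different rate must be addressed by index,
// which breaks silently when the architecture changes (index 3 is hypothetical):
layer<3>(net).layer_details() = dropout_(0.5f);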

@Cydral (Contributor, Author) commented Aug 27, 2024

However, if we don't want to break the interface and want to make the layer more flexible, we could instead add a template parameter to the dropout_ layer itself, keep the dropout instantiation with a default rate of 0.5, and add a dropout_c (c for custom) for specifying the rate when defining the layer; a sketch of this alternative follows.
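A rough sketch of that alternative (hypothetical, not what was merged; the rate is an integer percentage because float non-type template parameters are not portable before C++20):

// Hypothetical: dropout_ itself gains a compile-time default rate, so
// existing code using plain `dropout` keeps its 0.5 behavior.
template <int DROP_RATE_PERCENT = 50>
class dropout_
{
public:
    explicit dropout_(float drop_rate = DROP_RATE_PERCENT / 100.0f)
        : drop_rate(drop_rate) {}
    // ... setup/forward/backward as in the existing layer ...
private:
    float drop_rate;
};

// `dropout` keeps its current meaning; `dropout_c` exposes the rate:
template <typename SUBNET>
using dropout = add_layer<dropout_<>, SUBNET>;
template <int RATE, typename SUBNET>
using dropout_c = add_layer<dropout_<RATE>, SUBNET>;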

@davisking (Owner) commented
Yeah, this is cool the way it is. How about a different name for it though, maybe dropout_at_rate or dropout_rate? IDK, what do you guys think?

@Cydral (Contributor, Author) commented Aug 28, 2024

"dropout_rate" seems pretty good. I'll rename the new class accordingly if that's OK with you.
Note: the update is normally already visible in the branch to be merged.
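For reference, a network definition using the renamed layer would read along these lines (illustrative, assuming the alias takes the percentage as its first template argument):

// Illustrative usage after the rename: the 20% and 50% rates are part of
// the type, so no per-layer setup code is needed.
using net_type = loss_multiclass_log<
    fc<10, dropout_rate<20, relu<fc<84,
           dropout_rate<50, relu<fc<120,
           input<matrix<float>>>>>>>>>>;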

@davisking merged commit 27a0135 into davisking:master on Aug 29, 2024
10 checks passed
@davisking (Owner) commented
Thanks, this is great :D
