Applying the Reinformer architecture to minimal RNNs #2

@WhyAreYouJay

Hi there,
I have a repository where we have been trying to get the Reinformer to work with a minimal recurrent network (minRNN).

However, on both Hopper and the medium-expert datasets, we cannot get anywhere near the performance of the Reinformer (or of the original minimal RNNs, for that matter).

I was wondering whether you have any idea what could be wrong. I have implemented dropout in the minRNN, similar to what you did for MultiHeadAttention, with some slight improvements, but we are still well short of the results I expected based on the Reinformer and minRNN baselines.
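
For reference, a minGRU-style layer with dropout on its outputs could be sketched as below. This is an illustration under assumptions, not the code from either repository: it assumes the minGRU recurrence h_t = (1 - z_t) * h_{t-1} + z_t * h~_t with z_t = sigmoid(W_z x_t) and h~_t = W_h x_t, omits biases, runs sequentially in plain Python, and applies inverted dropout only to the emitted outputs so the carried state stays intact. The names `min_gru_forward`, `W_z`, `W_h`, and `p_drop` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dropout(v, p, rng, training):
    # Inverted dropout: zero each entry with probability p and rescale
    # the survivors by 1 / (1 - p). Identity at evaluation time.
    if not training or p == 0.0:
        return list(v)
    keep = 1.0 - p
    return [vi / keep if rng.random() < keep else 0.0 for vi in v]


def min_gru_forward(xs, W_z, W_h, p_drop=0.1, training=True, seed=0):
    # xs: list of input vectors, one per time step.
    # W_z, W_h: hidden_dim x input_dim weight matrices (hypothetical names).
    rng = random.Random(seed)
    h = [0.0] * len(W_h)  # initial hidden state, assumed to be zero
    outputs = []
    for x in xs:
        z = [sigmoid(a) for a in matvec(W_z, x)]  # update gate
        h_tilde = matvec(W_h, x)                  # candidate state
        h = [(1.0 - zi) * hi + zi * ci
             for zi, hi, ci in zip(z, h, h_tilde)]
        # Dropout is applied to the output only; the recurrent state h
        # is passed to the next step without being dropped.
        outputs.append(dropout(h, p_drop, rng, training))
    return outputs
```

Whether dropout should also be applied to the gate or to the candidate state is a separate design choice. Calling the function with `training=False` is one quick way to rule dropout out as the source of a gap in evaluation scores.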

If you are curious, the repo can be found here.

Either way, cool repo!
