
Increasing max sequence length and vocab size #26

@kuhanw

Dear experts,

Thank you for this excellent tutorial. This is one of the first seq2seq tutorials I have read that has really helped me internalize some of the concepts (and that I could get up and running without too much trouble!).

I have a question. I am currently working through the first tutorial notebook, "1-seq2seq", and trying to understand the relationship between model performance, sequence length, and vocabulary size. In a real-world example it may be possible to control sequence length by limiting sentences to a certain size, but it would surely not be possible to reduce the vocabulary below some threshold.

Indeed, at the end of the tutorial it is suggested to play around with these parameters to observe how training speed and quality degrade.

My question is: how would I best translate the toy model, which predicts random sequences of length 2-8 over a vocabulary of 1-10, to a more realistic scenario where the vocabulary can contain thousands of terms?

Currently I am simply extending the problem to predicting random sequences drawn from a vocabulary of 2-5000 instead of 2-10, and playing around with the hyperparameters to figure out which ones help improve the quality of my results.
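
Concretely, something like the following sketch is what I have in mind for the data side (the function name and the PAD/EOS convention are my own, not the tutorial's exact helper):

```python
import numpy as np

def random_batches(batch_size=64,
                   length_from=2, length_to=8,
                   vocab_lower=2, vocab_upper=10):
    """Yield batches of variable-length random integer sequences forever.

    Tokens 0 and 1 are assumed to be reserved for PAD and EOS,
    so the random symbols start at 2.
    """
    while True:
        lengths = np.random.randint(length_from, length_to + 1, size=batch_size)
        # vocab_upper is exclusive: symbols come from [vocab_lower, vocab_upper)
        yield [np.random.randint(vocab_lower, vocab_upper, size=l).tolist()
               for l in lengths]

# Only the vocabulary bounds change on the data side; the model then
# needs a matching vocab_size (plus the reserved tokens).
batches = random_batches(vocab_lower=2, vocab_upper=5000)
batch = next(batches)
```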

Is there any intuition for how the embedding size and the number of encoder units affect model quality? I have already noticed that batch size directly affects quality as the sequence length increases.
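
For reference, these are the knobs I am currently varying. The names follow the style of the notebook, but the values are just my current guesses, assuming the TF 1.x API the tutorial uses:

```python
import tensorflow as tf  # TF 1.x, as in the notebooks

vocab_size = 5000 + 2        # assumed: +2 for the reserved PAD and EOS tokens
embedding_size = 128         # grown from the toy setting along with the vocabulary
encoder_hidden_units = 256   # extra capacity for longer sequences (my guess)

embeddings = tf.Variable(
    tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
    dtype=tf.float32)
encoder_cell = tf.contrib.rnn.LSTMCell(encoder_hidden_units)
```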

Thank you!

Kuhan
