Expansion to Common Loss, Tokenization Strategies #15
CCranney
started this conversation in
Base Code improvements
Replies: 0 comments
Currently, AttentionSmithy focuses primarily on the transformer model architecture itself. It branches out of that area a little with features like greedy and beam search generators, but those are exceptions. That is mostly because loss, tokenization, and similar concerns often depend on the exact nature of the data one wants to analyze with the model, and should be tailored to each use case. That said, there is often a lot of overlap in how data is tokenized or how loss is calculated, and I could see us implementing classes that make these easier too.
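To make the idea concrete, a shared tokenization layer could be a small strategy interface that use-case-specific tokenizers implement. This is only a sketch: neither the class names (`TokenizationStrategy`, `CharacterTokenizer`) nor the method names come from AttentionSmithy; they are hypothetical placeholders for illustration.

```python
from abc import ABC, abstractmethod


class TokenizationStrategy(ABC):
    """Hypothetical interface for swappable tokenization schemes.
    Not part of AttentionSmithy; illustrative only."""

    @abstractmethod
    def tokenize(self, sequence: str) -> list[int]:
        """Convert a raw sequence into a list of token IDs."""

    @abstractmethod
    def detokenize(self, token_ids: list[int]) -> str:
        """Convert token IDs back into a raw sequence."""


class CharacterTokenizer(TokenizationStrategy):
    """Toy character-level tokenizer, e.g. over a nucleotide alphabet."""

    def __init__(self, alphabet: str):
        # Map each character to a stable integer ID and back.
        self.id_of = {ch: i for i, ch in enumerate(alphabet)}
        self.char_of = {i: ch for ch, i in self.id_of.items()}

    def tokenize(self, sequence: str) -> list[int]:
        return [self.id_of[ch] for ch in sequence]

    def detokenize(self, token_ids: list[int]) -> str:
        return "".join(self.char_of[i] for i in token_ids)


tokenizer = CharacterTokenizer("ACGT")
ids = tokenizer.tokenize("GATTACA")
assert tokenizer.detokenize(ids) == "GATTACA"
```

A model or training loop would then depend only on the `TokenizationStrategy` interface, so a k-mer or BPE tokenizer could be swapped in per use case without touching the rest of the code.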
For some reference on tokenization in the bioinformatics field, I found this paper especially insightful.