Expansion to Common Loss, Tokenization Strategies #15
CCranney
started this conversation in
Base Code improvements
Replies: 0 comments
Currently, AttentionSmithy focuses primarily on the transformer model architecture itself. It branches out of that area a little with features like greedy and beam search generators, but those are exceptions. That is mostly because loss, tokenization, and similar concerns often depend on the exact nature of the data one wants to analyze with the model, and should be tailored to each use case. That said, there is often a lot of overlap in how data is tokenized or how loss is calculated, and I could see us implementing classes that make these easier too.
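To make the idea concrete, a shared tokenization layer could be a small strategy interface that use-case-specific tokenizers implement. This is only a sketch: neither the class names (`TokenizationStrategy`, `CharacterTokenizer`) nor the method names come from AttentionSmithy; they are hypothetical placeholders for illustration.

```python
from abc import ABC, abstractmethod


class TokenizationStrategy(ABC):
    """Hypothetical interface for swappable tokenization schemes.
    Not part of AttentionSmithy; illustrative only."""

    @abstractmethod
    def tokenize(self, sequence: str) -> list[int]:
        """Convert a raw sequence into a list of token IDs."""

    @abstractmethod
    def detokenize(self, token_ids: list[int]) -> str:
        """Convert token IDs back into a raw sequence."""


class CharacterTokenizer(TokenizationStrategy):
    """Toy character-level tokenizer, e.g. over a nucleotide alphabet."""

    def __init__(self, alphabet: str):
        # Map each character to a stable integer ID and back.
        self.id_of = {ch: i for i, ch in enumerate(alphabet)}
        self.char_of = {i: ch for ch, i in self.id_of.items()}

    def tokenize(self, sequence: str) -> list[int]:
        return [self.id_of[ch] for ch in sequence]

    def detokenize(self, token_ids: list[int]) -> str:
        return "".join(self.char_of[i] for i in token_ids)


tokenizer = CharacterTokenizer("ACGT")
ids = tokenizer.tokenize("GATTACA")
assert tokenizer.detokenize(ids) == "GATTACA"
```

A model or training loop would then depend only on the `TokenizationStrategy` interface, so a k-mer or BPE tokenizer could be swapped in per use case without touching the rest of the code.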
For some reference on tokenization in the bioinformatics field, I found this paper especially insightful.