Longformer, Big Bird implementation #5
Replies: 3 comments
-
Big Bird and Longformer both designate "global" tokens that attend to all other tokens (and get attended to in turn). It was unclear to me how these were chosen, and in my coding efforts I have assumed that specific individual tokens are given global status, selected in advance by the user. However, I have wondered whether global token selection could be learned by the model itself. Something to consider in future implementation efforts.
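For concreteness, here is a minimal sketch of the user-specified variant (assumed names and shapes, not attention_smithy's API): given positions chosen in advance, build a mask in which those positions attend everywhere and everything attends to them.

```python
# Minimal sketch (assumed names, not attention_smithy's API): user-selected
# global tokens expressed as a boolean attention mask. In the real models this
# global component is combined with local windowed (and, for Big Bird, random)
# attention; only the global part is shown here.
import torch

def global_attention_mask(seq_len: int, global_indices: list[int]) -> torch.Tensor:
    """Return a (seq_len, seq_len) bool mask where True = attention allowed."""
    mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)
    idx = torch.tensor(global_indices)
    mask[idx, :] = True  # global tokens attend to every position
    mask[:, idx] = True  # every position attends to global tokens
    return mask

# Example: an 8-token sequence where position 0 (e.g. a [CLS] token) is global.
print(global_attention_mask(8, [0]).int())
```

A learned variant could, for instance, score positions with a small head and take the top-k per example as global, though that is speculation beyond what either paper prescribes.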
-
I would also be interested to see whether these implementations could be mixed with Hyena operators, with the two approaches possibly reinforcing each other.
-
Longformer (for encoders, so far) was implemented in #44.
-
I actually started a Big Bird implementation already (see `src/attention_smithy/attention/BigBirdAttention.py`). It has a full test suite and technically performs as it should. BUT I realized during real-life application that my variation of their approach, direct indexing rather than using the `gather` function, does not play well with large data samples. Thus, the entire point of the implementation was rendered moot.

I'd like to rewrite this at some point to use the `gather` function, but that would require extensive rewrites. If anyone has any thoughts, let me know.

Longformer employs similar principles, so co-development would probably be a good idea.
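To make the indexing question concrete, here is a hedged sketch contrasting the two styles on a block-sparse gather; the toy shapes and names are assumptions for illustration, not the code in `BigBirdAttention.py`:

```python
# Hedged sketch (toy shapes, assumed names; not the repo's actual code):
# gathering key blocks per query block, once with direct (advanced) indexing
# and once with torch.gather.
import torch

batch, num_blocks, block_size, head_dim = 2, 16, 64, 32
blocks_per_query = 3  # e.g. Big Bird's random blocks per query block

key = torch.randn(batch, num_blocks, block_size, head_dim)
# Per-example block indices, shape (batch, num_blocks, blocks_per_query).
block_ids = torch.randint(0, num_blocks, (batch, num_blocks, blocks_per_query))

# Style 1: direct indexing. Needs an explicit batch index that broadcasts
# against block_ids.
batch_idx = torch.arange(batch)[:, None, None]  # (batch, 1, 1)
gathered_direct = key[batch_idx, block_ids]
# -> (batch, num_blocks, blocks_per_query, block_size, head_dim)

# Style 2: torch.gather. The index tensor is expanded to the output shape,
# and the input is expanded (a view, no copy) to match its rank.
index = block_ids[..., None, None].expand(-1, -1, -1, block_size, head_dim)
gathered = torch.gather(
    key[:, None].expand(-1, num_blocks, -1, -1, -1),  # view over key
    dim=2,
    index=index,
)

assert torch.equal(gathered, gathered_direct)
```

Both styles produce the same tensor here; the practical difference at scale comes down to how each composes with batched, per-example indices and backend memory behavior, which is why reference block-sparse implementations typically lean on `torch.gather`.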