💬 Discord
This repository aims to implement SOTA efficient token/channel mixers. Any technology related to non-vanilla Transformers is welcome. If you are interested in this repository, please join our Discord. The roadmap below splits the work into token mixers and channel mixers; a minimal interface sketch follows the list.
- Token Mixers
  - Linear Attention
  - Linear RNN
  - Long Convolution
- Channel Mixers
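
For orientation, this split mirrors the standard decomposition of a Transformer block: a token mixer exchanges information across sequence positions, while a channel mixer works across feature dimensions. The sketch below shows how a block composes the two; the class and argument names are illustrative, not the actual xmixers API.

```python
# Illustrative token/channel mixer decomposition; names here are
# hypothetical, not the actual xmixers classes.
import torch
import torch.nn as nn


class MixerBlock(nn.Module):
    """One pre-norm residual block: token mixer followed by channel mixer."""

    def __init__(self, embed_dim: int, token_mixer: nn.Module, channel_mixer: nn.Module):
        super().__init__()
        self.token_norm = nn.LayerNorm(embed_dim)
        self.token_mixer = token_mixer      # e.g. linear attention, linear RNN, long conv
        self.channel_norm = nn.LayerNorm(embed_dim)
        self.channel_mixer = channel_mixer  # e.g. an MLP or GLU

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Mix across the sequence dimension, then across the feature
        # dimension, each with a residual connection.
        x = x + self.token_mixer(self.token_norm(x))
        x = x + self.channel_mixer(self.channel_norm(x))
        return x
```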
- Pad the embedding dimension.
- Hybrid models (hybrid, transformer, hybrid).
- Update `use_bias` to `use_offset`.
- Mpa
- T6
- Mla
- Hgrn2
  - varlen support (remaining)
- Hgrn2-scalar-decay
  - varlen support (remaining)
- Linear Transformer (see the linear attention sketch after this list)
- Llama
- Tnl
- Deltanet
  - Vector Decay Deltanet
  - Scalar Decay Deltanet
- TTT
  - Inference
- GSA
- Titan
- NSA
  - Inference
- Alibi
- GPT
- Stick-breaking
- Forgetting Transformer
- Fsq kv
- Fsq kv mpa
- Tnl
- Hgrn2
- Hgrn2-scalar-decay
- Linear Transformer
- Lightnet
- Mamba2 (need to remove the extra residual).
- GPT
  - Doreamonzzz/xmixers_gpt_120m_50b
- LLaMA
  - Doreamonzzz/xmixers_llama_120m_50b
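
Many of the entries above (Linear Transformer, Tnl, Hgrn2, Deltanet, Lightnet) are linear-attention variants that differ mainly in how the recurrent state is updated and decayed. Below is a minimal reference sketch of the shared recurrence with an optional scalar decay; it omits feature maps, normalization, and the multi-head layout, and is not the repository's implementation.

```python
# Minimal causal linear attention with an optional scalar decay; a
# reference sketch for intuition, not an optimized kernel.
import torch


def causal_linear_attention(q, k, v, decay: float = 1.0):
    """q, k, v: (batch, seq_len, dim) -> output (batch, seq_len, dim).

    Maintains the running state S_t = decay * S_{t-1} + k_t v_t^T and
    reads out o_t = q_t S_t, so the cost is linear in sequence length.
    """
    b, n, d = q.shape
    state = torch.zeros(b, d, d, dtype=q.dtype, device=q.device)
    outputs = []
    for t in range(n):
        kt = k[:, t].unsqueeze(-1)           # (b, d, 1)
        vt = v[:, t].unsqueeze(-2)           # (b, 1, d)
        state = decay * state + kt @ vt      # rank-1 state update
        outputs.append(q[:, t].unsqueeze(-2) @ state)  # (b, 1, d)
    return torch.cat(outputs, dim=1)
```

Scalar-decay variants replace the constant `decay` with a learned value, vector-decay variants use a per-channel decay, and delta-rule variants such as Deltanet replace the purely additive update with an error-correcting one.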
- Add `use_cache` for all modules (see the cache sketch after this list).
- Add an initial state for linear attention.
- Update attention-mask handling for attention, mpa, tpa, and mla.
- Update weight initialization for every model.
- Remove the bias from LayerNorm, since it raised NaN errors (see the norm sketch after this list).
  - linear attention
    - hgru2
    - hgru3
  - vanilla attention
    - attention
- linear attention
  - Add causal masking.
- Update `_initialize_weights` and `_init_weights`.
  - linear attention
  - vanilla attention
  - long conv
- Update the cache for token mixers (see the cache sketch after this list).
  - linear attention
    - hgru3
    - tnl attention
  - long conv
    - gtu
  - vanilla attention
    - attention
    - flex attention
    - mpa
    - n_attention
- linear attention
  - Add special init.
  - Add causal masking.
- Clear `next_decoder_cache`.
- Add varlen support for softmax attention (see the varlen sketch after this list).
  - LLaMA
  - GPT
- Add data types for classes and functions.
  - long_conv_1d_op
  - Gtu
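
On the cache items above: for softmax attention, `use_cache` means storing the growing key/value tensors, while for linear attention the cache is a fixed-size recurrent state, and the initial-state item amounts to choosing the state for the first token. A hypothetical single-step decoding sketch (the function name and signature are illustrative, not the xmixers API):

```python
# Hypothetical decode-step sketch for a cached linear-attention state.
import torch


def linear_attention_decode_step(q_t, k_t, v_t, state=None, decay: float = 1.0):
    """One decoding step. q_t, k_t, v_t: (batch, dim); state: (batch, dim, dim).

    With use_cache=True the caller keeps `state` between steps; passing
    state=None creates the initial (zero) state for the first token.
    """
    if state is None:
        b, d = q_t.shape
        state = torch.zeros(b, d, d, dtype=q_t.dtype, device=q_t.device)
    # Same rank-1 update as the full-sequence recurrence, one token at a time.
    state = decay * state + k_t.unsqueeze(-1) @ v_t.unsqueeze(-2)
    o_t = (q_t.unsqueeze(-2) @ state).squeeze(-2)  # (batch, dim)
    return o_t, state
```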
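On the LayerNorm item: the fix keeps the learned scale and drops the additive bias that caused the NaN errors. Written out, a bias-free LayerNorm looks like the sketch below; recent PyTorch also accepts `nn.LayerNorm(dim, bias=False)` directly.

```python
# Bias-free LayerNorm: normalize, then apply a learned scale only.
import torch
import torch.nn as nn


class BiasFreeLayerNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # scale only, no shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mean = x.mean(dim=-1, keepdim=True)
        var = x.var(dim=-1, keepdim=True, unbiased=False)
        return self.weight * (x - mean) / torch.sqrt(var + self.eps)
```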
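On the varlen items: variable-length sequences are packed into a single token dimension and delimited by cumulative sequence lengths, so no compute is spent on padding; this follows the flash-attn varlen convention. A pure-PyTorch sketch that loops per sequence (real kernels fuse this loop); the function name is illustrative:

```python
# Pure-PyTorch sketch of varlen causal attention over packed sequences;
# illustrative only, real implementations use a fused kernel.
import torch
import torch.nn.functional as F


def varlen_causal_attention(q, k, v, cu_seqlens):
    """q, k, v: (total_tokens, num_heads, head_dim).
    cu_seqlens: (num_seqs + 1,) cumulative lengths, e.g. [0, 5, 12] packs
    two sequences of lengths 5 and 7.
    """
    outputs = []
    for i in range(len(cu_seqlens) - 1):
        s, e = int(cu_seqlens[i]), int(cu_seqlens[i + 1])
        # Move heads to the batch position so SDPA sees (heads, len, head_dim).
        qi, ki, vi = (t[s:e].transpose(0, 1) for t in (q, k, v))
        oi = F.scaled_dot_product_attention(qi, ki, vi, is_causal=True)
        outputs.append(oi.transpose(0, 1))  # back to (len, heads, head_dim)
    return torch.cat(outputs, dim=0)
```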
Commit messages use one of the following tags:

- [Feature Add]
- [Bug Fix]
- [Benchmark Add]
- [Document Add]
- [README Add]