This repo aims to provide a collection of vision linear-attention models:
- VMamba
- MambaVision
- Vision-RWKV (init modes will be added in the future)
- GSPN (PyTorch & Triton kernels)
- GroupMamba (the scan module will be integrated in the future)
The following requirements should be satisfied (a quick version check is sketched after this list):
- PyTorch >= 2.5 (CUDA>=12.4)
- Triton >= 3.0
- mamba-ssm (installed manually from `.whl` files)
- einops
- timm
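
Before installing, it may help to confirm that your environment meets the version requirements above. A minimal check, using only the standard version attributes exposed by PyTorch and Triton, might look like this:

```python
import torch
import triton

# Compare these against the requirements listed above:
# PyTorch >= 2.5 built with CUDA >= 12.4, Triton >= 3.0.
print("PyTorch:", torch.__version__)
print("CUDA   :", torch.version.cuda)   # CUDA toolkit PyTorch was built against
print("Triton :", triton.__version__)
```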
You can install `vla` with pip:
git clone https://github.com/Yiozolm/Vision-linear-attention.git
cd Vision-linear-attention
pip install -e .
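
After installation, the backbones should be importable from the `vla` package. The snippet below is only a minimal smoke-test sketch; the class name and module layout are assumptions rather than the confirmed API, so adjust them to match the actual package.

```python
import torch
import vla  # top-level package installed by `pip install -e .`

# Hypothetical constructor: the actual class name and config options may differ.
model = vla.Vmamba()

# Standard ImageNet-sized input batch.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    y = model(x)

print(y.shape)
```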
| Year | Venue | Model | Paper | Code |
|---|---|---|---|---|
| 2024 | NeurIPS | VMamba | VMamba: Visual State Space Model | official |
| 2025 | ICLR | MambaVision | MambaVision: A Hybrid Mamba-Transformer Vision Backbone | official |
| 2025 | ICLR | Vision-RWKV | Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures | official |
| 2025 | CVPR | GSPN | Parallel Sequence Modeling via Generalized Spatial Propagation Network | official |
| 2025 | CVPR | GroupMamba | GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model | official |
We would like to express our deepest respect to Songlin and the other maintainers of the `fla` library.