-
Hi, although the README shows a small example, I'm still running into various issues trying RWKV7 with FLA. The models here have quite complicated architectures compared with Transformers. Could you add a subfolder to the repo that offers examples for the different models?
-
To my surprise, this is the very first discussion.
-
@lidh15 Hello, we have a unified API.
-
because we only created this org in 2025 :-)
-
@lidh15 Hi, could you benchmark generation with the following command: python benchmark_generation.py --path fla-hub/rwkv7-168M-pile
-
It's a known issue that it's slower than the official RWKV7 implementation.
The first reason was the Triton-based group norm and l2norm, which has been fixed.
The second reason is addcmul: fla fuses the six token-shift mixing tensors into a single call,
xr, xw, xk, xv, xa, xg = hidden_states.addcmul(delta, self.x_x.view(6, 1, 1, -1)).unbind(0)
which is much, much slower. However, I'm still trying to figure out how to fix this without a breaking change. You can find examples here: https://huggingface.co/fla-hub/rwkv7-1.5B-world
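For readers unfamiliar with that one-liner, here is a minimal NumPy sketch of what the fused addcmul computes, independent of PyTorch. The shapes (batch, sequence length, hidden dim) are illustrative assumptions, not the model's real sizes; the point is that a single broadcasted multiply-add over a stacked (6, B, T, D) tensor is numerically equivalent to six separate multiply-adds, one per mixing coefficient.

```python
import numpy as np

# Illustrative shapes (assumptions, not RWKV7's actual dimensions).
B, T, D = 2, 4, 8
rng = np.random.default_rng(0)
hidden = rng.random((B, T, D)).astype(np.float32)   # hidden_states
delta = rng.random((B, T, D)).astype(np.float32)    # token-shift delta
x_x = rng.random((6, D)).astype(np.float32)         # six mixing vectors

# Fused form, mirroring
#   hidden_states.addcmul(delta, self.x_x.view(6, 1, 1, -1)).unbind(0)
# addcmul(t, t1, t2) computes t + t1 * t2 with broadcasting, so the
# result broadcasts up to shape (6, B, T, D).
fused = hidden + delta * x_x.reshape(6, 1, 1, D)
xr, xw, xk, xv, xa, xg = fused

# Unfused reference: six independent multiply-adds give the same values.
for i, x in enumerate((xr, xw, xk, xv, xa, xg)):
    assert np.allclose(x, hidden + delta * x_x[i])
```

The fused form trades six small elementwise kernels for one larger broadcasted one plus an unbind; whether that wins depends on kernel launch overhead versus the cost of materializing the stacked (6, B, T, D) intermediate, which is the trade-off discussed above.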