I have read the README and searched the existing issues.
System Info
On the sp branch, when the transformers version is > 4.51.0, `sequence_parallel_attention` gets registered. However, in the forward pass, `_update_causal_mask` contains a check like `if self.config._attn_implementation == "flash_attention_2":`. Since the registered implementation no longer matches that string, the attention mask is expanded from 2D to 4D. I think this is a bug. Can you help?
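To illustrate what I mean, here is a minimal sketch (not the actual transformers source) of how `_update_causal_mask`-style logic behaves: the `flash_attention_2` branch keeps the 2D padding mask, while any other implementation name, including a registered custom one like `sequence_parallel_attention`, falls through to building a 4D additive causal mask. The function name and exact mask construction below are simplified assumptions for demonstration.

```python
import torch

def update_causal_mask_sketch(attention_mask, attn_implementation, seq_len, dtype=torch.float32):
    # Sketch of the dispatch in question: only the exact string
    # "flash_attention_2" takes the early-return path that keeps the 2D mask.
    if attn_implementation == "flash_attention_2":
        return attention_mask
    # Any other implementation name falls through and a 4D additive
    # causal mask of shape (batch, 1, seq_len, seq_len) is built instead.
    min_value = torch.finfo(dtype).min
    causal = torch.triu(torch.full((seq_len, seq_len), min_value, dtype=dtype), diagonal=1)
    padding = (1.0 - attention_mask[:, None, None, :].to(dtype)) * min_value
    return causal[None, None, :, :] + padding

mask_2d = torch.ones(1, 4)
print(update_causal_mask_sketch(mask_2d, "flash_attention_2", 4).dim())          # 2
print(update_causal_mask_sketch(mask_2d, "sequence_parallel_attention", 4).dim())  # 4
```

So even though `sequence_parallel_attention` is flash-attention-based and expects the 2D mask, the string comparison routes it down the 4D path.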