[VPlan] Fix first-order splices without header mask not using EVL (#146672)
This fixes a buildbot failure with EVL tail folding after #144666:
https://lab.llvm.org/buildbot/#/builders/132/builds/1653
For a first-order recurrence to be correct with EVL tail folding we need
to convert splices to vp splices with the EVL operand.
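To illustrate why the EVL operands matter, here is a minimal C++ model of the two splices with an offset of -1, run on a recurrence shaped like the one in the test (each element reads the previous iteration's `i + 42`). It is a sketch, not LLVM code: `pickEVL`, `runLoop`, `scalarReference`, the `kPoison` sentinel and the starting `prevEVL == VF` are assumptions made for the example, and the EVL policy is hypothetical, chosen only so that an iteration inside the loop gets an EVL smaller than VF.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec = std::vector<int64_t>;
constexpr int64_t kPoison = -1; // stand-in for lanes at or beyond the EVL

// Model of llvm.vector.splice(prev, cur, -1): last lane of prev, then cur.
Vec spliceLast(const Vec &prev, const Vec &cur) {
  Vec r(cur.size());
  r[0] = prev.back();
  for (std::size_t i = 1; i < cur.size(); ++i)
    r[i] = cur[i - 1];
  return r;
}

// Model of vp.splice(prev, cur, -1, all-true mask, prevEVL, evl): the last
// *active* lane of prev, then the first evl - 1 lanes of cur.
Vec vpSpliceLast(const Vec &prev, const Vec &cur, std::size_t prevEVL,
                 std::size_t evl) {
  assert(prevEVL >= 1 && prevEVL <= prev.size() && evl <= cur.size());
  Vec r(cur.size(), kPoison);
  if (evl == 0)
    return r;
  r[0] = prev[prevEVL - 1];
  for (std::size_t i = 1; i < evl; ++i)
    r[i] = cur[i - 1];
  return r;
}

// Hypothetical EVL policy: split the last two iterations roughly evenly.
std::size_t pickEVL(std::size_t avl, std::size_t vf) {
  if (avl >= 2 * vf)
    return vf;
  if (avl > vf)
    return (avl + 1) / 2;
  return avl;
}

// Scalar loop: out[i] = rec; rec = i + 42;
Vec scalarReference(std::size_t n, int64_t init) {
  Vec out(n);
  int64_t rec = init;
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = rec;
    rec = static_cast<int64_t>(i) + 42;
  }
  return out;
}

// EVL tail-folded vector loop, using either the plain or the vp splice.
Vec runLoop(bool useVPSplice, std::size_t n, std::size_t vf, int64_t init) {
  Vec out(n, kPoison);
  Vec recur(vf, kPoison);
  recur[vf - 1] = init;     // initial recurrence value sits in the last lane
  std::size_t prevEVL = vf; // assumption: the first "previous EVL" is VF
  for (std::size_t i = 0; i < n;) {
    const std::size_t evl = pickEVL(n - i, vf);
    Vec cur(vf, kPoison);
    for (std::size_t lane = 0; lane < evl; ++lane)
      cur[lane] = static_cast<int64_t>(i + lane) + 42;
    const Vec spliced = useVPSplice ? vpSpliceLast(recur, cur, prevEVL, evl)
                                    : spliceLast(recur, cur);
    for (std::size_t lane = 0; lane < evl; ++lane)
      out[i + lane] = spliced[lane];
    recur = cur;
    prevEVL = evl;
    i += evl;
  }
  return out;
}
```

With `n = 6` and `vf = 4` this policy runs two iterations of EVL 3. The plain splice then reads lane 3 of the previous vector, which was never active, while the vp splice reads lane `prevEVL - 1 = 2` and matches the scalar loop.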
Originally we did this in createEVLRecipe, by walking the header mask's
users (and their users) and converting any splice found along the way.
However, after #144666 a FOR splice might not actually use the header
mask if it's based off e.g. an induction variable, and so we wouldn't
pick it up in createEVLRecipe.
This patch fixes that by converting FOR splices separately, in a loop
over all recipes in the plan, regardless of whether or not they use the
header mask.
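The difference between the two traversals can be sketched with a toy model. Everything below is hypothetical and only for illustration: `Recipe`, `Kind`, `usesHeaderMask` and both functions are made-up names, not VPlan's actual classes or the code in this patch.

```cpp
#include <vector>

// Toy stand-in for a VPlan recipe; not LLVM's real recipe classes.
enum class Kind { Load, Add, FirstOrderSplice, VPSplice };

struct Recipe {
  Kind kind;
  bool usesHeaderMask; // true if reachable from the header mask via its users
};

// Old shape: only recipes reachable from the header mask are considered, so a
// FOR splice fed purely by an induction variable is never visited.
int convertViaHeaderMaskUsers(std::vector<Recipe> &plan) {
  int converted = 0;
  for (Recipe &r : plan) {
    if (r.usesHeaderMask && r.kind == Kind::FirstOrderSplice) {
      r.kind = Kind::VPSplice;
      ++converted;
    }
  }
  return converted;
}

// New shape: every recipe in the plan is visited, and each FOR splice is
// converted whether or not it uses the header mask.
int convertAllSplices(std::vector<Recipe> &plan) {
  int converted = 0;
  for (Recipe &r : plan) {
    if (r.kind == Kind::FirstOrderSplice) {
      r.kind = Kind::VPSplice;
      ++converted;
    }
  }
  return converted;
}
```

For a plan holding a masked load and a FOR splice with `usesHeaderMask == false`, the first function converts nothing and the second converts the splice.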
I think there was some conflation in createEVLRecipe between what was an
optimisation and what was needed for correctness. Most of the transforms
in it just exist to optimize the mask away and we should still emit
correct code without them. So I've renamed it to make the separation
clearer.
; IF-EVL-NEXT: [[TMP7:%.*]] = zext i32 [[TMP11]] to i64
+; IF-EVL-NEXT: [[TMP8:%.*]] = mul i64 1, [[TMP7]]
+; IF-EVL-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[TMP8]], i64 0
+; IF-EVL-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <vscale x 2 x i64> [[BROADCAST_SPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
+; IF-EVL-NEXT: [[TMP20]] = add <vscale x 2 x i64> [[VEC_IND]], splat (i64 42)
+; IF-EVL-NEXT: [[TMP15:%.*]] = call <vscale x 2 x i64> @llvm.experimental.vp.splice.nxv2i64(<vscale x 2 x i64> [[VECTOR_RECUR]], <vscale x 2 x i64> [[TMP20]], i32 -1, <vscale x 2 x i1> splat (i1 true), i32 [[PREV_EVL]], i32 [[TMP11]])
; NO-VP-NEXT: [[TMP9:%.*]] = mul nuw i64 [[TMP4]], 2
+; NO-VP-NEXT: [[TMP6:%.*]] = call <vscale x 2 x i64> @llvm.stepvector.nxv2i64()
+; NO-VP-NEXT: [[TMP7:%.*]] = mul <vscale x 2 x i64> [[TMP6]], splat (i64 1)
+; NO-VP-NEXT: [[INDUCTION:%.*]] = add <vscale x 2 x i64> zeroinitializer, [[TMP7]]
+; NO-VP-NEXT: [[TMP10:%.*]] = mul i64 1, [[TMP9]]
+; NO-VP-NEXT: [[BROADCAST_SPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[TMP10]], i64 0
+; NO-VP-NEXT: [[BROADCAST_SPLAT:%.*]] = shufflevector <vscale x 2 x i64> [[BROADCAST_SPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
; NO-VP-NEXT: [[VEC_IND:%.*]] = phi <vscale x 2 x i64> [ [[INDUCTION]], %[[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; NO-VP-NEXT: [[VECTOR_RECUR:%.*]] = phi <vscale x 2 x i64> [ [[VECTOR_RECUR_INIT]], %[[VECTOR_PH]] ], [ [[TMP12:%.*]], %[[VECTOR_BODY]] ]
+; NO-VP-NEXT: [[TMP12]] = add <vscale x 2 x i64> [[VEC_IND]], splat (i64 42)
+; NO-VP-NEXT: [[TMP13:%.*]] = call <vscale x 2 x i64> @llvm.vector.splice.nxv2i64(<vscale x 2 x i64> [[VECTOR_RECUR]], <vscale x 2 x i64> [[TMP12]], i32 -1)