Skip to content

[RISCV] Optimize (and (icmp x, 0, eq), (icmp y, 0, eq)) utilizing zicond extension #147627

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bababuck
Copy link
Contributor

@bababuck bababuck commented Jul 9, 2025

%1 = icmp x, 0, eq
%2 = icmp y, 0, eq
%3 = and %1, %2

Origionally lowered to:
%1 = seqz x
%2 = seqz y
%3 = and %1, %2

With optimiztion:
%1 = seqz x
%3 = czero.eqz %1, y

The main optimization I add in this MR is:

 (and (i1) f, (setcc c, 0, ne)) -> (select c, f, 0)

@topperc @mshockwave

Looking for some advice on how to proceed with this optimization. The current implementation I have pushed is ugly, namely that it modifies/adds to the hooks in TargetLowering. Ideally, we would perform this optimization later in the lowering (after RISCV DAG combine), probably during instruction selection. However, the optimization relies on knowing the types (the select operands need to be i1), and as far as I'm aware, the nodes are converted to XLEN by the time instruction selection occurs (during legalization). I was wondering if anyone has suggestions to avoid modifying the target lowering hooks.

The shouldNormalizeToSelectSequence() hook I modified so that we would disable select(N0&N1, X, Y) => select(N0, select(N1, X, Y), Y) for this case. I could change my logic to instead recognize the select(select()) sequence instead, but that seems like it would duplicate the code for normalizing the select sequence.

The TLI.hasConditionalZero() I added to disable select Cond, 0, F --> and (not Cond), freeze(F) when Zicond since this optimization undoes my optimization ((and (i1) f, (setcc c, 0, ne)) -> (select c, f, 0)) (and we hang going back and forth).

…ond extension

%1 = icmp x, 0, eq
%2 = icmp y, 0, eq
%3 = and %1, %2

Origionally lowered to:
%1 = seqz x
%2 = seqz y
%3 = and %1, %2

With optimiztion:
%1 = seqz x
%3 = czero.eqz %1, y
@llvmbot
Copy link
Member

llvmbot commented Jul 9, 2025

@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-aarch64

Author: Ryan Buchner (bababuck)

Changes
%1 = icmp x, 0, eq
%2 = icmp y, 0, eq
%3 = and %1, %2

Origionally lowered to:
%1 = seqz x
%2 = seqz y
%3 = and %1, %2

With optimiztion:
%1 = seqz x
%3 = czero.eqz %1, y

The main optimization I add in this MR is:

 (and (i1) f, (setcc c, 0, ne)) -> (select c, f, 0)

@topperc @mshockwave

Looking for some advice on how to proceed with this optimization. The current implementation I have pushed is ugly, namely that it modifies/adds to the hooks in TargetLowering. Ideally, we would perform this optimization later in the lowering (after RISCV DAG combine), probably during instruction selection. However, the optimization relies on knowing the types (the select operands need to be i1), and as far as I'm aware, the nodes are converted to XLEN by the time instruction selection occurs (during legalization). I was wondering if anyone has suggestions to avoid modifying the target lowering hooks.

The shouldNormalizeToSelectSequence() hook I modified so that we would disable select(N0&N1, X, Y) => select(N0, select(N1, X, Y), Y) for this case. I could change my logic to instead recognize the select(select()) sequence instead, but that seems like it would duplicate the code for normalizing the select sequence.

The TLI.hasConditionalZero() I added to disable select Cond, 0, F --> and (not Cond), freeze(F) when Zicond since this optimization undoes my optimization ((and (i1) f, (setcc c, 0, ne)) -> (select c, f, 0)) (and we hang going back and forth).


Full diff: https://github.com/llvm/llvm-project/pull/147627.diff

7 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+6-2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+5-3)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+2-2)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (+2-1)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+54)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.h (+4-7)
  • (modified) llvm/test/CodeGen/RISCV/zicond-opts.ll (+7-13)
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index fee94cc167363..1cbddb0f470e6 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -2449,8 +2449,8 @@ class LLVM_ABI TargetLoweringBase {
   /// select(N0|N1, X, Y) => select(N0, select(N1, X, Y, Y)) if it is likely
   /// that it saves us from materializing N0 and N1 in an integer register.
   /// Targets that are able to perform and/or on flags should return false here.
-  virtual bool shouldNormalizeToSelectSequence(LLVMContext &Context,
-                                               EVT VT) const {
+  virtual bool shouldNormalizeToSelectSequence(LLVMContext &Context, EVT VT,
+                                               SDNode *) const {
     // If a target has multiple condition registers, then it likely has logical
     // operations on those registers.
     if (hasMultipleConditionRegisters())
@@ -2462,6 +2462,10 @@ class LLVM_ABI TargetLoweringBase {
       Action != TypeSplitVector;
   }
 
+  // Return true is targets has a conditional zero-ing instruction
+  // i.e. select cond, x, 0
+  virtual bool hasConditionalZero() const { return false; }
+
   virtual bool isProfitableToCombineMinNumMaxNum(EVT VT) const { return true; }
 
   /// Return true if a select of constants (select Cond, C1, C2) should be
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 9ffdda28f7899..691389b9d19f0 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -12050,7 +12050,9 @@ static SDValue foldBoolSelectToLogic(SDNode *N, const SDLoc &DL,
 
   // select Cond, T, Cond --> and Cond, freeze(T)
   // select Cond, T, 0    --> and Cond, freeze(T)
-  if (Cond == F || isNullOrNullSplat(F, /* AllowUndefs */ true))
+  // select Cond, T, 0 is a conditional zero
+  if (Cond == F || (!TLI.hasConditionalZero() &&
+                    isNullOrNullSplat(F, /* AllowUndefs */ true)))
     return matcher.getNode(ISD::AND, DL, VT, Cond, DAG.getFreeze(T));
 
   // select Cond, T, 1 --> or (not Cond), freeze(T)
@@ -12061,7 +12063,7 @@ static SDValue foldBoolSelectToLogic(SDNode *N, const SDLoc &DL,
   }
 
   // select Cond, 0, F --> and (not Cond), freeze(F)
-  if (isNullOrNullSplat(T, /* AllowUndefs */ true)) {
+  if (!TLI.hasConditionalZero() && isNullOrNullSplat(T, /* AllowUndefs */ true)) {
     SDValue NotCond =
         matcher.getNode(ISD::XOR, DL, VT, Cond, DAG.getAllOnesConstant(DL, VT));
     return matcher.getNode(ISD::AND, DL, VT, NotCond, DAG.getFreeze(F));
@@ -12214,7 +12216,7 @@ SDValue DAGCombiner::visitSELECT(SDNode *N) {
     // and we always transform to the left side if we know that we can further
     // optimize the combination of the conditions.
     bool normalizeToSequence =
-        TLI.shouldNormalizeToSelectSequence(*DAG.getContext(), VT);
+        TLI.shouldNormalizeToSelectSequence(*DAG.getContext(), VT, N);
     // select (and Cond0, Cond1), X, Y
     //   -> select Cond0, (select Cond1, X, Y), Y
     if (N0->getOpcode() == ISD::AND && N0->hasOneUse()) {
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 6b7e9357aab5a..d2763a3f74b89 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -28262,8 +28262,8 @@ bool AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters(
   return all_equal(ValueVTs);
 }
 
-bool AArch64TargetLowering::shouldNormalizeToSelectSequence(LLVMContext &,
-                                                            EVT) const {
+bool AArch64TargetLowering::shouldNormalizeToSelectSequence(LLVMContext &, EVT,
+                                                            SDNode *) const {
   return false;
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 89f90ee2b7707..504360ac8b7f5 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -836,7 +836,8 @@ class AArch64TargetLowering : public TargetLowering {
                                        SmallVectorImpl<SDValue> &Results,
                                        SelectionDAG &DAG) const;
 
-  bool shouldNormalizeToSelectSequence(LLVMContext &, EVT) const override;
+  bool shouldNormalizeToSelectSequence(LLVMContext &, EVT,
+                                       SDNode *) const override;
 
   void finalizeLowering(MachineFunction &MF) const override;
 
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 5126ab6c31c28..3f73a095d6009 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -2688,6 +2688,31 @@ bool RISCVTargetLowering::mergeStoresAfterLegalization(EVT VT) const {
          (VT.isFixedLengthVector() && VT.getVectorElementType() == MVT::i1);
 }
 
+// Disable normalizing for most cases
+// select(N0&N1, X, Y) => select(N0, select(N1, X, Y), Y) and
+// select(N0|N1, X, Y) => select(N0, Y, select(N1, X, Y))
+// If y == 0 and N0 == setcc(eqz || nez) -> czero (select(N1, X, 0), N0)
+bool RISCVTargetLowering::shouldNormalizeToSelectSequence(LLVMContext &, EVT VT,
+                                                          SDNode *N) const {
+  if (Subtarget.hasStdExtZicond() || Subtarget.hasVendorXVentanaCondOps()) {
+    assert(
+        N->getOpcode() == ISD::SELECT &&
+        "shouldNormalizeTooSelectSequence() called with non-SELECT operation");
+    const SDValue &CondV = N->getOperand(0);
+    if (CondV.getOpcode() == ISD::SETCC && isNullConstant(N->getOperand(2))) {
+      ISD::CondCode CondCode = cast<CondCodeSDNode>(CondV.getOperand(2))->get();
+      if (CondCode == ISD::SETNE || CondCode == ISD::SETEQ) {
+        return true;
+      }
+    }
+  }
+  return false;
+}
+
+bool RISCVTargetLowering::hasConditionalZero() const {
+  return Subtarget.hasStdExtZicond() || Subtarget.hasVendorXVentanaCondOps();
+}
+
 bool RISCVTargetLowering::isLegalElementTypeForRVV(EVT ScalarTy) const {
   if (!ScalarTy.isSimple())
     return false;
@@ -15731,6 +15756,35 @@ static SDValue performANDCombine(SDNode *N,
   if (SDValue V = reverseZExtICmpCombine(N, DAG, Subtarget))
     return V;
 
+  if (Subtarget.hasStdExtZicond() || Subtarget.hasVendorXVentanaCondOps()) {
+    auto IsCzeroCompatible = [](const SDValue &Op0,
+                                const SDValue &Op1) -> bool {
+      if (Op0.getValueType() == MVT::i1 && Op1.getOpcode() == ISD::SETCC &&
+          isNullConstant(Op1.getOperand(1))) {
+        ISD::CondCode CondCode = cast<CondCodeSDNode>(Op1.getOperand(2))->get();
+        return CondCode == ISD::SETNE || CondCode == ISD::SETEQ;
+      }
+      return false;
+    };
+    // (and (i1) f, (setcc c, 0, ne)) -> (select c, f, 0) -> (czero.nez f, c)
+    // (and (i1) f, (setcc c, 0, eq)) -> (select c, 0, f) -> (czero.eqz f, c)
+    // (and (setcc c, 0, ne), (i1) g) -> (select c, g, 0) -> (czero.nez g, c)
+    // (and (setcc c, 0, eq), (i1) g) -> (select c, 0, g) -> (czero.eqz g, c)
+    if (IsCzeroCompatible(N->getOperand(0), N->getOperand(1)) ||
+        IsCzeroCompatible(N->getOperand(1), N->getOperand(0))) {
+      const bool CzeroOp1 =
+          IsCzeroCompatible(N->getOperand(0), N->getOperand(1));
+      const SDValue &I1Op = CzeroOp1 ? N->getOperand(0) : N->getOperand(1);
+      const SDValue &SetCCOp = CzeroOp1 ? N->getOperand(1) : N->getOperand(0);
+
+      ISD::CondCode CondCode =
+          cast<CondCodeSDNode>(SetCCOp.getOperand(2))->get();
+      SDLoc DL(N);
+      const SDValue &Condition = SetCCOp.getOperand(0);
+      return DAG.getNode(ISD::SELECT, DL, MVT::i1, SetCCOp, I1Op, DAG.getConstant(0, DL, MVT::i1));
+    }
+  }
+
   if (SDValue V = combineBinOpToReduce(N, DAG, Subtarget))
     return V;
   if (SDValue V = combineBinOpOfExtractToReduceTree(N, DAG, Subtarget))
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index f67d7f155c9d0..11726fb732ac5 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -598,13 +598,10 @@ class RISCVTargetLowering : public TargetLowering {
   /// this override can be removed.
   bool mergeStoresAfterLegalization(EVT VT) const override;
 
-  /// Disable normalizing
-  /// select(N0&N1, X, Y) => select(N0, select(N1, X, Y), Y) and
-  /// select(N0|N1, X, Y) => select(N0, select(N1, X, Y, Y))
-  /// RISC-V doesn't have flags so it's better to perform the and/or in a GPR.
-  bool shouldNormalizeToSelectSequence(LLVMContext &, EVT) const override {
-    return false;
-  }
+  bool shouldNormalizeToSelectSequence(LLVMContext &, EVT VT,
+                                       SDNode *N) const override;
+
+  bool hasConditionalZero() const override;
 
   /// Disables storing and loading vectors by default when there are function
   /// calls between the load and store, since these are more expensive than just
diff --git a/llvm/test/CodeGen/RISCV/zicond-opts.ll b/llvm/test/CodeGen/RISCV/zicond-opts.ll
index 35b06c4f4fb41..394fd8202d0dd 100644
--- a/llvm/test/CodeGen/RISCV/zicond-opts.ll
+++ b/llvm/test/CodeGen/RISCV/zicond-opts.ll
@@ -8,16 +8,14 @@ define i32 @icmp_and(i64 %x, i64 %y) {
 ; RV32ZICOND:       # %bb.0:
 ; RV32ZICOND-NEXT:    or a2, a2, a3
 ; RV32ZICOND-NEXT:    or a0, a0, a1
-; RV32ZICOND-NEXT:    snez a1, a2
 ; RV32ZICOND-NEXT:    snez a0, a0
-; RV32ZICOND-NEXT:    and a0, a0, a1
+; RV32ZICOND-NEXT:    czero.eqz a0, a0, a2
 ; RV32ZICOND-NEXT:    ret
 ;
 ; RV64ZICOND-LABEL: icmp_and:
 ; RV64ZICOND:       # %bb.0:
-; RV64ZICOND-NEXT:    snez a1, a1
 ; RV64ZICOND-NEXT:    snez a0, a0
-; RV64ZICOND-NEXT:    and a0, a0, a1
+; RV64ZICOND-NEXT:    czero.eqz a0, a0, a1
 ; RV64ZICOND-NEXT:    ret
   %3 = icmp ne i64 %y, 0
   %4 = icmp ne i64 %x, 0
@@ -32,21 +30,17 @@ define i32 @icmp_and_and(i64 %x, i64 %y, i64 %z) {
 ; RV32ZICOND:       # %bb.0:
 ; RV32ZICOND-NEXT:    or a2, a2, a3
 ; RV32ZICOND-NEXT:    or a0, a0, a1
-; RV32ZICOND-NEXT:    or a4, a4, a5
 ; RV32ZICOND-NEXT:    snez a1, a2
-; RV32ZICOND-NEXT:    snez a0, a0
-; RV32ZICOND-NEXT:    and a0, a1, a0
-; RV32ZICOND-NEXT:    snez a1, a4
-; RV32ZICOND-NEXT:    and a0, a1, a0
+; RV32ZICOND-NEXT:    czero.eqz a0, a1, a0
+; RV32ZICOND-NEXT:    or a4, a4, a5
+; RV32ZICOND-NEXT:    czero.eqz a0, a0, a4
 ; RV32ZICOND-NEXT:    ret
 ;
 ; RV64ZICOND-LABEL: icmp_and_and:
 ; RV64ZICOND:       # %bb.0:
 ; RV64ZICOND-NEXT:    snez a1, a1
-; RV64ZICOND-NEXT:    snez a0, a0
-; RV64ZICOND-NEXT:    and a0, a1, a0
-; RV64ZICOND-NEXT:    snez a1, a2
-; RV64ZICOND-NEXT:    and a0, a1, a0
+; RV64ZICOND-NEXT:    czero.eqz a0, a1, a0
+; RV64ZICOND-NEXT:    czero.eqz a0, a0, a2
 ; RV64ZICOND-NEXT:    ret
   %4 = icmp ne i64 %y, 0
   %5 = icmp ne i64 %x, 0

Copy link

github-actions bot commented Jul 9, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- llvm/include/llvm/CodeGen/TargetLowering.h llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.h llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/lib/Target/RISCV/RISCVISelLowering.h
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 691389b9d..cbbaf6841 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -12063,7 +12063,8 @@ static SDValue foldBoolSelectToLogic(SDNode *N, const SDLoc &DL,
   }
 
   // select Cond, 0, F --> and (not Cond), freeze(F)
-  if (!TLI.hasConditionalZero() && isNullOrNullSplat(T, /* AllowUndefs */ true)) {
+  if (!TLI.hasConditionalZero() &&
+      isNullOrNullSplat(T, /* AllowUndefs */ true)) {
     SDValue NotCond =
         matcher.getNode(ISD::XOR, DL, VT, Cond, DAG.getAllOnesConstant(DL, VT));
     return matcher.getNode(ISD::AND, DL, VT, NotCond, DAG.getFreeze(F));
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 3f73a095d..3b4fd9cfe 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -15781,7 +15781,8 @@ static SDValue performANDCombine(SDNode *N,
           cast<CondCodeSDNode>(SetCCOp.getOperand(2))->get();
       SDLoc DL(N);
       const SDValue &Condition = SetCCOp.getOperand(0);
-      return DAG.getNode(ISD::SELECT, DL, MVT::i1, SetCCOp, I1Op, DAG.getConstant(0, DL, MVT::i1));
+      return DAG.getNode(ISD::SELECT, DL, MVT::i1, SetCCOp, I1Op,
+                         DAG.getConstant(0, DL, MVT::i1));
     }
   }
 

@kito-cheng kito-cheng requested a review from Copilot July 9, 2025 02:01
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces an optimization to fold an and of two zero-comparisons into a single ZICOND czero.* instruction, reducing instruction count on RISC-V with the ZICOND extension.

  • Update tests to expect czero.eqz instead of separate snez+and sequences
  • Extend TargetLowering with a hasConditionalZero() hook and refine shouldNormalizeToSelectSequence() to enable ZICOND patterns
  • Adjust DAGCombiner and target-lowering implementations (RISCV, AArch64) to honor the new hook and produce fused SELECT nodes for later lowering

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
llvm/test/CodeGen/RISCV/zicond-opts.ll Updated expected assembly to use czero.* sequences
llvm/lib/Target/RISCV/RISCVISelLowering.h Changed shouldNormalizeToSelectSequence signature and added hasConditionalZero()
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Implemented new normalization logic and hasConditionalZero()
llvm/lib/Target/AArch64/AArch64ISelLowering.h Updated shouldNormalizeToSelectSequence signature
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp Added dummy override to match new signature
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Modified boolean-select folding to respect hasConditionalZero()
llvm/include/llvm/CodeGen/TargetLowering.h Added base hasConditionalZero() and updated shouldNormalizeToSelectSequence signature
Comments suppressed due to low confidence (2)

llvm/include/llvm/CodeGen/TargetLowering.h:2453

  • The base signature for shouldNormalizeToSelectSequence changed to include an SDNode* parameter. Please update all target-specific overrides (e.g., X86, PPC, MIPS) to match this three-argument signature so they continue to override rather than hide the base method.
                                               SDNode *) const {

if (Subtarget.hasStdExtZicond() || Subtarget.hasVendorXVentanaCondOps()) {
assert(
N->getOpcode() == ISD::SELECT &&
"shouldNormalizeTooSelectSequence() called with non-SELECT operation");
Copy link
Preview

Copilot AI Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in the assertion message: "shouldNormalizeTooSelectSequence" should be "shouldNormalizeToSelectSequence".

Suggested change
"shouldNormalizeTooSelectSequence() called with non-SELECT operation");
"shouldNormalizeToSelectSequence() called with non-SELECT operation");

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants