-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[RISCV] Optimize (and (icmp x, 0, eq), (icmp y, 0, eq)) utilizing zicond extension #147627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ond extension %1 = icmp x, 0, eq %2 = icmp y, 0, eq %3 = and %1, %2 Origionally lowered to: %1 = seqz x %2 = seqz y %3 = and %1, %2 With optimiztion: %1 = seqz x %3 = czero.eqz %1, y
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-backend-aarch64 Author: Ryan Buchner (bababuck) Changes
The main optimization I add in this MR is:
@topperc @mshockwave Looking for some advice on how to proceed with this optimization. The current implementation I have pushed is ugly, namely that it modifies/adds to the hooks in The The Full diff: https://github.com/llvm/llvm-project/pull/147627.diff 7 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index fee94cc167363..1cbddb0f470e6 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -2449,8 +2449,8 @@ class LLVM_ABI TargetLoweringBase {
/// select(N0|N1, X, Y) => select(N0, select(N1, X, Y, Y)) if it is likely
/// that it saves us from materializing N0 and N1 in an integer register.
/// Targets that are able to perform and/or on flags should return false here.
- virtual bool shouldNormalizeToSelectSequence(LLVMContext &Context,
- EVT VT) const {
+ virtual bool shouldNormalizeToSelectSequence(LLVMContext &Context, EVT VT,
+ SDNode *) const {
// If a target has multiple condition registers, then it likely has logical
// operations on those registers.
if (hasMultipleConditionRegisters())
@@ -2462,6 +2462,10 @@ class LLVM_ABI TargetLoweringBase {
Action != TypeSplitVector;
}
+ // Return true is targets has a conditional zero-ing instruction
+ // i.e. select cond, x, 0
+ virtual bool hasConditionalZero() const { return false; }
+
virtual bool isProfitableToCombineMinNumMaxNum(EVT VT) const { return true; }
/// Return true if a select of constants (select Cond, C1, C2) should be
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 9ffdda28f7899..691389b9d19f0 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -12050,7 +12050,9 @@ static SDValue foldBoolSelectToLogic(SDNode *N, const SDLoc &DL,
// select Cond, T, Cond --> and Cond, freeze(T)
// select Cond, T, 0 --> and Cond, freeze(T)
- if (Cond == F || isNullOrNullSplat(F, /* AllowUndefs */ true))
+ // select Cond, T, 0 is a conditional zero
+ if (Cond == F || (!TLI.hasConditionalZero() &&
+ isNullOrNullSplat(F, /* AllowUndefs */ true)))
return matcher.getNode(ISD::AND, DL, VT, Cond, DAG.getFreeze(T));
// select Cond, T, 1 --> or (not Cond), freeze(T)
@@ -12061,7 +12063,7 @@ static SDValue foldBoolSelectToLogic(SDNode *N, const SDLoc &DL,
}
// select Cond, 0, F --> and (not Cond), freeze(F)
- if (isNullOrNullSplat(T, /* AllowUndefs */ true)) {
+ if (!TLI.hasConditionalZero() && isNullOrNullSplat(T, /* AllowUndefs */ true)) {
SDValue NotCond =
matcher.getNode(ISD::XOR, DL, VT, Cond, DAG.getAllOnesConstant(DL, VT));
return matcher.getNode(ISD::AND, DL, VT, NotCond, DAG.getFreeze(F));
@@ -12214,7 +12216,7 @@ SDValue DAGCombiner::visitSELECT(SDNode *N) {
// and we always transform to the left side if we know that we can further
// optimize the combination of the conditions.
bool normalizeToSequence =
- TLI.shouldNormalizeToSelectSequence(*DAG.getContext(), VT);
+ TLI.shouldNormalizeToSelectSequence(*DAG.getContext(), VT, N);
// select (and Cond0, Cond1), X, Y
// -> select Cond0, (select Cond1, X, Y), Y
if (N0->getOpcode() == ISD::AND && N0->hasOneUse()) {
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 6b7e9357aab5a..d2763a3f74b89 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -28262,8 +28262,8 @@ bool AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters(
return all_equal(ValueVTs);
}
-bool AArch64TargetLowering::shouldNormalizeToSelectSequence(LLVMContext &,
- EVT) const {
+bool AArch64TargetLowering::shouldNormalizeToSelectSequence(LLVMContext &, EVT,
+ SDNode *) const {
return false;
}
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 89f90ee2b7707..504360ac8b7f5 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -836,7 +836,8 @@ class AArch64TargetLowering : public TargetLowering {
SmallVectorImpl<SDValue> &Results,
SelectionDAG &DAG) const;
- bool shouldNormalizeToSelectSequence(LLVMContext &, EVT) const override;
+ bool shouldNormalizeToSelectSequence(LLVMContext &, EVT,
+ SDNode *) const override;
void finalizeLowering(MachineFunction &MF) const override;
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 5126ab6c31c28..3f73a095d6009 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -2688,6 +2688,31 @@ bool RISCVTargetLowering::mergeStoresAfterLegalization(EVT VT) const {
(VT.isFixedLengthVector() && VT.getVectorElementType() == MVT::i1);
}
+// Disable normalizing for most cases
+// select(N0&N1, X, Y) => select(N0, select(N1, X, Y), Y) and
+// select(N0|N1, X, Y) => select(N0, Y, select(N1, X, Y))
+// If y == 0 and N0 == setcc(eqz || nez) -> czero (select(N1, X, 0), N0)
+bool RISCVTargetLowering::shouldNormalizeToSelectSequence(LLVMContext &, EVT VT,
+ SDNode *N) const {
+ if (Subtarget.hasStdExtZicond() || Subtarget.hasVendorXVentanaCondOps()) {
+ assert(
+ N->getOpcode() == ISD::SELECT &&
+ "shouldNormalizeTooSelectSequence() called with non-SELECT operation");
+ const SDValue &CondV = N->getOperand(0);
+ if (CondV.getOpcode() == ISD::SETCC && isNullConstant(N->getOperand(2))) {
+ ISD::CondCode CondCode = cast<CondCodeSDNode>(CondV.getOperand(2))->get();
+ if (CondCode == ISD::SETNE || CondCode == ISD::SETEQ) {
+ return true;
+ }
+ }
+ }
+ return false;
+}
+
+bool RISCVTargetLowering::hasConditionalZero() const {
+ return Subtarget.hasStdExtZicond() || Subtarget.hasVendorXVentanaCondOps();
+}
+
bool RISCVTargetLowering::isLegalElementTypeForRVV(EVT ScalarTy) const {
if (!ScalarTy.isSimple())
return false;
@@ -15731,6 +15756,35 @@ static SDValue performANDCombine(SDNode *N,
if (SDValue V = reverseZExtICmpCombine(N, DAG, Subtarget))
return V;
+ if (Subtarget.hasStdExtZicond() || Subtarget.hasVendorXVentanaCondOps()) {
+ auto IsCzeroCompatible = [](const SDValue &Op0,
+ const SDValue &Op1) -> bool {
+ if (Op0.getValueType() == MVT::i1 && Op1.getOpcode() == ISD::SETCC &&
+ isNullConstant(Op1.getOperand(1))) {
+ ISD::CondCode CondCode = cast<CondCodeSDNode>(Op1.getOperand(2))->get();
+ return CondCode == ISD::SETNE || CondCode == ISD::SETEQ;
+ }
+ return false;
+ };
+ // (and (i1) f, (setcc c, 0, ne)) -> (select c, f, 0) -> (czero.nez f, c)
+ // (and (i1) f, (setcc c, 0, eq)) -> (select c, 0, f) -> (czero.eqz f, c)
+ // (and (setcc c, 0, ne), (i1) g) -> (select c, g, 0) -> (czero.nez g, c)
+ // (and (setcc c, 0, eq), (i1) g) -> (select c, 0, g) -> (czero.eqz g, c)
+ if (IsCzeroCompatible(N->getOperand(0), N->getOperand(1)) ||
+ IsCzeroCompatible(N->getOperand(1), N->getOperand(0))) {
+ const bool CzeroOp1 =
+ IsCzeroCompatible(N->getOperand(0), N->getOperand(1));
+ const SDValue &I1Op = CzeroOp1 ? N->getOperand(0) : N->getOperand(1);
+ const SDValue &SetCCOp = CzeroOp1 ? N->getOperand(1) : N->getOperand(0);
+
+ ISD::CondCode CondCode =
+ cast<CondCodeSDNode>(SetCCOp.getOperand(2))->get();
+ SDLoc DL(N);
+ const SDValue &Condition = SetCCOp.getOperand(0);
+ return DAG.getNode(ISD::SELECT, DL, MVT::i1, SetCCOp, I1Op, DAG.getConstant(0, DL, MVT::i1));
+ }
+ }
+
if (SDValue V = combineBinOpToReduce(N, DAG, Subtarget))
return V;
if (SDValue V = combineBinOpOfExtractToReduceTree(N, DAG, Subtarget))
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index f67d7f155c9d0..11726fb732ac5 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -598,13 +598,10 @@ class RISCVTargetLowering : public TargetLowering {
/// this override can be removed.
bool mergeStoresAfterLegalization(EVT VT) const override;
- /// Disable normalizing
- /// select(N0&N1, X, Y) => select(N0, select(N1, X, Y), Y) and
- /// select(N0|N1, X, Y) => select(N0, select(N1, X, Y, Y))
- /// RISC-V doesn't have flags so it's better to perform the and/or in a GPR.
- bool shouldNormalizeToSelectSequence(LLVMContext &, EVT) const override {
- return false;
- }
+ bool shouldNormalizeToSelectSequence(LLVMContext &, EVT VT,
+ SDNode *N) const override;
+
+ bool hasConditionalZero() const override;
/// Disables storing and loading vectors by default when there are function
/// calls between the load and store, since these are more expensive than just
diff --git a/llvm/test/CodeGen/RISCV/zicond-opts.ll b/llvm/test/CodeGen/RISCV/zicond-opts.ll
index 35b06c4f4fb41..394fd8202d0dd 100644
--- a/llvm/test/CodeGen/RISCV/zicond-opts.ll
+++ b/llvm/test/CodeGen/RISCV/zicond-opts.ll
@@ -8,16 +8,14 @@ define i32 @icmp_and(i64 %x, i64 %y) {
; RV32ZICOND: # %bb.0:
; RV32ZICOND-NEXT: or a2, a2, a3
; RV32ZICOND-NEXT: or a0, a0, a1
-; RV32ZICOND-NEXT: snez a1, a2
; RV32ZICOND-NEXT: snez a0, a0
-; RV32ZICOND-NEXT: and a0, a0, a1
+; RV32ZICOND-NEXT: czero.eqz a0, a0, a2
; RV32ZICOND-NEXT: ret
;
; RV64ZICOND-LABEL: icmp_and:
; RV64ZICOND: # %bb.0:
-; RV64ZICOND-NEXT: snez a1, a1
; RV64ZICOND-NEXT: snez a0, a0
-; RV64ZICOND-NEXT: and a0, a0, a1
+; RV64ZICOND-NEXT: czero.eqz a0, a0, a1
; RV64ZICOND-NEXT: ret
%3 = icmp ne i64 %y, 0
%4 = icmp ne i64 %x, 0
@@ -32,21 +30,17 @@ define i32 @icmp_and_and(i64 %x, i64 %y, i64 %z) {
; RV32ZICOND: # %bb.0:
; RV32ZICOND-NEXT: or a2, a2, a3
; RV32ZICOND-NEXT: or a0, a0, a1
-; RV32ZICOND-NEXT: or a4, a4, a5
; RV32ZICOND-NEXT: snez a1, a2
-; RV32ZICOND-NEXT: snez a0, a0
-; RV32ZICOND-NEXT: and a0, a1, a0
-; RV32ZICOND-NEXT: snez a1, a4
-; RV32ZICOND-NEXT: and a0, a1, a0
+; RV32ZICOND-NEXT: czero.eqz a0, a1, a0
+; RV32ZICOND-NEXT: or a4, a4, a5
+; RV32ZICOND-NEXT: czero.eqz a0, a0, a4
; RV32ZICOND-NEXT: ret
;
; RV64ZICOND-LABEL: icmp_and_and:
; RV64ZICOND: # %bb.0:
; RV64ZICOND-NEXT: snez a1, a1
-; RV64ZICOND-NEXT: snez a0, a0
-; RV64ZICOND-NEXT: and a0, a1, a0
-; RV64ZICOND-NEXT: snez a1, a2
-; RV64ZICOND-NEXT: and a0, a1, a0
+; RV64ZICOND-NEXT: czero.eqz a0, a1, a0
+; RV64ZICOND-NEXT: czero.eqz a0, a0, a2
; RV64ZICOND-NEXT: ret
%4 = icmp ne i64 %y, 0
%5 = icmp ne i64 %x, 0
|
You can test this locally with the following command:git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- llvm/include/llvm/CodeGen/TargetLowering.h llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.h llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/lib/Target/RISCV/RISCVISelLowering.h View the diff from clang-format here.diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 691389b9d..cbbaf6841 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -12063,7 +12063,8 @@ static SDValue foldBoolSelectToLogic(SDNode *N, const SDLoc &DL,
}
// select Cond, 0, F --> and (not Cond), freeze(F)
- if (!TLI.hasConditionalZero() && isNullOrNullSplat(T, /* AllowUndefs */ true)) {
+ if (!TLI.hasConditionalZero() &&
+ isNullOrNullSplat(T, /* AllowUndefs */ true)) {
SDValue NotCond =
matcher.getNode(ISD::XOR, DL, VT, Cond, DAG.getAllOnesConstant(DL, VT));
return matcher.getNode(ISD::AND, DL, VT, NotCond, DAG.getFreeze(F));
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 3f73a095d..3b4fd9cfe 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -15781,7 +15781,8 @@ static SDValue performANDCombine(SDNode *N,
cast<CondCodeSDNode>(SetCCOp.getOperand(2))->get();
SDLoc DL(N);
const SDValue &Condition = SetCCOp.getOperand(0);
- return DAG.getNode(ISD::SELECT, DL, MVT::i1, SetCCOp, I1Op, DAG.getConstant(0, DL, MVT::i1));
+ return DAG.getNode(ISD::SELECT, DL, MVT::i1, SetCCOp, I1Op,
+ DAG.getConstant(0, DL, MVT::i1));
}
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces an optimization to fold an and
of two zero-comparisons into a single ZICOND czero.*
instruction, reducing instruction count on RISC-V with the ZICOND extension.
- Update tests to expect
czero.eqz
instead of separatesnez
+and
sequences - Extend
TargetLowering
with ahasConditionalZero()
hook and refineshouldNormalizeToSelectSequence()
to enable ZICOND patterns - Adjust
DAGCombiner
and target-lowering implementations (RISCV, AArch64) to honor the new hook and produce fusedSELECT
nodes for later lowering
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
llvm/test/CodeGen/RISCV/zicond-opts.ll | Updated expected assembly to use czero.* sequences |
llvm/lib/Target/RISCV/RISCVISelLowering.h | Changed shouldNormalizeToSelectSequence signature and added hasConditionalZero() |
llvm/lib/Target/RISCV/RISCVISelLowering.cpp | Implemented new normalization logic and hasConditionalZero() |
llvm/lib/Target/AArch64/AArch64ISelLowering.h | Updated shouldNormalizeToSelectSequence signature |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | Added dummy override to match new signature |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | Modified boolean-select folding to respect hasConditionalZero() |
llvm/include/llvm/CodeGen/TargetLowering.h | Added base hasConditionalZero() and updated shouldNormalizeToSelectSequence signature |
Comments suppressed due to low confidence (2)
llvm/include/llvm/CodeGen/TargetLowering.h:2453
- The base signature for shouldNormalizeToSelectSequence changed to include an SDNode* parameter. Please update all target-specific overrides (e.g., X86, PPC, MIPS) to match this three-argument signature so they continue to override rather than hide the base method.
SDNode *) const {
if (Subtarget.hasStdExtZicond() || Subtarget.hasVendorXVentanaCondOps()) { | ||
assert( | ||
N->getOpcode() == ISD::SELECT && | ||
"shouldNormalizeTooSelectSequence() called with non-SELECT operation"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in the assertion message: "shouldNormalizeTooSelectSequence" should be "shouldNormalizeToSelectSequence".
"shouldNormalizeTooSelectSequence() called with non-SELECT operation"); | |
"shouldNormalizeToSelectSequence() called with non-SELECT operation"); |
Copilot uses AI. Check for mistakes.
The main optimization I add in this MR is:
@topperc @mshockwave
Looking for some advice on how to proceed with this optimization. The current implementation I have pushed is ugly, namely that it modifies/adds to the hooks in
TargetLowering
. Ideally, we would perform this optimization later in the lowering (afterRISCV
DAG combine), probably during instruction selection. However, the optimization relies on knowing the types (the select operands need to bei1
), and as far as I'm aware, the nodes are converted to XLEN by the time instruction selection occurs (during legalization). I was wondering if anyone has suggestions to avoid modifying the target lowering hooks.The
shouldNormalizeToSelectSequence()
hook I modified so that we would disableselect(N0&N1, X, Y) => select(N0, select(N1, X, Y), Y)
for this case. I could change my logic to instead recognize the select(select()) sequence instead, but that seems like it would duplicate the code for normalizing the select sequence.The
TLI.hasConditionalZero()
I added to disableselect Cond, 0, F --> and (not Cond), freeze(F)
when Zicond since this optimization undoes my optimization ((and (i1) f, (setcc c, 0, ne)) -> (select c, f, 0)
) (and we hang going back and forth).