Commit 7cdbde7

[CostModel][X86] getShuffleCost - use processShuffleMasks for all shuffle kinds to legal types (#120599) (#121760)
Now that processShuffleMasks can correctly handle two-source shuffles, we can remove the shuffle kind limits entirely and correctly recognize the number of active subvectors per legalized shuffle. improveShuffleKindFromMask will determine the shuffle kind for each split subvector.
1 parent 1547382 commit 7cdbde7

17 files changed: 606 additions, 606 deletions

llvm/lib/Target/X86/X86TargetTransformInfo.cpp

Lines changed: 1 addition & 2 deletions
@@ -1698,8 +1698,7 @@ InstructionCost X86TTIImpl::getShuffleCost(
   // We are going to permute multiple sources and the result will be in multiple
   // destinations. Providing an accurate cost only for splits where the element
   // type remains the same.
-  if ((Kind == TTI::SK_PermuteSingleSrc || Kind == TTI::SK_PermuteTwoSrc) &&
-      LT.first != 1) {
+  if (LT.first != 1) {
     MVT LegalVT = LT.second;
     if (LegalVT.isVector() &&
         LegalVT.getVectorElementType().getSizeInBits() ==
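
The effect of dropping the kind check in the hunk above can be sketched in isolation. This is a minimal illustration, not LLVM code: the `ShuffleKind` enum is a local stand-in for a few `TTI::ShuffleKind` values, `splitFactor` is a hypothetical helper that plays the role `LT.first` appears to play here (the number of legal-width registers the vector type splits into), and both guard functions are made-up names.

```cpp
#include <cassert>

// Local stand-in for a few TTI::ShuffleKind values; not LLVM's actual enum.
enum class ShuffleKind { Select, InsertSubvector, PermuteSingleSrc, PermuteTwoSrc };

// Hypothetical helper: how many legal-width registers a vector occupies.
// Assumed to correspond to LT.first in the hunk above.
unsigned splitFactor(unsigned vectorBits, unsigned legalBits) {
  return (vectorBits + legalBits - 1) / legalBits;
}

// Guard before the change: only the generic permute kinds took the
// per-split costing path.
bool takesSplitPathBefore(ShuffleKind kind, unsigned parts) {
  return (kind == ShuffleKind::PermuteSingleSrc ||
          kind == ShuffleKind::PermuteTwoSrc) &&
         parts != 1;
}

// Guard after the change: any shuffle kind that needs splitting qualifies.
bool takesSplitPathAfter(ShuffleKind /*kind*/, unsigned parts) {
  return parts != 1;
}
```

Under this model a 256-bit select shuffle on a 128-bit target (`splitFactor(256, 128) == 2`) skipped the per-split path before the change and takes it afterwards, which is consistent with the select and insert_subvector test files being the ones whose expected costs changed.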

llvm/test/Analysis/CostModel/X86/alternate-shuffle-cost.ll

Lines changed: 6 additions & 14 deletions
@@ -294,13 +294,9 @@ define <4 x i64> @test_v4i64_2(<4 x i64> %a, <4 x i64> %b) {
 }
 
 define <4 x i64> @test_v4i64_3(<4 x i64> %a, <4 x i64> %b) {
-; SSE-LABEL: 'test_v4i64_3'
-; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %1 = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 4, i32 1, i32 2, i32 3>
-; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <4 x i64> %1
-;
-; AVX-LABEL: 'test_v4i64_3'
-; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 4, i32 1, i32 2, i32 3>
-; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <4 x i64> %1
+; CHECK-LABEL: 'test_v4i64_3'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 4, i32 1, i32 2, i32 3>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <4 x i64> %1
 ;
   %1 = shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32> <i32 4, i32 1, i32 2, i32 3>
   ret <4 x i64> %1
@@ -333,13 +329,9 @@ define <4 x double> @test_v4f64_2(<4 x double> %a, <4 x double> %b) {
 }
 
 define <4 x double> @test_v4f64_3(<4 x double> %a, <4 x double> %b) {
-; SSE-LABEL: 'test_v4f64_3'
-; SSE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %1 = shufflevector <4 x double> %a, <4 x double> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3>
-; SSE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <4 x double> %1
-;
-; AVX-LABEL: 'test_v4f64_3'
-; AVX-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = shufflevector <4 x double> %a, <4 x double> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3>
-; AVX-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <4 x double> %1
+; CHECK-LABEL: 'test_v4f64_3'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = shufflevector <4 x double> %a, <4 x double> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3>
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <4 x double> %1
 ;
   %1 = shufflevector <4 x double> %a, <4 x double> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3>
   ret <4 x double> %1
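
One plausible reading of why the SSE cost in the two tests above drops from 2 to 1 is that only one of the two 128-bit destination halves needs a real shuffle. The following is a rough sketch of that idea, not the logic of processShuffleMasks or improveShuffleKindFromMask: `PartKind`, `classifyParts` and `countActiveParts` are hypothetical names, undef lanes are ignored, and the mask length is assumed to be a multiple of the part size.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical classification of one legal-width destination register.
enum class PartKind { WholeCopy, SingleSrc, MultiSrc };

// Classify each destination part of a two-operand shuffle mask.
// Indices 0..N-1 select from the first operand and N..2N-1 from the second,
// as in shufflevector. Assumes mask.size() is a multiple of partElts and
// that the mask has no undef (-1) lanes.
std::vector<PartKind> classifyParts(const std::vector<int> &mask, int partElts) {
  std::vector<PartKind> kinds;
  const int n = static_cast<int>(mask.size());
  for (int base = 0; base < n; base += partElts) {
    std::set<int> srcRegs; // distinct legal-width source registers read
    bool inOrder = true;   // every lane stays in its original position
    for (int i = 0; i < partElts; ++i) {
      const int idx = mask[base + i];
      srcRegs.insert(idx / partElts);
      inOrder = inOrder && (idx % partElts == i);
    }
    if (srcRegs.size() == 1)
      kinds.push_back(inOrder ? PartKind::WholeCopy : PartKind::SingleSrc);
    else
      kinds.push_back(PartKind::MultiSrc);
  }
  return kinds;
}

// Count the parts that need an actual shuffle in this simplified model;
// a whole-register copy is treated as free.
unsigned countActiveParts(const std::vector<int> &mask, int partElts) {
  unsigned active = 0;
  for (PartKind kind : classifyParts(mask, partElts))
    if (kind != PartKind::WholeCopy)
      ++active;
  return active;
}
```

With two 64-bit elements per 128-bit register, the mask `<4, 1, 2, 3>` blends two sources in the low half and copies the high half of `%a` unchanged, and `<4, 5, 6, 3>` copies the low half of `%b` and blends in the high half, so each mask has one active part in this model. The real cost model assigns target-specific costs per subvector shuffle, so the count here only matches the expected cost of 1 under the simplifying assumption that each active part costs one instruction.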

llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-codesize.ll

Lines changed: 61 additions & 61 deletions
Large diffs are not rendered by default.

llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-latency.ll

Lines changed: 61 additions & 61 deletions
Large diffs are not rendered by default.

llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector-sizelatency.ll

Lines changed: 61 additions & 61 deletions
Large diffs are not rendered by default.

llvm/test/Analysis/CostModel/X86/shuffle-insert_subvector.ll

Lines changed: 61 additions & 61 deletions
Large diffs are not rendered by default.

llvm/test/Analysis/CostModel/X86/shuffle-select-codesize.ll

Lines changed: 69 additions & 69 deletions
Large diffs are not rendered by default.

llvm/test/Analysis/CostModel/X86/shuffle-select-latency.ll

Lines changed: 69 additions & 69 deletions
Large diffs are not rendered by default.

llvm/test/Analysis/CostModel/X86/shuffle-select-sizelatency.ll

Lines changed: 69 additions & 69 deletions
Large diffs are not rendered by default.

llvm/test/Analysis/CostModel/X86/shuffle-select.ll

Lines changed: 69 additions & 69 deletions
Large diffs are not rendered by default.
