[mlir][linalg] Extend FuseElementwiseOps pattern to work with named ops #144922
@@ -1014,3 +1014,24 @@ module {
// CHECK-DAG: %[[T3:.+]] = arith.addf %[[T2]], %[[B1]]
// CHECK: linalg.yield %[[T3]] : f32
// CHECK: return %[[GENERIC]]

// -----

func.func @map_ops(%in1: tensor<8xf32>, %in2: tensor<8xf32>) -> tensor<8xf32> {

[review thread]
Reviewer: Could you also add a test where the […]
Author: Sure. I initially didn't go that far with tests because, at the time, my reasoning was that the logic is unchanged and so should generalize. But then I ran across that one issue with the bb args, so I'm less convinced of that.
Author: done

  %fill = tensor.empty() : tensor<8xf32>
  %add = linalg.map {arith.addf} ins(%in1, %in2: tensor<8xf32>, tensor<8xf32>) outs(%fill: tensor<8xf32>)
  %mapped_65 = linalg.map { math.sqrt } ins(%add : tensor<8xf32>) outs(%fill : tensor<8xf32>)
  return %mapped_65 : tensor<8xf32>
}

// CHECK-LABEL: func @map_ops
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<8xf32>
// CHECK: %[[FUSED_OP:.+]] = linalg.generic
// CHECK-SAME: ins(%[[ARG0]], %[[ARG1]] : {{.*}}) outs(%[[EMPTY]] :
// CHECK-NEXT: ^bb0(%[[IN0:.*]]: f32, %[[IN1:.*]]: f32, %[[OUT:.*]]: f32):
// CHECK-NEXT: %[[ADD:.*]] = arith.addf %[[IN0]], %[[IN1]]
// CHECK-NEXT: %[[SQRT:.*]] = math.sqrt %[[ADD]]
// CHECK-NEXT: linalg.yield %[[SQRT]]
// CHECK-NOT: linalg.generic
@@ -59,3 +59,154 @@ func.func @handle_unused_operands(%arg0: tensor<8xf32>, %arg1: tensor<8xf32>) ->
// CHECK: %[[FUSED_OP:.+]] = linalg.generic
// CHECK-SAME: outs(%[[EMPTY]] :
// CHECK-NOT: linalg.generic

// -----

func.func @map_ops(%in1: tensor<8xf32>, %in2: tensor<8xf32>) -> tensor<8xf32> {
  %fill = tensor.empty() : tensor<8xf32>
  %add = linalg.map {arith.addf} ins(%in1, %in2: tensor<8xf32>, tensor<8xf32>) outs(%fill: tensor<8xf32>)
  %sqrt = linalg.map { math.sqrt } ins(%add : tensor<8xf32>) outs(%fill : tensor<8xf32>)
  return %sqrt : tensor<8xf32>
}

// CHECK-LABEL: func @map_ops
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<8xf32>
// CHECK: %[[FUSED_OP:.+]] = linalg.generic

[review thread]
Reviewer: Maybe not for this PR, but we have discussed keeping the named ops for as long as possible. Here, since they're both […]
Author: Yeah, I mentioned in a comment that I wanted to try to do that, but I don't know of a clean way to do it yet. I've done something like that before by just using […]
Author: Also, that wouldn't help with elementwise + elementwise -> map anyway.
Reviewer: Right, the idea is that […]
Author: One thing to note: […]
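To make the reviewer's suggestion concrete: if fusion preserved the named form here, the fused op might look like the sketch below (hypothetical output; the pattern in this PR emits linalg.generic):

  %fused = linalg.map ins(%in1, %in2 : tensor<8xf32>, tensor<8xf32>) outs(%fill : tensor<8xf32>)
    (%a : f32, %b : f32) {
      %add = arith.addf %a, %b : f32
      %sqrt = math.sqrt %add : f32
      linalg.yield %sqrt : f32
    }
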
// CHECK-SAME: ins(%[[ARG0]], %[[ARG1]] : {{.*}}) outs(%[[EMPTY]] :
// CHECK-NEXT: ^bb0(%[[IN0:.*]]: f32, %[[IN1:.*]]: f32, %[[OUT:.*]]: f32):
// CHECK-NEXT: %[[ADD:.*]] = arith.addf %[[IN0]], %[[IN1]]
// CHECK-NEXT: %[[SQRT:.*]] = math.sqrt %[[ADD]]
// CHECK-NEXT: linalg.yield %[[SQRT]]
// CHECK-NOT: linalg.map

// -----

func.func @map_ops_mixed_types(%arg0: tensor<8xf32>, %arg1: tensor<8xf32>) -> tensor<8xf32> {

[review thread]
Reviewer: Is it possible to fuse […]
Author: My thinking is that if it works for the generic form, it should work for the map form. But again, that block-arg oddity kinda ruins that. Nevertheless, I expect this to work, but will add a test.
Author: done

  %init = tensor.empty() : tensor<8xi1>
  %initf = tensor.empty() : tensor<8xf32>
  %0 = linalg.map {math.sqrt} ins(%arg0 : tensor<8xf32>) outs(%initf : tensor<8xf32>)
  %1 = linalg.map {math.exp} ins(%arg1 : tensor<8xf32>) outs(%initf : tensor<8xf32>)
  %2 = linalg.map ins(%0, %1 : tensor<8xf32>, tensor<8xf32>) outs(%init : tensor<8xi1>)
    (%in0 : f32, %in1 : f32) {
      %cmp = arith.cmpf olt, %in0, %in1 : f32
      linalg.yield %cmp : i1
    }
  %3 = linalg.map { arith.select } ins(%2, %0, %1 : tensor<8xi1>, tensor<8xf32>, tensor<8xf32>) outs(%initf : tensor<8xf32>)
  return %3 : tensor<8xf32>
}

// CHECK-LABEL: func @map_ops_mixed_types
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<8xf32>
// CHECK: %[[FUSED_OP:.+]] = linalg.generic
// CHECK-SAME: ins(%[[ARG0]], %[[ARG1]] : {{.*}}) outs(%[[EMPTY]] :
// CHECK-NEXT: ^bb0(%[[IN0:.*]]: f32, %[[IN1:.*]]: f32, %[[OUT:.*]]: f32):
// CHECK-NEXT: %[[EXP0:.*]] = math.exp %[[IN1]]
// CHECK-NEXT: %[[SQRT0:.*]] = math.sqrt %[[IN0]]
// CHECK-NEXT: %[[EXP1:.*]] = math.exp %[[IN1]]
// CHECK-NEXT: %[[SQRT1:.*]] = math.sqrt %[[IN0]]

[review thread]
Author: I think the duplicate computations are an old artifact; these do go away with […]
Reviewer: This is indeed odd. Looks like a bug in the fuser. Could be related to the […]
Author: I did make a generic version of this and ran the old version of the pass, and got the same results, confirming it's a pre-existing thing.
Author: Here's the generic version:
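(The snippet itself did not survive extraction; the following is a mechanical rewrite of the map test above into generic form — a sketch, not necessarily the author's exact IR.)

  #id = affine_map<(d0) -> (d0)>
  func.func @generic_ops_mixed_types(%arg0: tensor<8xf32>, %arg1: tensor<8xf32>) -> tensor<8xf32> {
    %init = tensor.empty() : tensor<8xi1>
    %initf = tensor.empty() : tensor<8xf32>
    // sqrt of the first input.
    %0 = linalg.generic {indexing_maps = [#id, #id], iterator_types = ["parallel"]}
        ins(%arg0 : tensor<8xf32>) outs(%initf : tensor<8xf32>) {
      ^bb0(%in: f32, %out: f32):
        %s = math.sqrt %in : f32
        linalg.yield %s : f32
    } -> tensor<8xf32>
    // exp of the second input.
    %1 = linalg.generic {indexing_maps = [#id, #id], iterator_types = ["parallel"]}
        ins(%arg1 : tensor<8xf32>) outs(%initf : tensor<8xf32>) {
      ^bb0(%in: f32, %out: f32):
        %e = math.exp %in : f32
        linalg.yield %e : f32
    } -> tensor<8xf32>
    // elementwise comparison producing an i1 tensor.
    %2 = linalg.generic {indexing_maps = [#id, #id, #id], iterator_types = ["parallel"]}
        ins(%0, %1 : tensor<8xf32>, tensor<8xf32>) outs(%init : tensor<8xi1>) {
      ^bb0(%in0: f32, %in1: f32, %out: i1):
        %cmp = arith.cmpf olt, %in0, %in1 : f32
        linalg.yield %cmp : i1
    } -> tensor<8xi1>
    // select between the two intermediate results.
    %3 = linalg.generic {indexing_maps = [#id, #id, #id, #id], iterator_types = ["parallel"]}
        ins(%2, %0, %1 : tensor<8xi1>, tensor<8xf32>, tensor<8xf32>) outs(%initf : tensor<8xf32>) {
      ^bb0(%c: i1, %a: f32, %b: f32, %out: f32):
        %sel = arith.select %c, %a, %b : f32
        linalg.yield %sel : f32
    } -> tensor<8xf32>
    return %3 : tensor<8xf32>
  }
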
// CHECK-NEXT: %[[CMP:.*]] = arith.cmpf olt, %[[SQRT1]], %[[EXP1]]
// CHECK-NEXT: %[[RES:.*]] = arith.select %[[CMP]], %[[SQRT0]], %[[EXP0]]
// CHECK-NEXT: linalg.yield %[[RES]]
// CHECK-NOT: linalg.map

// -----

#identity = affine_map<(d0, d1) -> (d0, d1)>
#bcast = affine_map<(d0, d1) -> (d0)>
func.func @elementwise_ops(%in1: tensor<8xf32>, %in2: tensor<8x10xf32>) -> tensor<8x10xf32> {
  %fill = tensor.empty() : tensor<8x10xf32>
  %add = linalg.elementwise
    kind=#linalg.elementwise_kind<add>
    indexing_maps = [#bcast, #identity, #identity]
    ins(%in1, %in2: tensor<8xf32>, tensor<8x10xf32>) outs(%fill: tensor<8x10xf32>) -> tensor<8x10xf32>
  %sqrt = linalg.elementwise
    kind=#linalg.elementwise_kind<sqrt>
    indexing_maps = [#identity, #identity]
    ins(%add : tensor<8x10xf32>) outs(%fill : tensor<8x10xf32>) -> tensor<8x10xf32>
  return %sqrt : tensor<8x10xf32>
}

// CHECK-DAG: #[[MAP0:.+]] = affine_map<(d0, d1) -> (d0, d1)>
// CHECK-DAG: #[[MAP1:.+]] = affine_map<(d0, d1) -> (d0)>
// CHECK-LABEL: func @elementwise_ops
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]: tensor<8x10xf32>
// CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<8x10xf32>
// CHECK: %[[FUSED_OP:.+]] = linalg.generic
// CHECK-SAME: indexing_maps = [#[[MAP1]], #[[MAP0]], #[[MAP0]]]
// CHECK-SAME: ins(%[[ARG0]], %[[ARG1]] : {{.*}}) outs(%[[EMPTY]] :
// CHECK-NEXT: ^bb0(%[[IN0:.*]]: f32, %[[IN1:.*]]: f32, %[[OUT:.*]]: f32):
// CHECK-NEXT: %[[ADD:.*]] = arith.addf %[[IN0]], %[[IN1]]
// CHECK-NEXT: %[[SQRT:.*]] = math.sqrt %[[ADD]]
// CHECK-NEXT: linalg.yield %[[SQRT]]
// CHECK-NOT: linalg.elementwise

// -----

func.func @map_multi_ops(%arg0: tensor<8xf32>, %arg1: tensor<8xf32>, %arg2: tensor<8xf32>) -> tensor<8xf32> {
  %fill = tensor.empty() : tensor<8xf32>
  %add_exp = linalg.map ins(%arg0, %arg1: tensor<8xf32>, tensor<8xf32>) outs(%fill: tensor<8xf32>)
    (%in0 : f32, %in1 : f32) {
      %add = arith.addf %in0, %in1 : f32
      %exp = math.exp %add : f32
      linalg.yield %exp : f32
    }
  %sqrt_mul = linalg.map ins(%add_exp, %arg2 : tensor<8xf32>, tensor<8xf32>) outs(%fill : tensor<8xf32>)
    (%in0 : f32, %in1 : f32) {
      %sqrt = math.sqrt %in0 : f32
      %mul = arith.mulf %sqrt, %in1 : f32
      linalg.yield %mul : f32
    }
  return %sqrt_mul : tensor<8xf32>
}

// CHECK-LABEL: func @map_multi_ops
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK-SAME: %[[ARG2:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<8xf32>
// CHECK: %[[FUSED_OP:.+]] = linalg.generic
// CHECK-SAME: ins(%[[ARG0]], %[[ARG1]], %[[ARG2]] : {{.*}}) outs(%[[EMPTY]] :
// CHECK-NEXT: ^bb0(%[[IN0:.*]]: f32, %[[IN1:.*]]: f32, %[[IN2:.*]]: f32, %[[OUT:.*]]: f32):
// CHECK-NEXT: %[[ADD:.*]] = arith.addf %[[IN0]], %[[IN1]]
// CHECK-NEXT: %[[EXP:.*]] = math.exp %[[ADD]]
// CHECK-NEXT: %[[SQRT:.*]] = math.sqrt %[[EXP]]
// CHECK-NEXT: %[[MUL:.*]] = arith.mulf %[[SQRT]], %[[IN2]]
// CHECK-NEXT: linalg.yield %[[MUL]]
// CHECK-NOT: linalg.map

// -----

#identity = affine_map<(d0, d1) -> (d0, d1)>
#bcast = affine_map<(d0, d1) -> (d0)>
func.func @map_generic_ops(%arg0: tensor<8xf32>, %arg1: tensor<8x10xf32>) -> tensor<8x10xf32> {
  %fill = tensor.empty() : tensor<8x10xf32>
  %add = linalg.generic
      {indexing_maps = [#bcast, #identity, #identity], iterator_types = ["parallel", "parallel"]}
      ins(%arg0, %arg1: tensor<8xf32>, tensor<8x10xf32>) outs(%fill: tensor<8x10xf32>) {
    ^bb0(%in0: f32, %in1: f32, %out: f32):
      %add = arith.addf %in0, %in1 : f32
      linalg.yield %add : f32
  } -> tensor<8x10xf32>
  %sqrt = linalg.map { math.sqrt } ins(%add : tensor<8x10xf32>) outs(%fill : tensor<8x10xf32>)
  return %sqrt : tensor<8x10xf32>
}

// CHECK-DAG: #[[MAP0:.+]] = affine_map<(d0, d1) -> (d0, d1)>
// CHECK-DAG: #[[MAP1:.+]] = affine_map<(d0, d1) -> (d0)>
// CHECK-LABEL: func @map_generic_ops
// CHECK-SAME: %[[ARG0:[a-zA-Z0-9]+]]: tensor<8xf32>
// CHECK-SAME: %[[ARG1:[a-zA-Z0-9]+]]: tensor<8x10xf32>
// CHECK: %[[EMPTY:.+]] = tensor.empty() : tensor<8x10xf32>
// CHECK: %[[FUSED_OP:.+]] = linalg.generic
// CHECK-SAME: indexing_maps = [#[[MAP1]], #[[MAP0]], #[[MAP0]]]
// CHECK-SAME: ins(%[[ARG0]], %[[ARG1]] : {{.*}}) outs(%[[EMPTY]] :
// CHECK-NEXT: ^bb0(%[[IN0:.*]]: f32, %[[IN1:.*]]: f32, %[[OUT:.*]]: f32):
// CHECK-NEXT: %[[ADD:.*]] = arith.addf %[[IN0]], %[[IN1]]
// CHECK-NEXT: %[[SQRT:.*]] = math.sqrt %[[ADD]]
// CHECK-NEXT: linalg.yield %[[SQRT]]
// CHECK-NOT: linalg.map

[review thread]
Reviewer: This looks like a missed opportunity. IIRC, neither `contract` nor `elementwise` has that problem. Perhaps we can update `map` to behave like the new ones first? @javedabsar1 @shahidact
Reviewer: I agree that it needs to be updated, or even better, deprecated in favour of `linalg.elementwise`, as `linalg.map` is semantically the same as `linalg.elementwise`; in fact, `linalg.elementwise` seems more general, IIRC.
Author: @rengolin and I had a discussion of that here. It seems we want to keep `map`, and it does support multi-op regions whereas `elementwise` does not, so `elementwise` isn't quite more general; they each have something the other doesn't. But yes, I agree this is an odd feature of `map`, so if we do keep it, it would be nice to canonicalize this kind of thing.
, so if we do keep it would be nice to canonicalize this kind of thingThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so i actually found another bug and i think it's because of what i'm doing here. i'm modifying the blocks for producers and consumers. as long as they don't stick around, this works fine. but if the producer has more than one user so that it has to stick around, then this logic here converts it to an invalid op. i'm in the process of confirming this. In any event, it's probably not a good idea to modify the blocks in place like this
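(A minimal sketch of the multi-use shape described above; the test is hypothetical, not from the PR. %add has two consumers, so the producer must survive fusion, and mutating its block in place would leave an invalid op behind:)

  func.func @multi_use(%a: tensor<8xf32>, %b: tensor<8xf32>) -> (tensor<8xf32>, tensor<8xf32>) {
    %fill = tensor.empty() : tensor<8xf32>
    // Producer with two users: it must remain after either fusion fires.
    %add = linalg.map {arith.addf} ins(%a, %b : tensor<8xf32>, tensor<8xf32>) outs(%fill : tensor<8xf32>)
    %sqrt = linalg.map {math.sqrt} ins(%add : tensor<8xf32>) outs(%fill : tensor<8xf32>)
    %exp = linalg.map {math.exp} ins(%add : tensor<8xf32>) outs(%fill : tensor<8xf32>)
    return %sqrt, %exp : tensor<8xf32>, tensor<8xf32>
  }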
Author: I'm fine with waiting for that change so we don't have to figure out special logic here.
Author: I fixed the bug here; the problem was what I expected. I made a note that this is a hack and should be changed before merging, so I'll either come up with something better or wait for the `map` op change (if we decide to go with that). I can't actually reproduce the bug with builtin passes; I only discovered it when I applied my own control function for `populateElementwiseOpsFusionPatterns`.
Reviewer: If there's a bug present in the current fuser, it'd be worth adding such a test case to `TestLinalgElementwiseFusion`. You could extend it with another option flag that adds the necessary custom control logic.
. You could extend it with another option flag that adds necessary custom control logic.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure. however, since the bug was an artifact of my own changes that I noted are hacky, and those changes will go away before this merges anyway, it would be irrelevant by that point.
Author: @rengolin, do you know if anyone will make the change to `map`, or should I do it?