[mlir][Aarch64] Improve i8mm instruction sequence for `vector.contract`

The i8mm lowering for some `vector.contract` ops is currently functionally correct, but performance-wise there is room for improvement. Looking at the generated asm for an mmt4d with 2x2x8 innermost tile sizes, we get:

```
    1470: 6e180483      mov     v3.d[1], v4.d[0]
    1474: 4e006204      tbl     v4.16b, { v16.16b, v17.16b, v18.16b, v19.16b }, v0.16b
    1478: 4e84a462      smmla   v2.4s, v3.16b, v4.16b
    147c: 6e024041      ext     v1.16b, v2.16b, v2.16b, #0x8
```

What catches my attention is the `mov` instruction (especially the `d[1]`/`d[0]` lane indexing), as well as the `tbl` and `ext` instructions. This may not seem like a big deal, but the problem gets much worse with larger tile sizes, where we observed long sequences of `mov` and `ext` instructions all over the generated code.

We should investigate what is going on and try to fix the problem. My suspicion is that this [zero initialization and insertion](https://github.com/llvm/llvm-project/blob/aafed3408e7269c42f974189198a47eb6dd2fc84/mlir/lib/Dialect/ArmNeon/Transforms/LowerContractionToSMMLAPattern.cpp#L178-L185) for `vecmat` cases might be behind some of these instructions. We should check whether using `llvm.undef` instead fixes part of the problem.
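To illustrate why `llvm.undef` looks like a plausible replacement, here is a small Python reference model of what `smmla` computes (a 2x8 by 8x2 `i8` product accumulated into a 2x2 `i32` tile). It is only a sketch: the helper names `smmla` and `vecmat_row` are made up for this example, it assumes the `vecmat` case pads the single LHS row into the first row of the 2x8 operand and reads back only the first result row, and it ignores `i32` wraparound. Under those assumptions, the contents of the padding row never reach the result that is actually used.

```python
import random


def smmla(acc, a, b):
    """Model of SMMLA: acc (2x2 i32, row-major) += A * B^T.

    `a` and `b` are flat lists of 16 i8 values, each read as a
    row-major 2x8 matrix. i32 wraparound is ignored in this sketch.
    """
    out = list(acc)
    for i in range(2):
        for j in range(2):
            out[2 * i + j] += sum(a[8 * i + k] * b[8 * j + k] for k in range(8))
    return out


def vecmat_row(lhs_row, rhs, acc_row, pad):
    """Hypothetical vecmat lowering: one real LHS row plus a padding row.

    Only the first row of the 2x2 result is kept; the second row is
    whatever the padding produced and is discarded.
    """
    a = list(lhs_row) + list(pad)   # row 0 = real data, row 1 = padding
    acc = list(acc_row) + [0, 0]    # second accumulator row is unused
    return smmla(acc, a, rhs)[:2]


rng = random.Random(0)


def rand_i8(n):
    return [rng.randint(-128, 127) for _ in range(n)]


lhs, rhs = rand_i8(8), rand_i8(16)
acc = [rng.randint(-1000, 1000) for _ in range(2)]

zero_padded = vecmat_row(lhs, rhs, acc, [0] * 8)
junk_padded = vecmat_row(lhs, rhs, acc, rand_i8(8))  # stands in for undef

# The padding row only feeds the discarded second result row.
assert zero_padded == junk_padded
```

If the actual pattern pads a different operand or reads a different half of the result, the same argument should still hold with the roles swapped, but that needs to be checked against the lowering itself.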

[mlir][Aarch64] Improve i8mm instruction sequence for `vector.contract` #90416
