The AllocOptPass is potentially slow (and runs too often) in some cases #54524

@KristofferC

I looked a bit into why #54520 caused such a big latency improvement. The issue (or at least one issue) is that, with the broadcast code, a large majority of the time is spent in our own allocation optimization pass (AllocOptPass):

[Two images, not reproduced here]

As can be seen, this pass runs four times, each time taking quite a long time. The time spent in this pass almost completely disappears when the broadcasting code is replaced with a loop (as was done in #54520).
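
To make the comparison concrete, here is a minimal sketch of what "replacing broadcasting code with a loop" looks like in general. The function names `scale_broadcast!` and `scale_loop!` are hypothetical and chosen only for illustration; this is not the code that was changed in #54520.

```julia
# Hypothetical illustration only; not the code from #54520.

# Broadcast form: a fused elementwise operation written with dot syntax.
function scale_broadcast!(dest::Vector{Float64}, src::Vector{Float64}, a::Float64)
    dest .= a .* src
    return dest
end

# Loop form: the same computation written as an explicit loop.
function scale_loop!(dest::Vector{Float64}, src::Vector{Float64}, a::Float64)
    # eachindex with two arrays throws if their indices do not match.
    for i in eachindex(dest, src)
        dest[i] = a * src[i]
    end
    return dest
end
```

Both functions compute the same result. Wrapping the first call of each in `@time` in a fresh session gives a rough view of the latency difference, since the first call includes compilation; whether a toy case like this shows the same AllocOptPass cost as the code in #54520 is not something this sketch establishes.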

Some latency gains might be possible here if one can:

  • Figure out why this pass seems to have such a problem with the broadcasting code
  • Understand whether it really is required to run the pass four times

Labels: broadcast (applying a function over a collection), compiler:codegen (generation of LLVM IR and native code), performance (must go faster)