Operator-level parallelization with OpenMP (Coarse-grained parallelism) #2811

@imaihal

Running sets of multiple operations in parallel ("operator-level parallelization"), as shown in Figure 1, has the potential to reduce inference time. Using draft PR #2756, we confirmed that this parallelization reduced the inference time of some real models. We will split that PR into smaller PRs for step-by-step review. This issue describes the overall plan and the status of those PRs.

We introduced two new operations, ONNXParallelOp and ONNXForkOp (PR #2810). These operations are lowered to KrnlParallelOp, KrnlIterateOp, and SCFIfOp. We will create a subsequent PR for the lowering pass for ONNXParallelOp and ONNXForkOp. With this approach, we can reuse the existing OpenMP implementation in onnx-mlir and meet the requirement, described in issue #2497, to use a common framework for threading.

Figure 1. Operator-level parallelization (image)
Figure 2. Implementation (image)
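
To make the fork/join structure concrete, here is a minimal Python sketch. It is not onnx-mlir code and does not use OpenMP. It assumes one plausible reading of the lowering described above: a parallel loop over fork indices (standing in for KrnlParallelOp and KrnlIterateOp) whose body selects a fork with an index test (standing in for SCFIfOp). The names `branch_a`, `branch_b`, `run_forks`, and `join` are hypothetical and exist only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def branch_a(x):
    # Hypothetical fork 0: an independent chain of element-wise operators.
    return [2 * v + 1 for v in x]


def branch_b(x):
    # Hypothetical fork 1: another chain that does not depend on fork 0.
    return [v * v for v in x]


def run_forks(x, forks):
    """Run independent forks on the same input; return results in fork order."""
    results = [None] * len(forks)

    def iteration(i):
        # Body of the parallel loop: the loop index selects which fork runs,
        # mimicking an "if (i == k)" guard around each fork's operators.
        for k, fork in enumerate(forks):
            if i == k:
                results[i] = fork(x)

    # One worker per fork; leaving the "with" block waits for all iterations,
    # which plays the role of the join point.
    with ThreadPoolExecutor(max_workers=max(1, len(forks))) as pool:
        list(pool.map(iteration, range(len(forks))))
    return results


def join(results):
    # Hypothetical consumer of all fork outputs: element-wise sum.
    return [sum(vals) for vals in zip(*results)]


x = [1, 2, 3]
out = join(run_forks(x, [branch_a, branch_b]))
print(out)  # [4, 9, 16]
```

The sketch only shows the control structure. Pure-Python threads will not show a speedup on CPU-bound work because of the interpreter lock, whereas the approach in this issue relies on OpenMP threads in compiled code.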
