Operator-level parallelization with OpenMP (Coarse-grained parallelism) #2811

Running a set of multiple operations in parallel ("operator-level parallelization"), as shown in Figure 1, has the potential to improve inference time. Using draft PR #2756, we confirmed that this parallelization reduced the inference time of some actual models. We will split that PR into smaller PRs for step-by-step review. This issue describes the overall plan and the status of those PRs.

We introduced two new operations, ONNXParallelOp and ONNXForkOp (PR #2810). These operations are lowered to KrnlParallelOp, KrnlIterateOp, and SCFIfOp. We will create a subsequent PR for the lowering pass for ONNXParallelOp and ONNXForkOp. With this approach, we can reuse the existing OpenMP implementation in onnx-mlir and meet the requirement, described in issue #2497, to use a common framework for threading.
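To make the lowering target more concrete, here is a minimal C++/OpenMP sketch of the pattern that a parallel loop plus an `if` per fork suggests. It is an illustration under assumptions, not the code onnx-mlir generates: `branchA`, `branchB`, and `runForks` are hypothetical names, and the mapping of the loop to KrnlParallelOp/KrnlIterateOp and of the conditionals to SCFIfOp is inferred from the description above.

```cpp
#include <array>

// Hypothetical stand-ins for two independent operators (forks) in a model.
static int branchA(int x) { return x * 2; }
static int branchB(int x) { return x + 10; }

// Assumed shape of the lowered code: a parallel loop over fork indices
// (the parallel/iterate part), where each iteration uses an `if` to pick
// the body of exactly one fork (the conditional part).
std::array<int, 2> runForks(int input) {
  std::array<int, 2> results{};
  const int numForks = 2;
  // Without OpenMP enabled, the pragma is ignored and the loop runs serially.
#pragma omp parallel for
  for (int forkId = 0; forkId < numForks; ++forkId) {
    if (forkId == 0)
      results[0] = branchA(input); // body of the first fork
    if (forkId == 1)
      results[1] = branchB(input); // body of the second fork
  }
  // Each fork writes a distinct element, so no synchronization is needed.
  return results;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` on GCC or Clang), the two fork bodies may run on different threads; the results are the same either way because the forks do not share any written state.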

- PR #2810  (Ready for review)
:
: 

|Figure 1. Operator-level parallelization|Figure 2. Implementation|
|---|---|
|<img width="278" alt="image" src="https://github.com/onnx/onnx-mlir/assets/18550698/d5ff1e73-2ac6-4065-aa9f-92446b9c2688">　　　　|　　　<img width="394" alt="image" src="https://github.com/onnx/onnx-mlir/assets/18550698/ce380df6-b76b-4a1c-b682-2dc2fa1537b1">|

