
Commit 0a30c42

robert-kalmar, skywall, Pop-korn, jirioc, and roman-janik-nxp authored
NXP Backend: Add eIQ Neutron Backend (#10196)
Co-authored-by: Lukas Sztefek <lukas.sztefek@nxp.com>
Co-authored-by: Martin Pavella <martin.pavella@nxp.com>
Co-authored-by: Jiri Ocenasek <jiri.ocenasek@nxp.com>
Co-authored-by: Roman Janik <roman.janik@nxp.com>
Co-authored-by: Simon Strycek <simon.strycek@nxp.com>
1 parent fa2e1f2 commit 0a30c42

File tree

365 files changed: +34051 -9 lines changed


LICENSE

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ Copyright 2023 Arm Limited and/or its affiliates.
 Copyright (c) Qualcomm Innovation Center, Inc.
 Copyright (c) 2023 Apple Inc.
 Copyright (c) 2024 MediaTek Inc.
+Copyright 2023 NXP
 
 Redistribution and use in source and binary forms, with or without modification,
 are permitted provided that the following conditions are met:

backends/arm/_passes/arm_pass_manager.py

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@
     DecomposeScaledDotProductAttention,
 )
 from executorch.backends.transforms.fuse_view_copy import FuseViewCopyTransform
-from executorch.backends.xnnpack._passes.remove_getitem_op import RemoveGetItemPass
+from executorch.backends.transforms.remove_getitem_op import RemoveGetItemPass
 from executorch.exir import ExportedProgram
 from executorch.exir.pass_manager import PassManager
 from torch.fx import GraphModule

backends/nxp/README.md

Lines changed: 36 additions & 5 deletions
@@ -27,14 +27,45 @@ In the future the NXP eIQ Neutron Backend will be extended to support [i.MX 9 Ap
 with eIQ Neutron NPU, like the [i.MX 95](https://www.nxp.com/products/iMX95).
 
 
-## Layout
-TBD
-
 ## Backend Status and Maturity
 **Current Status:** Prototype Quality
 
-The eIQ Neutron NPU Backend should be considered as prototype quality at this moment. Subject to significant changes and
-improvements. NXP and the ExecuTorch community is actively developing this codebase.
+The eIQ Neutron NPU Backend should be considered prototype quality at this moment. It is subject to significant changes
+and improvements, and NXP and the ExecuTorch community are actively developing this codebase.
+
+## Neutron Backend implementation and SW architecture
+The Neutron Backend uses the eIQ Neutron Converter as its ML compiler to compile the delegated subgraph to Neutron microcode.
+For the **eIQ Neutron N3** class, the Neutron Converter accepts the ML model in LiteRT format, so the Neutron Backend uses the LiteRT flatbuffers format as the IR between ExecuTorch and the Neutron Converter ML compiler.
+
+In its early prototype phase, the Neutron Backend is based on existing NXP products, such as
+onnx2tflite, known from NXP's eIQ Toolkit.
+The **onnx2tflite** tool is a converter from the ONNX format to LiteRT (formerly known as TFLite).
+It consists of 3 stages:
+* ONNX Model Parsing
+* Tensor Format Inference, to identify tensors that use a channel-first layout
+* ONNX to LiteRT Conversion
+* Optimization Passes, which operate on top of the LiteRT format
+* LiteRT Serialization
+
+Due to the similarities between ONNX-to-LiteRT and Edge-to-LiteRT conversion, the Neutron Backend
+currently leverages the Tensor Format Inference and the LiteRT Optimizer.
+This shall be considered a temporary solution, intended to be replaced with:
+* Dim Order (https://github.com/pytorch/executorch/issues/4873)
+* Corresponding ExecuTorch/ATen passes
+
+before reaching higher maturity status by the end of 2025.
+
+## Layout
+The current code base is organized as follows:
+* `backend/ir/` - TFLite/LiteRT-based IR to represent the Edge subgraph, taken from the onnx2tflite code base and extended to
+support Edge Dialect to LiteRT conversion.
+* `backend/ir/converter` - the Neutron Backend's conversion from Edge (ATen) Dialect to LiteRT/TFLite. The subfolder
+`node_converters` is structured as a single module for each Edge operator.
+* `backend/ir/lib` - automatically generated handlers from the LiteRT flatbuffers schema.
+* `backend/ir/tflite_generator` and `backend/ir/tflite_optimizer` handle the serialization
+of the in-memory built subgraph for delegation into the LiteRT/TFLite flatbuffers
+representation. Code taken from the onnx2tflite tool.
+* `quantizer` - the Neutron Backend's quantizer implementation.
 
 ## Help & Improvements
 If you have problems or questions or have suggestions for ways to make
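For readers unfamiliar with the two tensor layouts mentioned in the README above, here is a small illustrative snippet (an editor's sketch, not part of the commit) of the channel-first vs. channel-last relationship that the Tensor Format Inference stage has to reason about:

import torch

# Edge/ATen convolutions use channel-first (NCHW) tensors, while LiteRT/TFLite
# convolutions expect channel-last (NHWC). Tensor Format Inference decides
# which tensors need such a permutation during conversion.
nchw = torch.randn(1, 3, 224, 224)  # N, C, H, W
nhwc = nchw.permute(0, 2, 3, 1)     # N, H, W, C
assert nhwc.shape == (1, 224, 224, 3)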

backends/nxp/backend/edge_helper.py

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
# Copyright 2024 NXP
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

import torch
from torch.fx import Node


def input_tensor(node: Node, input_index: int) -> torch.Tensor:
    if len(node.all_input_nodes) <= input_index:
        raise IndexError

    return node.all_input_nodes[input_index].meta["val"]


def output_tensor(node: Node) -> torch.Tensor:
    return node.meta["val"]


def tensor_rank(tensor: torch.Tensor) -> int:
    return len(tensor.size())


def input_rank(node: Node, input_index: int) -> int:
    return tensor_rank(input_tensor(node, input_index))


def input_tensor_safe(node: Node, input_index: int) -> torch.Tensor | None:
    """Return the input tensor of 'node' at index 'input_index', or None if the node doesn't have that input.

    :param node: Edge node to get the input tensor from.
    :param input_index: Index of the input tensor to get.
    :return: The input tensor at index 'input_index', or None.
    """
    if len(node.all_input_nodes) <= input_index:
        return None

    return input_tensor(node, input_index)
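A brief usage sketch of these helpers (an illustrative example, not part of the commit; the toy module and the hand-assigned meta["val"] entries stand in for the FakeTensor metadata that real Edge dialect nodes carry):

import torch
from torch.fx import symbolic_trace

from executorch.backends.nxp.backend.edge_helper import input_rank, input_tensor_safe


class AddModule(torch.nn.Module):
    def forward(self, x, y):
        return x + y


graph_module = symbolic_trace(AddModule())
add_node = next(n for n in graph_module.graph.nodes if n.op == "call_function")

# Emulate the "val" metadata that Edge dialect nodes carry on their inputs.
for placeholder in add_node.all_input_nodes:
    placeholder.meta["val"] = torch.empty(2, 3)

print(input_rank(add_node, 0))          # 2 - rank of the first input's metadata tensor
print(input_tensor_safe(add_node, 5))   # None - the node has no sixth input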
backends/nxp/backend/edge_program_converter.py

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
# Copyright 2024 NXP
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

import executorch.backends.nxp.backend.ir.logger as logger
import flatbuffers
from executorch.backends.nxp.backend.ir.conversion_config import ConversionConfig
from executorch.backends.nxp.backend.ir.conversion_context import ConversionContext
from executorch.backends.nxp.backend.ir.converter.builder.aten_model_builder_director import (
    AtenModelBuilderDirector,
)
from torch.export import ExportedProgram
from torch.export.graph_signature import InputKind
from torch.fx import Node
from torch.nn.parameter import Parameter
from executorch.backends.nxp.backend.ir.converter.node_converters.ops_converters import *  # noqa F403
from executorch.backends.nxp.backend.node_format_inference import (
    NodeFormat,
    NodeFormatInference,
)
from executorch.exir.dialects._ops import ops as exir_ops

# noinspection PyProtectedMember
functions_converters = {
    exir_ops.edge.aten.addmm.default: AddMMConverter,  # noqa F405
    exir_ops.edge.aten.avg_pool2d.default: AvgPool2dConverter,  # noqa F405
    exir_ops.edge.aten.constant_pad_nd.default: ConstantPadNDConverter,  # noqa F405
    exir_ops.edge.aten.convolution.default: ConvolutionConverter,  # noqa F405
    exir_ops.edge.aten.max_pool2d.default: MaxPool2dConverter,  # noqa F405
    exir_ops.edge.aten.mm.default: MMConverter,  # noqa F405
    exir_ops.edge.aten.permute_copy.default: PermuteCopyConverter,  # noqa F405
    exir_ops.edge.aten.relu.default: ReLUConverter,  # noqa F405
    exir_ops.edge.aten._softmax.default: SoftmaxConverter,  # noqa F405
    exir_ops.edge.aten.view_copy.default: ViewCopyConverter,  # noqa F405
}


class EdgeProgramToIRConverter:
    """
    Converter of an ExportedProgram in Edge dialect to IR (TFLite flatbuffers).
    """

    _default_conversion_config = ConversionConfig()

    def convert_program(
        self,
        edge_program: ExportedProgram,
        conversion_config=_default_conversion_config,
    ) -> (bytes, dict):
        """
        Convert ExportedProgram in Edge dialect to IR (TFLite flatbuffers) as bytes.

        :param edge_program: ExportedProgram to convert.
        :param conversion_config: ConversionConfig instance.
        :return: TFLite flatbuffers as bytes.
        """
        node_formats = NodeFormatInference(edge_program).identify_node_formats()
        parameters_mapping = self.map_inputs_to_parameters(edge_program)

        cc = self.build_conversion_context(
            parameters_mapping, node_formats, conversion_config
        )

        # Program conversion
        self.append_placeholders_and_tensors(edge_program.graph.nodes, cc)
        self._convert_qdq_cluster_q_dq_nodes(edge_program.graph.nodes, cc)
        self._process_nodes(edge_program.graph.nodes, cc)

        # Assign output
        io_formats = cc.tflite_builder.assign_model_io_to_subgraph_and_get_io_formats(
            edge_program.graph_signature
        )

        # TFLite model generation
        internal_tflite_model = cc.tflite_builder.finish()
        flatbuffers_builder = flatbuffers.Builder()
        internal_tflite_model.gen_tflite(flatbuffers_builder)

        return bytes(flatbuffers_builder.Output()), io_formats

    @staticmethod
    def append_placeholders_and_tensors(nodes: list[Node], context: ConversionContext):
        for node in nodes:
            if node.op == "placeholder":
                node_format = context.node_formats[node]

                if node.name in context.parameters_mapping:
                    # Node is placeholder and has data -> append as static tensor with data
                    tensor = context.parameters_mapping[node.name]
                    context.tflite_builder.append_as_static_tensor(
                        node, node_format, tensor
                    )
                else:
                    # Node is placeholder and doesn't have data (user input) -> append as fake tensor
                    context.tflite_builder.append_as_fake_tensor(node, node_format)
            elif node.op == "call_function":
                # Node is call function -> append only output as a tensor
                node_format = context.node_formats[node]
                context.tflite_builder.append_as_fake_tensor(node, node_format)
            elif node.op == "output":
                # Nothing to do
                pass
            else:
                logger.e(
                    logger.Code.INTERNAL_ERROR, f"Unexpected node op type: '{node.op}'!"
                )

    def _process_nodes(self, nodes: list[Node], conversion_context: ConversionContext):
        """
        Go through program nodes and append their TFLite siblings into ModelBuilder.

        :param nodes: Program's nodes.
        :param conversion_context: ConversionContext instance.
        """
        qdq_related_functions = [
            exir_ops.edge.quantized_decomposed.dequantize_per_tensor.default,
            exir_ops.edge.quantized_decomposed.quantize_per_tensor.default,
        ]

        for node in nodes:
            if node.op == "call_function":
                if node.target in qdq_related_functions and "cluster" in node.meta:
                    # Skip (De)Quantize nodes that were already processed
                    pass
                elif node.target in functions_converters:
                    functions_converters[node.target](conversion_context).convert(node)
                else:
                    logger.e(
                        logger.Code.NOT_IMPLEMENTED,
                        f"Converter for '{node.target.__name__}' not implemented!",
                    )

    @staticmethod
    def map_inputs_to_parameters(edge_program: ExportedProgram) -> dict[str, Parameter]:
        """
        Create mapping between program parameters (input nodes & static data nodes) and their names.

        :param edge_program: EdgeProgram instance.
        :return: Mapping from parameter name to parameter instance.
        """
        result_map = {}

        for input_spec in edge_program.graph_signature.input_specs:
            if input_spec.kind in [InputKind.PARAMETER, InputKind.BUFFER]:
                result_map[input_spec.arg.name] = edge_program.state_dict[
                    input_spec.target
                ]

        return result_map

    @staticmethod
    def build_conversion_context(
        parameters_mapping: dict,
        node_formats: dict[Node, NodeFormat],
        conversion_config: ConversionConfig = _default_conversion_config,
    ) -> ConversionContext:
        tflite_builder = AtenModelBuilderDirector(
            3, "TFLite from EdgeProgram", conversion_config
        )

        # Add "sentinel" buffer (defined in schema.fbs)
        tflite_builder.build_empty_buffer()

        context = ConversionContext(
            tflite_builder, conversion_config, parameters_mapping, node_formats
        )

        return context

    def _convert_qdq_cluster_q_dq_nodes(
        self, nodes: list[Node], conversion_context: ConversionContext
    ):
        """
        Go through the program and convert (De)Quantize nodes that are part of a QDQ cluster into
        tensors.

        :param nodes: Program's nodes.
        :param conversion_context: ConversionContext instance.
        """
        qdq_q_ops_converters = {
            exir_ops.edge.quantized_decomposed.dequantize_per_tensor.default: QDQDequantizeConverter,  # noqa F405
            exir_ops.edge.quantized_decomposed.quantize_per_tensor.default: QDQQuantizeConverter,  # noqa F405
        }

        for node in nodes:
            part_of_qdq_cluster = "cluster" in node.meta
            if (
                node.op == "call_function"
                and node.target in qdq_q_ops_converters
                and part_of_qdq_cluster
            ):
                qdq_q_ops_converters[node.target](conversion_context).convert(node)
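For orientation, a hedged sketch of how this converter might be driven end to end (an editor's example, not part of the commit; the module path and the toy model are assumptions, and whether every decomposed operator converts depends on the functions_converters table above):

import torch
from executorch.backends.nxp.backend.edge_program_converter import (
    EdgeProgramToIRConverter,
)
from executorch.exir import to_edge


class SmallMLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))


# Export to ATen, lower to Edge dialect, then convert the Edge program to the
# LiteRT/TFLite flatbuffers IR consumed by the eIQ Neutron Converter.
exported = torch.export.export(SmallMLP(), (torch.randn(1, 16),))
edge_program = to_edge(exported).exported_program()

tflite_bytes, io_formats = EdgeProgramToIRConverter().convert_program(edge_program)
print(f"Serialized {len(tflite_bytes)} bytes of LiteRT flatbuffers")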
backends/nxp/backend/ir/conversion_config.py

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
# Copyright 2024 NXP
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.


class ConversionConfig:

    def __init__(self, args: dict | None = None):
        """
        Conversion configuration passed through command line arguments or gathered during
        the conversion process.

        :param args: Optional dictionary with conversion arguments. Unknown arguments are ignored.
        """
        self.keep_io_format: bool = False
        self.skip_shape_inference: bool = False
        self.allow_inputs_stripping: bool = True
        self.qdq_aware_conversion: bool = True
        self.symbolic_dimensions_mapping: dict[str, int] | None = None
        self.input_shapes_mapping: dict[str, tuple] | None = None
        self.dont_skip_nodes_with_known_outputs: bool = False
        self.allow_select_ops: bool = True
        self.generate_artifacts_after_failed_shape_inference: bool = True

        self.optimization_whitelist: list | None = None
        self.optimization_blacklist: list | None = None

        self.non_negative_indices: bool = False
        self.cast_int64_to_int32: bool = False
        self.accept_resize_rounding_error: bool = False
        self.ignore_opset_version: bool = False

        self.tflite_quantization_integrity_check: bool = True

        if args is not None:
            for key, value in args.items():
                if key in self.__dict__:
                    setattr(self, key, value)

    def __repr__(self):
        attrs = []
        for attr in self.__dict__:
            attrs.append(f"{attr}={getattr(self, attr)}")

        return "ConversionConfig[" + ", ".join(attrs) + "]"


class SkipShapeInferenceConfig(ConversionConfig):

    def __init__(self):
        """
        Conversion config shortcut with disabled shape inference.
        """
        super().__init__({"skip_shape_inference": True})


class QDQAwareConfig(ConversionConfig):

    def __init__(self):
        """
        Conversion config shortcut with QDQ aware conversion enabled.
        """
        super().__init__({"qdq_aware_conversion": True})
