Skip to content

Conversation

@tafk7
Copy link
Collaborator

@tafk7 tafk7 commented Sep 26, 2025

Summary

This PR enables CustomOp registration through Python module namespaces, eliminating the need for manual dictionary registration. CustomOps can now be discovered automatically by importing the module corresponding to the ONNX domain name. The system maintains full backward compatibility with legacy custom_op dictionaries while providing better IDE support and a more Pythonic API.

Key Changes

Core Registry (src/qonnx/custom_op/registry.py)

  • CustomOps now discoverable via module namespace (e.g., getattr(module, op_type))
  • Namespace lookup prioritized with automatic fallback to legacy custom_op dictionaries
  • Domain aliasing system enables cross-package compatibility (add_domain_alias())
  • Added resolve_domain() helper for alias resolution
  • Enhanced error messages with actionable guidance

Performance & Safety Improvements

  • Lazy on-demand discovery with in-memory caching (_OP_REGISTRY)
  • Thread-safe access via RLock
  • Privatized registry state (_OP_REGISTRY, _DOMAIN_ALIASES)

Cleanup

  • Removed add_op_to_domain() and get_ops_in_domain() functions
  • Removed unnecessary __all__ exports from modules
  • Removed unused entrypoint system

Benefits

  • Developer Experience: CustomOps work like normal Python classes with IDE autocomplete
  • Extensibility: External packages register ops by exposing classes in module namespace
  • Simplicity: No manual dictionary management required
  • Performance: Cached lookups after initial discovery
  • Compatibility: Existing custom_op dictionaries continue to work

Migration Path

No code changes required. The system automatically discovers ops from both module namespaces and legacy custom_op dictionaries. New CustomOps can be registered by:

  1. Exporting CustomOp classes in module namespace (preferred, Pythonic)
  2. Adding to legacy custom_op dictionaries (backward compatibility)
  3. Using add_domain_alias() to map custom domains to module paths

  Replace dictionary-based registration with module namespace lookup:
  - Direct attribute lookup in module namespace (preferred)
  - Legacy custom_op dictionary support (backward compatibility)
  - Domain aliasing (e.g., onnx.brevitas -> qonnx.custom_op.general)
  - Runtime op registration via add_op_to_domain()
  - No expensive fallbacks (removed case-insensitive matching, full module scan)
  - Clear, actionable error messages

  Benefits:
  - Better IDE support with direct imports
  - Cleaner, more Pythonic API
  - O(1) lookup performance
  - Predictable behavior
  - External packages work via domain-based imports (e.g., finn.custom_op.fpgadataflow)
@jurevreca12 jurevreca12 mentioned this pull request Sep 29, 2025
@jurevreca12 jurevreca12 self-requested a review October 5, 2025 18:22
@jurevreca12
Copy link
Collaborator

How are these ops in the custom domain suppose to execute?
qonnx/core/onnx_exec.py uses an if statement to check for custom ops.

if is_finn_op(node.domain):
    ...

However, the function is_finn_op remains unchanged. It still checks for the "classic" domains:

def is_finn_op(op_type):
    return op_type.startswith("finn") or op_type.startswith("qonnx.custom_op") or op_type.startswith("onnx.brevitas")

I also tested the code on chisel4ml, with a custom defined in the chisel4ml.preprocess namespace. I then called
qonnx.custom_op.registry.add_domain_alias("chisel4ml.preprocess", "qonnx.custom_op.general"). However, the if statement described above did not trigger and I get an onnx runtime error.

Am I missing something? should I be calling something else? Honestly, I just don't see how this can execute.

Example:
add_op_to_domain("qonnx.custom_op.general", "TestOp", TestOp)
"""
if not inspect.isclass(op_class) or not issubclass(op_class, CustomOp):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is not correct. It evaluates to True even if op_class is not subclass of CustomOp. This code will evaluate to True on anything, even if op_class is a float.

Regardless of that, I do not see the point of using inspect here. If the intention is to prevent the user from using an object instance here, issubclass will raise a TypeError. Which is a more appropriate error than ValueError.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is from an old iteration where it cached all CustomOps at a given domain on first access, this avoided illegal comparisons. Thanks for catching this, will remove in next commit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not resolved.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tafk7 could you address this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay, resolved in latest commit

@tafk7
Copy link
Collaborator Author

tafk7 commented Oct 11, 2025

However, the function is_finn_op remains unchanged. It still checks for the "classic" domains:

def is_finn_op(op_type):
    return op_type.startswith("finn") or op_type.startswith("qonnx.custom_op") or op_type.startswith("onnx.brevitas")

I also tested the code on chisel4ml, with a custom defined in the chisel4ml.preprocess namespace. I then called qonnx.custom_op.registry.add_domain_alias("chisel4ml.preprocess", "qonnx.custom_op.general"). However, the if statement described above did not trigger and I get an onnx runtime error.

Am I missing something? should I be calling something else? Honestly, I just don't see how this can execute.

@jurevreca12 Could you clarify the expected behavior of the "Brevitas exception?" Since it was only applied in the registry, I assumed similar aliasing should only be used for path resolution when the node's domain doesn't match its import path. Therefore, it's proper for is_finn_op to be directly applied to the node's domain attribute, instead of resolving any aliasing. Instead, should is_finn_op always compare against the module's true path?

tafk7 added 3 commits October 11, 2025 17:05
  - Replace eager module inspection with on-demand discovery
  - Add RLock for thread-safe registry access
  - Cache discovered ops in _OP_REGISTRY dict
  - Remove get_ops_in_domain() and add_op_to_domain()
  - Privatize registry state variables
Replace hard-coded domain string checks with registry-based lookups.
Add is_custom_op() function to registry.py that checks if a domain
or specific (domain, op_type) combination exists via registry lookup
and module import fallback.

Changes:
- Add is_custom_op(domain, op_type=None) to custom_op/registry.py
- Deprecate hasCustomOp in favor of is_custom_op
- Replace is_finn_op implementation with deprecation wrapper
- Update all internal uses across 7 files:
  * modelwrapper.py: get_finn_nodes/get_non_finn_nodes
  * onnx_exec.py: execute_node domain check
  * infer_shapes.py: _hide_finn_ops and assertion
  * infer_data_layouts.py: _dims_to_layout and _infer_node_data_layout
  * infer_datatypes.py: _infer_node_datatype

is_custom_op handles empty domains (standard ONNX) and provides
precise op checking when op_type is specified.
@maltanar
Copy link
Collaborator

maltanar commented Oct 21, 2025

I think the easiest path forward for is_finn_op would be something like return domain not in ["", "ai.onnx"] i.e. just return False when the domain is not the default ONNX domain and try to handle that as a custom op. if the domain does not exist in the registry, just let it error out during that later API call to retrieve the op wrapper. if we can avoid importing the registry module from the basic module and introduce that dependency that would be a good thing.

the Brevitas exception is there for legacy reasons, from an (ancient) time when QONNX hadn't become its own standalone thing and Brevitas exported custom ops with its own domain string. then we decided to make QONNX its own standard. getCustomOp originally contained the domain aliasing logic to specifically replace onnx.brevitas with qonnx.custom_op.general. this PR introduces its own aliasing logic which should still be fine.

as a side note: today everything will be exported with qonnx.custom_op.general in Brevitas: link

@jurevreca12
Copy link
Collaborator

Sorry, for the slow responses. @maltanar What is the problem with importing the registry module? Although the simple is_finn_op logic in the form: return domain not in ["", "ai.onnx"] would work for most cases, it could become a problem if someone wanted to use custom onnx nodes (nodes that are not CustomOp). But I am not sure if anyone is actually using them.

@jurevreca12
Copy link
Collaborator

So, I've tested this latest commit on my chisel4ml code base and it works fine. I think everything looks good, with the exception of that one if statement. I think the is_custom_op logic that checks the registry is an acceptable solution, however I will wait on @maltanar to elaborate on this. Other then that when you fix that if statement, I think we can merge this.

Copy link
Collaborator

@maltanar maltanar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@maltanar
Copy link
Collaborator

This looks ready to go, merging now. Thanks @tafk7 and @jurevreca12 !

@maltanar maltanar merged commit 83db58d into fastmachinelearning:main Oct 30, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants