pratham-mcw
Contributor

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.

  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV

  • The PR is proposed to the proper branch

  • This PR introduces an ARM64-specific performance optimization in AdaptiveManifoldFilter::h_filter by applying loop unrolling.

  • The optimization is guarded with #if defined(_M_ARM64) to ensure it only affects ARM64 builds.

  • The optimization does not affect accuracy and maintains the same numerical behavior as the original scalar implementation.
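As a rough illustration of the technique described above, here is a minimal sketch of a 4x-unrolled row pass selected by the same `_M_ARM64` guard. It is not the code from this PR: the diff is not shown in this thread, so the first-order recurrence, the unroll factor of 4, and the names `recursive_pass_scalar`, `recursive_pass_unrolled`, and `recursive_pass` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version: one element per iteration.
// Assumed recurrence (hypothetical): dst[x] = src[x] + a * (dst[x-1] - src[x]).
std::vector<float> recursive_pass_scalar(const std::vector<float>& src, float a)
{
    assert(!src.empty());
    std::vector<float> dst(src.size());
    dst[0] = src[0];
    for (std::size_t x = 1; x < src.size(); ++x)
        dst[x] = src[x] + a * (dst[x - 1] - src[x]);
    return dst;
}

// Unrolled version: four elements per iteration, then a scalar tail.
// The operations and their order match the scalar version; the running
// value is kept in a local instead of being re-read from dst.
std::vector<float> recursive_pass_unrolled(const std::vector<float>& src, float a)
{
    assert(!src.empty());
    const std::size_t n = src.size();
    std::vector<float> dst(n);
    dst[0] = src[0];
    float prev = dst[0];
    std::size_t x = 1;
    for (; x + 3 < n; x += 4)
    {
        const float v0 = src[x]     + a * (prev - src[x]);
        const float v1 = src[x + 1] + a * (v0 - src[x + 1]);
        const float v2 = src[x + 2] + a * (v1 - src[x + 2]);
        const float v3 = src[x + 3] + a * (v2 - src[x + 3]);
        dst[x]     = v0;
        dst[x + 1] = v1;
        dst[x + 2] = v2;
        dst[x + 3] = v3;
        prev = v3;
    }
    for (; x < n; ++x)  // remaining 0 to 3 elements
    {
        prev = src[x] + a * (prev - src[x]);
        dst[x] = prev;
    }
    return dst;
}

// Compile-time dispatch: only MSVC ARM64 builds define _M_ARM64,
// so every other target keeps the scalar path.
std::vector<float> recursive_pass(const std::vector<float>& src, float a)
{
#if defined(_M_ARM64)
    return recursive_pass_unrolled(src, a);
#else
    return recursive_pass_scalar(src, a);
#endif
}
```

Because each output depends on the previous one, unrolling a loop like this does not remove the dependency chain. Any gain would come from lower loop overhead and from keeping the running value in a register, which is one plausible reason the benefit can vary between compilers.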

Performance improvements:

  • The optimization significantly improves the performance of the adaptive manifold filter on Windows ARM64 targets.
  • The table below shows timing comparisons before and after the optimization:
[image: table of timings before and after the optimization]

@pratham-mcw
Contributor Author

Hi @asmorkalov, just following up on this PR. The loop unrolling in the h_filter function has shown significant performance improvements on Windows-ARM64 targets, while maintaining functional correctness across other architectures.
Please let me know if there are any additional changes or updates you’d like me to make.

@asmorkalov
Contributor

I do not see a stable improvement on Linux ARM (GCC, Jetson Orin), so let's merge as is, without extending the check.

@asmorkalov asmorkalov self-assigned this Oct 13, 2025
@asmorkalov asmorkalov merged commit 06fc7ad into opencv:4.x Oct 13, 2025
11 of 13 checks passed