You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Merge pull request opencv#19486 from fpetrogalli:dotprod_fast-3.4
* [hal][neon] Optimize the v_dotprod_fast intrinsics for aarch64.
On Armv8 in AArch64 execution mode, we can skip the sequence
v<op>_<ty>(vget_high_<ty>(x), vget_high_<ty>(y))
in favour of
v<op>_high_<ty>(x, y)
This has better changes for recent compilers to use less data movement
operations and better register allocation. See for example:
https://godbolt.org/z/bPq7vd
* [hal][neon] Fix build failure on armv7.
* [hal][neon] Address review comments in PR.
PR: opencv#19486
* [hal][neon] Define macro to check for the AArch64 execution state of Armv8.
* [hal][neon] Fix macro definition for AArch64.
The fix is needed to prevent warnings when building for Armv7.
0 commit comments