Implement optimized BF16 support for ARM architecture - [MOD-9079] #623

Merged: 13 commits into main from guyav-arm_bf16_support, Apr 6, 2025

Conversation

GuyAv46
Collaborator

@GuyAv46 GuyAv46 commented Mar 31, 2025

Describe the changes in the pull request

Implement SVE and NEON (with fp16 fml) ARM optimizations for BFLOAT16

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

@GuyAv46 GuyAv46 changed the base branch from guyav-arm_fp16_support to main April 3, 2025 06:54
@GuyAv46 GuyAv46 force-pushed the guyav-arm_bf16_support branch from 1de69d4 to bb46609 Compare April 3, 2025 06:54
@GuyAv46 GuyAv46 marked this pull request as ready for review April 3, 2025 06:54

codecov bot commented Apr 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.51%. Comparing base (b996755) to head (df2d2ca).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #623      +/-   ##
==========================================
- Coverage   97.19%   96.51%   -0.68%     
==========================================
  Files         106      106              
  Lines        5702     5745      +43     
==========================================
+ Hits         5542     5545       +3     
- Misses        160      200      +40     

@GuyAv46 GuyAv46 requested review from dor-forer and Copilot April 3, 2025 08:29
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements optimized BFLOAT16 support on ARM by introducing new implementations for both SVE and NEON, along with corresponding test and benchmark updates.

  • Added new SVE_BF16 and NEON_BF16 source and header files that provide BF16 inner product and L2 norm functions.
  • Updated unit tests and benchmarks to cover the new ARM-specific BF16 implementations.
  • Integrated the new implementations into the selection logic in L2_space.cpp and IP_space.cpp based on detected CPU features.
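
For readers unfamiliar with the BF16 layout, here is a minimal scalar sketch of what an inner product distance and a squared L2 distance over BF16 vectors compute. It is a portable reference, not the SVE or NEON code from this PR. All names (`bf16_t`, `bf16_to_float`, `bf16_inner_product_distance`, `bf16_l2_sqr`) are hypothetical, and the conventions that the inner product distance is `1 - dot` and that L2 is returned squared are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical names for illustration; not the identifiers used in the PR.
using bf16_t = std::uint16_t; // raw bfloat16 bit pattern

// bfloat16 holds the upper 16 bits of an IEEE-754 binary32 value, so widening
// is a 16-bit left shift followed by a bit-level reinterpretation.
inline float bf16_to_float(bf16_t v) {
    std::uint32_t bits = static_cast<std::uint32_t>(v) << 16;
    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}

// Truncating conversion (drops the low 16 bits); used here only to build inputs.
inline bf16_t float_to_bf16_truncate(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return static_cast<bf16_t>(bits >> 16);
}

// Assumed convention: inner product distance = 1 - <a, b>, accumulated in float.
inline float bf16_inner_product_distance(const bf16_t *a, const bf16_t *b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        sum += bf16_to_float(a[i]) * bf16_to_float(b[i]);
    }
    return 1.0f - sum;
}

// Assumed convention: squared Euclidean distance, accumulated in float.
inline float bf16_l2_sqr(const bf16_t *a, const bf16_t *b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        float diff = bf16_to_float(a[i]) - bf16_to_float(b[i]);
        sum += diff * diff;
    }
    return sum;
}
```

A vectorized implementation would be expected to agree with a reference like this up to floating-point accumulation order, which is why a scalar version is a convenient baseline for unit tests.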

Reviewed Changes

Copilot reviewed 13 out of 15 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/unit/test_spaces.cpp | Added BF16 tests for NEON_BF16 and SVE_BF16 support |
| tests/benchmark/spaces_benchmarks/bm_spaces_bf16.cpp | Added ARM-specific benchmark initialization for BF16 modes |
| tests/benchmark/spaces_benchmarks/bm_spaces.h | Updated header includes for ARM BF16 optimizations |
| src/VecSim/spaces/functions/SVE_BF16.h & SVE_BF16.cpp | New SVE BF16 implementation functions |
| src/VecSim/spaces/functions/NEON_BF16.h & NEON_BF16.cpp | New NEON BF16 implementation functions |
| src/VecSim/spaces/L2_space.cpp | Integrated NEON_BF16 and SVE_BF16 optimizations into L2 function |
| src/VecSim/spaces/L2/L2_SVE_BF16.h | Added SVE BF16 L2 norm implementation |
| src/VecSim/spaces/L2/L2_NEON_BF16.h | Added NEON BF16 L2 norm implementation |
| src/VecSim/spaces/IP_space.cpp | Integrated NEON_BF16 and SVE_BF16 optimizations into inner product func |
| src/VecSim/spaces/IP/IP_SVE_BF16.h & IP_NEON_BF16.h | New BF16 inner product implementations for SVE and NEON |
Files not reviewed (2)
  • cmake/aarch64InstructionFlags.cmake: Language not supported
  • src/VecSim/spaces/CMakeLists.txt: Language not supported
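
The overview above says the new implementations are chosen in L2_space.cpp and IP_space.cpp based on detected CPU features. A rough sketch of that kind of selection follows. The struct, the flag names, the enum, and the preference order (SVE first, then NEON, then a scalar fallback) are all assumptions for illustration and are not taken from the PR.

```cpp
#include <string>

// Hypothetical feature flags; the project's real runtime detection structure
// is not shown in this conversation.
struct CpuFeatures {
    bool sve_bf16 = false;  // assumed: SVE with BF16 support is available
    bool neon_bf16 = false; // assumed: NEON with BF16 support is available
};

// Hypothetical tags for the available BF16 distance implementations.
enum class Bf16Impl { Scalar, NeonBf16, SveBf16 };

// Pick the most specialized implementation the CPU reports support for.
// The SVE-before-NEON ordering is an assumption of this sketch.
inline Bf16Impl choose_bf16_impl(const CpuFeatures &features) {
    if (features.sve_bf16) {
        return Bf16Impl::SveBf16;
    }
    if (features.neon_bf16) {
        return Bf16Impl::NeonBf16;
    }
    return Bf16Impl::Scalar;
}

// Human-readable label, handy for logging which path was selected.
inline std::string impl_name(Bf16Impl impl) {
    switch (impl) {
    case Bf16Impl::SveBf16:
        return "SVE_BF16";
    case Bf16Impl::NeonBf16:
        return "NEON_BF16";
    default:
        return "scalar";
    }
}
```

Keeping the choice in one function means the optimized kernels stay behind a single decision point, so a build without the required compiler flags can still fall back to the scalar path.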

dor-forer
dor-forer previously approved these changes Apr 3, 2025
Collaborator

@dor-forer dor-forer left a comment

Great job!

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@GuyAv46 GuyAv46 requested a review from lerman25 April 3, 2025 15:08
@GuyAv46 GuyAv46 requested a review from dor-forer April 3, 2025 15:08
Collaborator

@lerman25 lerman25 left a comment

Great work!

@GuyAv46 GuyAv46 added this pull request to the merge queue Apr 6, 2025
Merged via the queue into main with commit bb41732 Apr 6, 2025
24 checks passed
@GuyAv46 GuyAv46 deleted the guyav-arm_bf16_support branch April 6, 2025 16:31
github-actions bot pushed a commit that referenced this pull request Apr 6, 2025
)

* SVE implementation for bf16

* add required build flags and fix implementation

* final fixes and implement benchmarks

* added tests

* implement neon bf16 distance functions

* implement build flow and benchmarks

* added test

* format

* remove redundant check

* typo fix

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fixes and cleanup

* fix build

* fix svwhilelt_b16 calls

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit bb41732)

github-actions bot commented Apr 6, 2025

Successfully created backport PR for 0.8:

github-actions bot pushed a commit that referenced this pull request Apr 6, 2025
)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit bb41732)

github-actions bot commented Apr 6, 2025

Successfully created backport PR for 8.0:

github-merge-queue bot pushed a commit that referenced this pull request Apr 6, 2025
…79] (#641)

Implement optimized BF16 support for ARM architecture - [MOD-9079] (#623)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit bb41732)

Co-authored-by: GuyAv46 <47632673+GuyAv46@users.noreply.github.com>
github-merge-queue bot pushed a commit that referenced this pull request Apr 6, 2025
…79] (#642)

Implement optimized BF16 support for ARM architecture - [MOD-9079] (#623)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit bb41732)

Co-authored-by: GuyAv46 <47632673+GuyAv46@users.noreply.github.com>
github-merge-queue bot pushed a commit that referenced this pull request Apr 7, 2025
…79] (#641)

* Implement optimized BF16 support for ARM architecture - [MOD-9079] (#623)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit bb41732)

* fix benchmark macros for 0.8

---------

Co-authored-by: GuyAv46 <47632673+GuyAv46@users.noreply.github.com>
Co-authored-by: GuyAv46 <guy.avimor@gmail.com>
3 participants