@@ -66,7 +66,9 @@ Changes to the LLVM IR
added to describe the mapping between scalar functions and vector
functions, to enable vectorization of call sites. The information
provided by the attribute is interfaced via the API provided by the
- ``VFDatabase`` class.
+ ``VFDatabase`` class. When scanning through the set of vector
+ functions associated with a scalar call, the loop vectorizer now
+ relies on ``VFDatabase`` instead of ``TargetLibraryInfo``.
* ``dereferenceable`` attributes and metadata on pointers no longer imply
anything about the alignment of the pointer in question. Previously, some
@@ -78,6 +80,17 @@ Changes to the LLVM IR
information. This information is used to represent Fortran modules debug
info at IR level.
+ * LLVM IR now supports two distinct ``llvm::FixedVectorType`` and
+ ``llvm::ScalableVectorType`` vector types, both derived from the
+ base class ``llvm::VectorType``. A number of algorithms dealing with
+ IR vector types have been updated to make sure they work for both
+ scalable and fixed vector types. Where possible, the code has been
+ made generic to cover both cases using the base class. Specifically,
+ places that were using the type ``unsigned`` to count the number of
+ lanes of a vector are now using ``llvm::ElementCount``. In places
+ where ``uint64_t`` was used to denote the size in bits of an IR type
+ we have partially migrated the codebase to use ``llvm::TypeSize``.
+
Changes to building LLVM
------------------------
@@ -110,6 +123,55 @@ During this release ...
default may wish to specify ``-fno-omit-frame-pointer`` to get the old
behavior. This improves compatibility with GCC.
+ * Clang adds support for the following macros that enable the
+ C-intrinsics from the `Arm C language extensions for SVE
+ <https://developer.arm.com/documentation/100987/>`_ (version
+ ``00bet5``; see section 2.1 for the list of intrinsics associated with
+ each macro):
+
+ ================================= =================
+ Preprocessor macro                Target feature
+ ================================= =================
+ ``__ARM_FEATURE_SVE``             ``+sve``
+ ``__ARM_FEATURE_SVE_BF16``        ``+sve+bf16``
+ ``__ARM_FEATURE_SVE_MATMUL_FP32`` ``+sve+f32mm``
+ ``__ARM_FEATURE_SVE_MATMUL_FP64`` ``+sve+f64mm``
+ ``__ARM_FEATURE_SVE_MATMUL_INT8`` ``+sve+i8mm``
+ ``__ARM_FEATURE_SVE2``            ``+sve2``
+ ``__ARM_FEATURE_SVE2_AES``        ``+sve2-aes``
+ ``__ARM_FEATURE_SVE2_BITPERM``    ``+sve2-bitperm``
+ ``__ARM_FEATURE_SVE2_SHA3``       ``+sve2-sha3``
+ ``__ARM_FEATURE_SVE2_SM4``        ``+sve2-sm4``
+ ================================= =================
+
+ The macros enable users to write C/C++ Vector Length Agnostic (VLA)
+ loops that can be executed on any CPU that implements the
+ underlying instructions supported by the C intrinsics, independently
+ of the hardware vector register size.
+
+ For example, the ``__ARM_FEATURE_SVE`` macro is enabled when
+ targeting AArch64 code generation by setting ``-march=armv8-a+sve``
+
+ .. code-block:: c
+   :caption: Example of VLA addition of two arrays with SVE ACLE.
+
+   // Compile with:
+   // `clang++ -march=armv8-a+sve ...` (for C++)
+   // `clang -std=c11 -march=armv8-a+sve ...` (for C)
+   #include <arm_sve.h>
+
+   void VLA_add_arrays(double *x, double *y, double *out, unsigned N) {
+     for (unsigned i = 0; i < N; i += svcntd()) {
+       svbool_t Pg = svwhilelt_b64(i, N);
+       svfloat64_t vx = svld1(Pg, &x[i]);
+       svfloat64_t vy = svld1(Pg, &y[i]);
+       svfloat64_t vout = svadd_x(Pg, vx, vy);
+       svst1(Pg, &out[i], vout);
+     }
+   }
+
Changes to the MIPS Target
--------------------------