Commit b2763cc
Auto merge of #94899 - workingjubilee:bump-simd-clamp, r=workingjubilee

Bump portable-simd to shadow Ord

Yon usual bump. Summary for reference:

- We are moving away from the subjective "directional" nomenclature, so `horizontal_*` becomes `reduce_*`, et cetera.
- In addition, `Simd<Int, N>` now has methods which shadow Ord's methods directly, making those methods behave like the already "overloaded" float methods do.
2 parents 4800c78 + 2b1f249 commit b2763cc
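For orientation, here is a minimal sketch of the renamed reduction and the new Ord-shadowing methods. It assumes a nightly toolchain whose `std::simd` matches this revision of portable-simd; the values and the `main` function are illustrative only.

```rust
#![feature(portable_simd)]
use std::simd::Simd;

fn main() {
    let a = Simd::<i32, 4>::from_array([1, 2, 3, 4]);
    let b = Simd::<i32, 4>::from_array([4, 3, 2, 1]);

    // Formerly `horizontal_sum()`: merge the lanes of one vector into a scalar.
    let total = a.reduce_sum();
    assert_eq!(total, 10);

    // Integer vectors now shadow Ord's methods lane-wise, mirroring the
    // behavior the float vectors already had.
    assert_eq!(a.min(b).to_array(), [1, 2, 2, 1]);
    assert_eq!(a.max(b).to_array(), [4, 3, 3, 4]);
    assert_eq!(
        a.clamp(Simd::splat(2), Simd::splat(3)).to_array(),
        [2, 2, 3, 3]
    );
}
```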

20 files changed: +214 −105 lines changed

library/core/src/slice/mod.rs

Lines changed: 1 addition & 1 deletion
@@ -3536,7 +3536,7 @@ impl<T> [T] {
     ///         suffix.iter().copied().sum(),
     ///     ]);
     ///     let sums = middle.iter().copied().fold(sums, f32x4::add);
-    ///     sums.horizontal_sum()
+    ///     sums.reduce_sum()
     /// }
     ///
     /// let numbers: Vec<f32> = (1..101).map(|x| x as _).collect();

library/portable-simd/beginners-guide.md

Lines changed: 2 additions & 2 deletions
@@ -33,7 +33,7 @@ SIMD has a few special vocabulary terms you should know:
 
 * **Vertical:** When an operation is "vertical", each lane processes individually without regard to the other lanes in the same vector. For example, a "vertical add" between two vectors would add lane 0 in `a` with lane 0 in `b`, with the total in lane 0 of `out`, and then the same thing for lanes 1, 2, etc. Most SIMD operations are vertical operations, so if your problem is a vertical problem then you can probably solve it with SIMD.
 
-* **Horizontal:** When an operation is "horizontal", the lanes within a single vector interact in some way. A "horizontal add" might add up lane 0 of `a` with lane 1 of `a`, with the total in lane 0 of `out`.
+* **Reducing/Reduce:** When an operation is "reducing" (functions named `reduce_*`), the lanes within a single vector are merged using some operation such as addition, returning the merged value as a scalar. For instance, a reducing add would return the sum of all the lanes' values.
 
 * **Target Feature:** Rust calls a CPU architecture extension a `target_feature`. Proper SIMD requires various CPU extensions to be enabled (details below). Don't confuse this with `feature`, which is a Cargo crate concept.
 
@@ -83,4 +83,4 @@ Fortunately, most SIMD types have a fairly predictable size. `i32x4` is bit-equi
 However, this is not the same as alignment. Computer architectures generally prefer aligned accesses, especially when moving data between memory and vector registers, and while some support specialized operations that can bend the rules to help with this, unaligned access is still typically slow, or even undefined behavior. In addition, different architectures can require different alignments when interacting with their native SIMD types. For this reason, any `#[repr(simd)]` type has a non-portable alignment. If it is necessary to directly interact with the alignment of these types, it should be via [`mem::align_of`].
 
 [`mem::transmute`]: https://doc.rust-lang.org/core/mem/fn.transmute.html
-[`mem::align_of`]: https://doc.rust-lang.org/core/mem/fn.align_of.html
+[`mem::align_of`]: https://doc.rust-lang.org/core/mem/fn.align_of.html
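To make the vertical/reducing distinction concrete, a small sketch (assuming nightly Rust with `#![feature(portable_simd)]` and this revision of the crate):

```rust
#![feature(portable_simd)]
use std::simd::f32x4;

fn main() {
    let a = f32x4::from_array([1.0, 2.0, 3.0, 4.0]);
    let b = f32x4::from_array([10.0, 20.0, 30.0, 40.0]);

    // Vertical: lane i of `a` is combined with lane i of `b`.
    let vertical = a + b;
    assert_eq!(vertical.to_array(), [11.0, 22.0, 33.0, 44.0]);

    // Reducing: the lanes of a single vector are merged into one scalar.
    let reduced = a.reduce_sum();
    assert_eq!(reduced, 10.0);
}
```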

library/portable-simd/crates/core_simd/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ categories = ["hardware-support", "no-std"]
 license = "MIT OR Apache-2.0"
 
 [features]
-default = ["std", "generic_const_exprs"]
+default = []
 std = []
 generic_const_exprs = []
 

library/portable-simd/crates/core_simd/examples/matrix_inversion.rs

Lines changed: 1 addition & 1 deletion
@@ -233,7 +233,7 @@ pub fn simd_inv4x4(m: Matrix4x4) -> Option<Matrix4x4> {
     let det = det.rotate_lanes_right::<2>() + det;
     let det = det.reverse().rotate_lanes_right::<2>() + det;
 
-    if det.horizontal_sum() == 0. {
+    if det.reduce_sum() == 0. {
         return None;
     }
     // calculate the reciprocal

library/portable-simd/crates/core_simd/examples/nbody.rs

Lines changed: 4 additions & 4 deletions
@@ -107,10 +107,10 @@ mod nbody {
         let mut e = 0.;
         for i in 0..N_BODIES {
             let bi = &bodies[i];
-            e += bi.mass * (bi.v * bi.v).horizontal_sum() * 0.5;
+            e += bi.mass * (bi.v * bi.v).reduce_sum() * 0.5;
             for bj in bodies.iter().take(N_BODIES).skip(i + 1) {
                 let dx = bi.x - bj.x;
-                e -= bi.mass * bj.mass / (dx * dx).horizontal_sum().sqrt()
+                e -= bi.mass * bj.mass / (dx * dx).reduce_sum().sqrt()
             }
         }
         e
@@ -134,8 +134,8 @@ mod nbody {
         let mut mag = [0.0; N];
         for i in (0..N).step_by(2) {
             let d2s = f64x2::from_array([
-                (r[i] * r[i]).horizontal_sum(),
-                (r[i + 1] * r[i + 1]).horizontal_sum(),
+                (r[i] * r[i]).reduce_sum(),
+                (r[i + 1] * r[i + 1]).reduce_sum(),
             ]);
             let dmags = f64x2::splat(dt) / (d2s * d2s.sqrt());
             mag[i] = dmags[0];

library/portable-simd/crates/core_simd/examples/spectral_norm.rs

Lines changed: 2 additions & 2 deletions
@@ -20,7 +20,7 @@ fn mult_av(v: &[f64], out: &mut [f64]) {
             sum += b / a;
             j += 2
         }
-        *out = sum.horizontal_sum();
+        *out = sum.reduce_sum();
     }
 }
 
@@ -38,7 +38,7 @@ fn mult_atv(v: &[f64], out: &mut [f64]) {
             sum += b / a;
             j += 2
         }
-        *out = sum.horizontal_sum();
+        *out = sum.reduce_sum();
     }
 }
 

library/portable-simd/crates/core_simd/src/comparisons.rs

Lines changed: 52 additions & 0 deletions
@@ -66,3 +66,55 @@ where
         unsafe { Mask::from_int_unchecked(intrinsics::simd_ge(self, other)) }
     }
 }
+
+macro_rules! impl_ord_methods_vector {
+    { $type:ty } => {
+        impl<const LANES: usize> Simd<$type, LANES>
+        where
+            LaneCount<LANES>: SupportedLaneCount,
+        {
+            /// Returns the lane-wise minimum with `other`.
+            #[must_use = "method returns a new vector and does not mutate the original value"]
+            #[inline]
+            pub fn min(self, other: Self) -> Self {
+                self.lanes_gt(other).select(other, self)
+            }
+
+            /// Returns the lane-wise maximum with `other`.
+            #[must_use = "method returns a new vector and does not mutate the original value"]
+            #[inline]
+            pub fn max(self, other: Self) -> Self {
+                self.lanes_lt(other).select(other, self)
+            }
+
+            /// Restrict each lane to a certain interval.
+            ///
+            /// For each lane, returns `max` if `self` is greater than `max`, and `min` if `self` is
+            /// less than `min`. Otherwise returns `self`.
+            ///
+            /// # Panics
+            ///
+            /// Panics if `min > max` on any lane.
+            #[must_use = "method returns a new vector and does not mutate the original value"]
+            #[inline]
+            pub fn clamp(self, min: Self, max: Self) -> Self {
+                assert!(
+                    min.lanes_le(max).all(),
+                    "each lane in `min` must be less than or equal to the corresponding lane in `max`",
+                );
+                self.max(min).min(max)
+            }
+        }
+    }
+}
+
+impl_ord_methods_vector!(i8);
+impl_ord_methods_vector!(i16);
+impl_ord_methods_vector!(i32);
+impl_ord_methods_vector!(i64);
+impl_ord_methods_vector!(isize);
+impl_ord_methods_vector!(u8);
+impl_ord_methods_vector!(u16);
+impl_ord_methods_vector!(u32);
+impl_ord_methods_vector!(u64);
+impl_ord_methods_vector!(usize);
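A brief usage sketch of the new integer `min`/`max`/`clamp` methods (hypothetical call site, assuming the nightly `portable_simd` feature):

```rust
#![feature(portable_simd)]
use std::simd::Simd;

fn main() {
    let x = Simd::<i32, 4>::from_array([-5, 0, 7, 100]);

    // Each lane is clamped into [0, 10] independently.
    let clamped = x.clamp(Simd::splat(0), Simd::splat(10));
    assert_eq!(clamped.to_array(), [0, 0, 7, 10]);

    // Per the docs above, this would panic: a lane of `min` (10)
    // exceeds the corresponding lane of `max` (0).
    // let _ = x.clamp(Simd::splat(10), Simd::splat(0));
}
```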

library/portable-simd/crates/core_simd/src/intrinsics.rs

Lines changed: 8 additions & 1 deletion
@@ -18,7 +18,6 @@
 //!
 //! Unless stated otherwise, all intrinsics for binary operations require SIMD vectors of equal types and lengths.
 
-
 // These intrinsics aren't linked directly from LLVM and are mostly undocumented, however they are
 // mostly lowered to the matching LLVM instructions by the compiler in a fairly straightforward manner.
 // The associated LLVM instruction or intrinsic is documented alongside each Rust intrinsic function.
@@ -130,6 +129,14 @@ extern "platform-intrinsic" {
     pub(crate) fn simd_reduce_xor<T, U>(x: T) -> U;
 
     // truncate integer vector to bitmask
+    // `fn simd_bitmask(vector) -> unsigned integer` takes a vector of integers and
+    // returns either an unsigned integer or array of `u8`.
+    // Every element in the vector becomes a single bit in the returned bitmask.
+    // If the vector has less than 8 lanes, a u8 is returned with zeroed trailing bits.
+    // The bit order of the result depends on the byte endianness. LSB-first for little
+    // endian and MSB-first for big endian.
+    //
+    // UB if called on a vector with values other than 0 and -1.
     #[allow(unused)]
     pub(crate) fn simd_bitmask<T, U>(x: T) -> U;
 
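The new comment describes the bitmask layout; here is a scalar model of it (not the intrinsic itself, just an illustrative sketch of the little-endian, LSB-first case):

```rust
// Each lane becomes one bit, LSB-first; masks with fewer than 8 lanes
// leave the remaining high bits of the returned u8 zeroed.
fn bitmask_model(lanes: &[bool]) -> u8 {
    assert!(lanes.len() <= 8);
    lanes
        .iter()
        .enumerate()
        .fold(0u8, |acc, (i, &set)| acc | ((set as u8) << i))
}

fn main() {
    // A 4-lane mask [true, false, true, true] packs to 0b0000_1101.
    assert_eq!(bitmask_model(&[true, false, true, true]), 0b0000_1101);
}
```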

library/portable-simd/crates/core_simd/src/lib.rs

Lines changed: 1 addition & 2 deletions
@@ -1,6 +1,5 @@
-#![cfg_attr(not(feature = "std"), no_std)]
+#![no_std]
 #![feature(
-    const_fn_trait_bound,
     convert_float_to_int,
     decl_macro,
     intra_doc_pointers,

library/portable-simd/crates/core_simd/src/masks/to_bitmask.rs

Lines changed: 3 additions & 0 deletions
@@ -50,6 +50,9 @@ macro_rules! impl_integer_intrinsic {
 }
 
 impl_integer_intrinsic! {
+    unsafe impl ToBitMask<BitMask=u8> for Mask<_, 1>
+    unsafe impl ToBitMask<BitMask=u8> for Mask<_, 2>
+    unsafe impl ToBitMask<BitMask=u8> for Mask<_, 4>
     unsafe impl ToBitMask<BitMask=u8> for Mask<_, 8>
     unsafe impl ToBitMask<BitMask=u16> for Mask<_, 16>
     unsafe impl ToBitMask<BitMask=u32> for Mask<_, 32>
