Commit 4fdf248

committed
Add comment to vector_mask_to_bitmask explaining why shifting before truncation is beneficial
1 parent 65d8a21 commit 4fdf248

File tree

1 file changed (+13, -0 lines changed)

compiler/rustc_codegen_llvm/src/intrinsic.rs

Lines changed: 13 additions & 0 deletions
@@ -965,6 +965,19 @@ fn generic_simd_intrinsic<'ll, 'tcx>(
     }};
 }
 
+/// Converts a vector mask, where each element has a bit width equal to the data elements it is used with,
+/// down to an i1 based mask that can be used by llvm intrinsics.
+///
+/// The rust simd semantics are that each element should either consist of all ones or all zeroes,
+/// but this information is not available to llvm. Truncating the vector effectively uses the lowest bit,
+/// but codegen for several targets is better if we consider the highest bit by shifting.
+///
+/// For x86 SSE/AVX targets this is beneficial since most instructions with mask parameters only consider the highest bit.
+/// So even though on llvm level we have an additional shift, in the final assembly there is no shift or truncate and
+/// instead the mask can be used as is.
+///
+/// For aarch64 and other targets there is a benefit because a mask from the sign bit can be more
+/// efficiently converted to an all ones / all zeroes mask by comparing whether each element is negative.
 fn vector_mask_to_bitmask<'a, 'll, 'tcx>(
     bx: &mut Builder<'a, 'll, 'tcx>,
     i_xn: &'ll Value,

0 commit comments