Commit 21fe251

ubizjak authored and Ingo Molnar committed
x86/hweight: Use asm_inline() instead of asm()
Use asm_inline() to instruct the compiler that the size of asm()
is the minimum size of one instruction, ignoring how many instructions
the compiler thinks it is. ALTERNATIVE macro that expands to several
pseudo directives causes instruction length estimate to count
more than 20 instructions.

bloat-o-meter reports slight reduction of the code size
for x86_64 defconfig object file, compiled with gcc-14.2:

  add/remove: 6/12 grow/shrink: 59/50 up/down: 3389/-3560 (-171)
  Total: Before=22734393, After=22734222, chg -0.00%

where 29 instances of code blocks involving POPCNT now gets inlined,
resulting in the removal of several functions:

  format_is_yuv_semiplanar.part.isra            41       -     -41
  cdclk_divider                                 69       -     -69
  intel_joiner_adjust_timings                  140       -    -140
  nl80211_send_wowlan_tcp_caps                 369       -    -369
  nl80211_send_iftype_data                     579       -    -579
  __do_sys_pidfd_send_signal                   809       -    -809

One noticeable change is:

  pcpu_page_first_chunk                       1075    1060     -15

Where the compiler now inlines 4 more instances of POPCNT insns,
but still manages to compile to a function with smaller code size.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-3-ubizjak@gmail.com
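The over-estimate described in the commit message can be illustrated with a rough user-space sketch. GCC's documentation says that the size of an asm() statement is estimated by counting instructions in the template, treating every newline or statement separator (typically `;` on x86) as the end of one instruction. The helper name `estimate_asm_insns` is hypothetical, and the function only approximates that documented heuristic; it is not the compiler's actual code.

```c
/*
 * Hypothetical helper: approximates how GCC counts "instructions" in an
 * asm template. Every '\n' or ';' is assumed to end one instruction, so
 * assembler directives such as .pushsection are counted like real insns.
 */
static int estimate_asm_insns(const char *tmpl)
{
	int count = 1;

	/* An empty template counts as zero instructions. */
	if (*tmpl == '\0')
		return 0;

	for (; *tmpl != '\0'; tmpl++)
		if (*tmpl == '\n' || *tmpl == ';')
			count++;

	return count;
}
```

Under this approximation a bare `"popcntl %1, %0"` counts as one instruction, while a template padded with section directives on separate lines counts once per line, even though only one instruction is emitted inline. With the `inline` qualifier the compiler instead assumes the smallest possible size for the statement when making inlining decisions.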
1 parent 194a613 commit 21fe251

1 file changed, +4 -2 lines changed

arch/x86/include/asm/arch_hweight.h

Lines changed: 4 additions & 2 deletions
@@ -16,7 +16,8 @@ static __always_inline unsigned int __arch_hweight32(unsigned int w)
 {
 	unsigned int res;
 
-	asm (ALTERNATIVE("call __sw_hweight32", "popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
+	asm_inline (ALTERNATIVE("call __sw_hweight32",
+				"popcntl %[val], %[cnt]", X86_FEATURE_POPCNT)
 			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 

@@ -44,7 +45,8 @@ static __always_inline unsigned long __arch_hweight64(__u64 w)
 {
 	unsigned long res;
 
-	asm (ALTERNATIVE("call __sw_hweight64", "popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
+	asm_inline (ALTERNATIVE("call __sw_hweight64",
+				"popcntq %[val], %[cnt]", X86_FEATURE_POPCNT)
 			 : [cnt] "=" REG_OUT (res), ASM_CALL_CONSTRAINT
 			 : [val] REG_IN (w));
 
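For readers outside the kernel tree, the following is a minimal user-space sketch of the same idea. It is not the kernel's implementation. It assumes GCC 9 or later, or Clang 11 or later, for the `inline` asm qualifier. The names `my_asm_inline`, `sw_hweight32` and `hweight32_sketch` are hypothetical. The kernel's ALTERNATIVE runtime patching is not reproduced; the POPCNT path is selected at compile time, only when the build enables it (for example with `-mpopcnt`).

```c
#include <stdint.h>

/* Use the "inline" asm qualifier where the compiler is assumed to support it. */
#if (defined(__clang__) && __clang_major__ >= 11) || \
    (!defined(__clang__) && defined(__GNUC__) && __GNUC__ >= 9)
#define my_asm_inline __asm__ __inline__
#else
#define my_asm_inline __asm__
#endif

/* Portable bit count; stands in for the role of the __sw_hweight32 fallback. */
static inline unsigned int sw_hweight32(uint32_t w)
{
	w -= (w >> 1) & 0x55555555u;                      /* 2-bit partial sums */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit partial sums */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit partial sums */
	return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

static inline unsigned int hweight32_sketch(uint32_t w)
{
#if defined(__x86_64__) && defined(__POPCNT__)
	unsigned int res;

	/* One instruction; the qualifier tells the inliner to treat it as small. */
	my_asm_inline ("popcntl %[val], %[cnt]"
		       : [cnt] "=r" (res)
		       : [val] "rm" (w)
		       : "cc");
	return res;
#else
	return sw_hweight32(w);
#endif
}
```

Calling `hweight32_sketch(0xF0F0u)` returns 8 on either path. The compile-time guard keeps the sketch safe on machines without POPCNT, at the cost of losing the boot-time selection that ALTERNATIVE provides in the kernel.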
