Commit 0939bd2

Merge tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf report/top/annotate TUI:

   - Accept the left arrow key as a Zoom out if done on the first column

   - Show the source code toggle status in the title, to help spotting
     bugs with the various disassemblers (capstone, llvm, objdump)

   - Provide feedback on unhandled hotkeys

  Build:

   - Better inform when certain features are not available with warnings
     in the build process and in 'perf version --build-options' or
     'perf -vv'

  perf record:

   - Improve the --off-cpu code by synthesizing events for switch-out ->
     switch-in intervals using a BPF program. This can be fine tuned
     using a --off-cpu-thresh knob

  perf report:

   - Add 'tgid' sort key

  perf mem/c2c:

   - Add 'op', 'cache', 'snoop', 'dtlb' output fields

   - Add support for 'ldlat' on AMD IBS (Instruction Based Sampling)

  perf ftrace:

   - Use process/session specific trace settings instead of messing with
     the global ftrace knobs

  perf trace:

   - Implement syscall summary in BPF

   - Support --summary-mode=cgroup

   - Always print return value for syscalls returning a pid

   - The rseq and set_robust_list don't return a pid, just -errno

  perf lock contention:

   - Symbolize zone->lock using BTF

   - Add -J/--inject-delay option to estimate impact on application
     performance by optimization of kernel locking behavior

  perf stat:

   - Improve hybrid support for the NMI watchdog warning

  Symbol resolution:

   - Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust
     symbols

   - Improve Rust demangler

  Hardware tracing:

    Intel PT:

     - Fix PEBS-via-PT data_src

     - Do not default to recording all switch events

     - Fix pattern matching with python3 on the SQL viewer script

    arm64:

     - Fixups for the hip08 hha PMU

  Vendor events:

   - Update Intel events/metrics files for alderlake, alderlaken,
     arrowlake, bonnell, broadwell, broadwellde, broadwellx,
     cascadelakex, clearwaterforest, elkhartlake, emeraldrapids,
     grandridge, graniterapids, haswell, haswellx, icelake, icelakex,
     ivybridge, ivytown, jaketown, lunarlake, meteorlake, nehalemep,
     nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest,
     skylake, skylakex, snowridgex, tigerlake, westmereep-dp,
     westmereep-sp, westmereep-sx

  python support:

   - Add support for event counts in the python binding, add a
     counting.py example

  perf list:

   - Display the PMU name associated with a perf metric in JSON

  perf test:

   - Hybrid improvements for metric value validation test

   - Fix LBR test by ignoring idle task

   - Add AMD IBS sw filter and 'ldlat' tests

   - Add 'perf trace --summary-mode=cgroup' test

   - Add tests for the various language symbol demanglers

  Miscellaneous:

   - Allow specifying the cpu an event will be tied using '-e
     event/cpu=N/'

   - Sync various headers with the kernel sources

   - Add annotations to use clang's -Wthread-safety and fix some
     problems it detected

   - Make dump_stack() use perf's symbol resolution to provide better
     backtraces

   - Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS
     (Precision Event-Based Sampling), that adds timing info, the
     retirement latency of instructions

   - Various memory allocation (some detected by ASAN) and reference
     counting fixes

   - Add a 8-byte aligned PERF_RECORD_COMPRESSED2 to replace
     PERF_RECORD_COMPRESSED

   - Skip unsupported event types in perf.data files, don't stop when
     finding one

   - Improve lookups using hashmaps and binary searches"

* tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (206 commits)
  perf callchain: Always populate the addr_location map when adding IP
  perf lock contention: Reject more than 10ms delays for safety
  perf trace: Set errpid to false for rseq and set_robust_list
  perf symbol: Move demangling code out of symbol-elf.c
  perf trace: Always print return value for syscalls returning a pid
  perf script: Print PERF_AUX_FLAG_COLLISION flag
  perf mem: Show absolute percent in mem_stat output
  perf mem: Display sort order only if it's available
  perf mem: Describe overhead calculation in brief
  perf record: Fix incorrect --user-regs comments
  Revert "perf thread: Ensure comm_lock held for comm_list"
  perf test trace_summary: Skip --bpf-summary tests if no libbpf
  perf test intel-pt: Skip jitdump test if no libelf
  perf intel-tpebs: Avoid race when evlist is being deleted
  perf test demangle-java: Don't segv if demangling fails
  perf symbol: Fix use-after-free in filename__read_build_id
  perf pmu: Avoid segv for missing name/alias_name in wildcarding
  perf machine: Factor creating a "live" machine out of dwarf-unwind
  perf test: Add AMD IBS sw filter test
  perf mem: Count L2 HITM for c2c statistic
  ...
2 parents 70087d2 + a913ef6 commit 0939bd2

323 files changed, +22088 -11905 lines changed

MAINTAINERS

Lines changed: 1 addition & 0 deletions
@@ -10860,6 +10860,7 @@ W: http://www.hisilicon.com
 F: Documentation/admin-guide/perf/hisi-pcie-pmu.rst
 F: Documentation/admin-guide/perf/hisi-pmu.rst
 F: drivers/perf/hisilicon
+F: tools/perf/pmu-events/arch/arm64/hisilicon/
 
 HISILICON PTT DRIVER
 M: Yicong Yang <yangyicong@hisilicon.com>

tools/arch/arm64/include/asm/cputype.h

Lines changed: 2 additions & 0 deletions
@@ -129,6 +129,7 @@
 #define FUJITSU_CPU_PART_A64FX 0x001
 
 #define HISI_CPU_PART_TSV110 0xD01
+#define HISI_CPU_PART_HIP12 0xD06
 
 #define APPLE_CPU_PART_M1_ICESTORM 0x022
 #define APPLE_CPU_PART_M1_FIRESTORM 0x023
@@ -202,6 +203,7 @@
 #define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
 #define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
 #define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
+#define MIDR_HISI_HIP12 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_HIP12)
 #define MIDR_APPLE_M1_ICESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM)
 #define MIDR_APPLE_M1_FIRESTORM MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_FIRESTORM)
 #define MIDR_APPLE_M1_ICESTORM_PRO MIDR_CPU_MODEL(ARM_CPU_IMP_APPLE, APPLE_CPU_PART_M1_ICESTORM_PRO)
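The new `MIDR_HISI_HIP12` value is built by packing an implementer code and a part number into the MIDR register layout. The sketch below shows roughly how that packing works. It makes several assumptions that are not in this diff: the shift values and the `ARM_CPU_IMP_HISI` value (0x48) are reproduced from memory of the arm64 `cputype.h` header, the `MIDR_CPU_MODEL` body is a simplified rewrite, and the two decode helpers are hypothetical names added for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MIDR_EL1 field positions (not shown in the diff). */
#define MIDR_PARTNUM_SHIFT      4
#define MIDR_ARCHITECTURE_SHIFT 16
#define MIDR_IMPLEMENTOR_SHIFT  24

/* Assumed implementer code for HiSilicon (not shown in the diff). */
#define ARM_CPU_IMP_HISI        0x48
/* Part number added by this commit. */
#define HISI_CPU_PART_HIP12     0xD06

/* Simplified stand-in for the kernel macro: implementer, arch 0xf, part. */
#define MIDR_CPU_MODEL(imp, partnum) \
	(((uint32_t)(imp) << MIDR_IMPLEMENTOR_SHIFT) | \
	 (0xfu << MIDR_ARCHITECTURE_SHIFT) | \
	 ((uint32_t)(partnum) << MIDR_PARTNUM_SHIFT))

#define MIDR_HISI_HIP12 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_HIP12)

/* Hypothetical helpers: pull the fields back out of a MIDR value. */
static uint32_t midr_partnum(uint32_t midr)
{
	return (midr >> MIDR_PARTNUM_SHIFT) & 0xfffu;
}

static uint32_t midr_implementor(uint32_t midr)
{
	return midr >> MIDR_IMPLEMENTOR_SHIFT;
}
```

Under these assumptions, `midr_partnum(MIDR_HISI_HIP12)` gives back 0xD06, which is the kind of comparison a CPU-model match performs.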

tools/arch/x86/include/asm/cpufeatures.h

Lines changed: 4 additions & 1 deletion
@@ -75,7 +75,7 @@
 #define X86_FEATURE_CENTAUR_MCR ( 3*32+ 3) /* "centaur_mcr" Centaur MCRs (= MTRRs) */
 #define X86_FEATURE_K8 ( 3*32+ 4) /* Opteron, Athlon64 */
 #define X86_FEATURE_ZEN5 ( 3*32+ 5) /* CPU based on Zen5 microarchitecture */
-/* Free ( 3*32+ 6) */
+#define X86_FEATURE_ZEN6 ( 3*32+ 6) /* CPU based on Zen6 microarchitecture */
 /* Free ( 3*32+ 7) */
 #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* "constant_tsc" TSC ticks at a constant rate */
 #define X86_FEATURE_UP ( 3*32+ 9) /* "up" SMP kernel running on UP */
@@ -482,6 +482,7 @@
 #define X86_FEATURE_AMD_HTR_CORES (21*32+ 6) /* Heterogeneous Core Topology */
 #define X86_FEATURE_AMD_WORKLOAD_CLASS (21*32+ 7) /* Workload Classification */
 #define X86_FEATURE_PREFER_YMM (21*32+ 8) /* Avoid ZMM registers due to downclocking */
+#define X86_FEATURE_INDIRECT_THUNK_ITS (21*32+ 9) /* Use thunk for indirect branches in lower half of cacheline */
 
 /*
  * BUG word(s)
@@ -534,4 +535,6 @@
 #define X86_BUG_BHI X86_BUG( 1*32+ 3) /* "bhi" CPU is affected by Branch History Injection */
 #define X86_BUG_IBPB_NO_RET X86_BUG( 1*32+ 4) /* "ibpb_no_ret" IBPB omits return target predictions */
 #define X86_BUG_SPECTRE_V2_USER X86_BUG( 1*32+ 5) /* "spectre_v2_user" CPU is affected by Spectre variant 2 attack between user processes */
+#define X86_BUG_ITS X86_BUG( 1*32+ 6) /* "its" CPU is affected by Indirect Target Selection */
+#define X86_BUG_ITS_NATIVE_ONLY X86_BUG( 1*32+ 7) /* "its_native_only" CPU is affected by ITS, VMX is not affected */
 #endif /* _ASM_X86_CPUFEATURES_H */
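Each feature constant in the hunks above is written as `word*32 + bit`. The sketch below decodes that encoding. It assumes the capability bits live in a plain array of 32-bit words, which the encoding suggests but this diff does not show; the helper names are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the diff. */
#define X86_FEATURE_ZEN6               ( 3*32 + 6)
#define X86_FEATURE_INDIRECT_THUNK_ITS (21*32 + 9)

/* Hypothetical helper: index of the 32-bit word holding the feature. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Hypothetical helper: bit position inside that word. */
static unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Hypothetical helper: test a feature in an array of capability words. */
static int feature_is_set(const uint32_t *caps, unsigned int feature)
{
	return (caps[feature_word(feature)] >> feature_bit(feature)) & 1u;
}
```

Read this way, the commit takes the previously free slot at word 3, bit 6 for `X86_FEATURE_ZEN6` and adds `X86_FEATURE_INDIRECT_THUNK_ITS` at word 21, bit 9.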

tools/arch/x86/include/asm/msr-index.h

Lines changed: 8 additions & 0 deletions
@@ -211,6 +211,14 @@
  * VERW clears CPU Register
  * File.
  */
+#define ARCH_CAP_ITS_NO BIT_ULL(62) /*
+ * Not susceptible to
+ * Indirect Target Selection.
+ * This bit is not set by
+ * HW, but is synthesized by
+ * VMMs for guests to know
+ * their affected status.
+ */
 
 #define MSR_IA32_FLUSH_CMD 0x0000010b
 #define L1D_FLUSH BIT(0) /*

tools/build/Makefile.feature

Lines changed: 0 additions & 4 deletions
@@ -87,7 +87,6 @@ FEATURE_TESTS_BASIC := \
         libtracefs \
         libcpupower \
         libcrypto \
-        libunwind \
         pthread-attr-setaffinity-np \
         pthread-barrier \
         reallocarray \
@@ -148,15 +147,12 @@ endif
 FEATURE_DISPLAY ?= \
         libdw \
         glibc \
-        libbfd \
-        libbfd-buildid \
         libelf \
         libnuma \
         numa_num_possible_cpus \
         libperl \
         libpython \
         libcrypto \
-        libunwind \
         libcapstone \
         llvm-perf \
         zlib \

tools/include/linux/bits.h

Lines changed: 2 additions & 3 deletions
@@ -20,9 +20,8 @@
  */
 #if !defined(__ASSEMBLY__)
 #include <linux/build_bug.h>
-#define GENMASK_INPUT_CHECK(h, l) \
-	(BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
-		__is_constexpr((l) > (h)), (l) > (h), 0)))
+#include <linux/compiler.h>
+#define GENMASK_INPUT_CHECK(h, l) BUILD_BUG_ON_ZERO(const_true((l) > (h)))
 #else
 /*
  * BUILD_BUG_ON_ZERO is not available in h files included from asm files,

tools/include/linux/compiler.h

Lines changed: 22 additions & 0 deletions
@@ -81,6 +81,28 @@
 #define __is_constexpr(x) \
 	(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))
 
+/*
+ * Similar to statically_true() but produces a constant expression
+ *
+ * To be used in conjunction with macros, such as BUILD_BUG_ON_ZERO(),
+ * which require their input to be a constant expression and for which
+ * statically_true() would otherwise fail.
+ *
+ * This is a trade-off: const_true() requires all its operands to be
+ * compile time constants. Else, it would always returns false even on
+ * the most trivial cases like:
+ *
+ *   true || non_const_var
+ *
+ * On the opposite, statically_true() is able to fold more complex
+ * tautologies and will return true on expressions such as:
+ *
+ *   !(non_const_var * 8 % 4)
+ *
+ * For the general case, statically_true() is better.
+ */
+#define const_true(x) __builtin_choose_expr(__is_constexpr(x), x, false)
+
 #ifdef __ANDROID__
 /*
  * FIXME: Big hammer to get rid of tons of:
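The comment above describes `const_true()` as a way to feed a possibly non-constant condition to `BUILD_BUG_ON_ZERO()`, which is how `GENMASK_INPUT_CHECK()` in `tools/include/linux/bits.h` now uses it. The sketch below exercises that combination outside the kernel tree. `__is_constexpr()` and `const_true()` are copied from the hunk; the `BUILD_BUG_ON_ZERO()` definition is an assumed bit-field stand-in, not necessarily the one in the tools headers, and `check_runtime_bounds()` is a hypothetical helper. It relies on GNU C extensions, so it assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the hunk above. */
#define __is_constexpr(x) \
	(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))
#define const_true(x) __builtin_choose_expr(__is_constexpr(x), x, false)

/*
 * Assumed stand-in: evaluates to 0 when e is false and breaks the build
 * (negative bit-field width) when e is a true constant expression.
 */
#define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int : (-!!(e)); })))

/* Same shape as the new definition in tools/include/linux/bits.h. */
#define GENMASK_INPUT_CHECK(h, l) BUILD_BUG_ON_ZERO(const_true((l) > (h)))

/* Constant, well-ordered bounds: the check folds to 0 at compile time. */
enum { GENMASK_7_4_CHECK = GENMASK_INPUT_CHECK(7, 4) };

/*
 * Hypothetical helper: h and l are not constants here, so const_true()
 * selects false and the check cannot fire, whatever the values are.
 */
static int check_runtime_bounds(int h, int l)
{
	return GENMASK_INPUT_CHECK(h, l);
}

/* GENMASK_INPUT_CHECK(4, 7) with literal arguments would fail to compile. */
```

The trade-off named in the comment is visible here: swapped bounds are rejected only when both are compile-time constants, and runtime values pass through unchecked.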

tools/include/uapi/linux/bits.h

Lines changed: 2 additions & 6 deletions
@@ -4,13 +4,9 @@
 #ifndef _UAPI_LINUX_BITS_H
 #define _UAPI_LINUX_BITS_H
 
-#define __GENMASK(h, l) \
-	(((~_UL(0)) - (_UL(1) << (l)) + 1) & \
-	 (~_UL(0) >> (__BITS_PER_LONG - 1 - (h))))
+#define __GENMASK(h, l) (((~_UL(0)) << (l)) & (~_UL(0) >> (BITS_PER_LONG - 1 - (h))))
 
-#define __GENMASK_ULL(h, l) \
-	(((~_ULL(0)) - (_ULL(1) << (l)) + 1) & \
-	 (~_ULL(0) >> (__BITS_PER_LONG_LONG - 1 - (h))))
+#define __GENMASK_ULL(h, l) (((~_ULL(0)) << (l)) & (~_ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h))))
 
 #define __GENMASK_U128(h, l) \
 	((_BIT128((h)) << 1) - (_BIT128(l)))
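The hunk above swaps the subtraction-based lower bound, `~0 - (1 << l) + 1`, for a plain left shift, `~0 << l`. In modular unsigned arithmetic both clear the low `l` bits. The sketch below checks that for every valid bit pair on `unsigned long`. The function names and the `BITS_PER_UL` macro are illustrative stand-ins, not kernel identifiers, and it assumes `0 <= l <= h < BITS_PER_UL`.

```c
#include <assert.h>
#include <limits.h>

/* Illustrative stand-in for the kernel's bits-per-long constant. */
#define BITS_PER_UL ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Form removed by the diff: lower bound via subtraction. */
static unsigned long genmask_old(unsigned int h, unsigned int l)
{
	return ((~0UL) - (1UL << l) + 1) & (~0UL >> (BITS_PER_UL - 1 - h));
}

/* Form added by the diff: lower bound via left shift. */
static unsigned long genmask_new(unsigned int h, unsigned int l)
{
	return ((~0UL) << l) & (~0UL >> (BITS_PER_UL - 1 - h));
}

/* Returns 1 if both forms agree for every valid (h, l) pair. */
static int genmask_forms_agree(void)
{
	for (unsigned int h = 0; h < BITS_PER_UL; h++)
		for (unsigned int l = 0; l <= h; l++)
			if (genmask_old(h, l) != genmask_new(h, l))
				return 0;
	return 1;
}
```

For example, `genmask_new(7, 4)` yields `0xF0`, the mask covering bits 4 through 7.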

tools/include/vdso/unaligned.h

Lines changed: 6 additions & 6 deletions
@@ -2,14 +2,14 @@
 #ifndef __VDSO_UNALIGNED_H
 #define __VDSO_UNALIGNED_H
 
-#define __get_unaligned_t(type, ptr) ({ \
-	const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \
-	__pptr->x; \
+#define __get_unaligned_t(type, ptr) ({ \
+	const struct { type x; } __packed * __get_pptr = (typeof(__get_pptr))(ptr); \
+	__get_pptr->x; \
 })
 
-#define __put_unaligned_t(type, val, ptr) do { \
-	struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \
-	__pptr->x = (val); \
+#define __put_unaligned_t(type, val, ptr) do { \
+	struct { type x; } __packed * __put_pptr = (typeof(__put_pptr))(ptr); \
+	__put_pptr->x = (val); \
 } while (0)
 
 #endif /* __VDSO_UNALIGNED_H */
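The two macros above access a value through a pointer to a packed one-member struct, so the compiler makes no alignment assumption about the address. The sketch below is a standalone adaptation, assuming GCC or Clang. It defines `__packed` locally, uses `__typeof__` in place of `typeof` so it also builds in stricter language modes, and adds a hypothetical `roundtrip_unaligned_u32()` helper that is not part of the kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed local definition; the kernel provides __packed elsewhere. */
#define __packed __attribute__((__packed__))

/* Adapted from the hunk above (typeof spelled as __typeof__). */
#define __get_unaligned_t(type, ptr) ({ \
	const struct { type x; } __packed * __get_pptr = (__typeof__(__get_pptr))(ptr); \
	__get_pptr->x; \
})

#define __put_unaligned_t(type, val, ptr) do { \
	struct { type x; } __packed * __put_pptr = (__typeof__(__put_pptr))(ptr); \
	__put_pptr->x = (val); \
} while (0)

/*
 * Hypothetical helper: store a 32-bit value at an odd offset inside a
 * byte buffer, then load it back from the same misaligned address.
 */
static uint32_t roundtrip_unaligned_u32(uint32_t value)
{
	unsigned char buf[8] = { 0 };

	__put_unaligned_t(uint32_t, value, buf + 1);
	return __get_unaligned_t(uint32_t, buf + 1);
}
```

Because the store and the load use differently named locals after this change, the two macros no longer share the `__pptr` identifier. The diff itself does not state why the names were split.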

tools/lib/perf/Documentation/libperf.txt

Lines changed: 1 addition & 0 deletions
@@ -210,6 +210,7 @@ SYNOPSIS
 struct perf_record_time_conv;
 struct perf_record_header_feature;
 struct perf_record_compressed;
+struct perf_record_compressed2;
 --
 
 DESCRIPTION
