Commit b70100f

Merge tag 'probes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:

 - kprobes: use struct_size() for variable size kretprobe_instance
   data structure.

 - eprobe: Simplify trace_eprobe list iteration.

 - probe events: Data structure field access support on BTF argument.

    - Update BTF argument support on the functions in the kernel
      loadable modules (only loaded modules are supported).

    - Move generic BTF access function (search function prototype and
      get function parameters) to a separated file.

    - Add a function to search a member of data structure in BTF.

    - Support accessing BTF data structure member from probe args by
      C-like arrow('->') and dot('.') operators. e.g.
      't sched_switch next=next->pid vruntime=next->se.vruntime'

    - Support accessing BTF data structure member from $retval. e.g.
      'f getname_flags%return +0($retval->name):string'

    - Add string type checking if BTF type info is available. This will
      reject if user specify ":string" type for non "char pointer" type.

    - Automatically assume the fprobe event as a function return event
      if $retval is used.

 - selftests/ftrace: Add BTF data field access test cases.

 - Documentation: Update fprobe event example with BTF data field.

* tag 'probes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation: tracing: Update fprobe event example with BTF field
  selftests/ftrace: Add BTF fields access testcases
  tracing/fprobe-event: Assume fprobe is a return event by $retval
  tracing/probes: Add string type check with BTF
  tracing/probes: Support BTF field access from $retval
  tracing/probes: Support BTF based data structure field access
  tracing/probes: Add a function to search a member of a struct/union
  tracing/probes: Move finding func-proto API and getting func-param API to trace_btf
  tracing/probes: Support BTF argument on module functions
  tracing/eprobe: Iterate trace_eprobe directly
  kernel: kprobes: Use struct_size()
2 parents e021c5f + a2439a4 commit b70100f

16 files changed, +661 -188 lines changed

Documentation/trace/fprobetrace.rst

Lines changed: 46 additions & 18 deletions
@@ -79,9 +79,9 @@ automatically set by the given name. ::
   f:fprobes/myprobe vfs_read count=count pos=pos
 
 It also chooses the fetch type from BTF information. For example, in the above
-example, the ``count`` is unsigned long, and the ``pos`` is a pointer. Thus, both
-are converted to 64bit unsigned long, but only ``pos`` has "%Lx" print-format as
-below ::
+example, the ``count`` is unsigned long, and the ``pos`` is a pointer. Thus,
+both are converted to 64bit unsigned long, but only ``pos`` has "%Lx"
+print-format as below ::
 
   # cat events/fprobes/myprobe/format
  name: myprobe
@@ -105,9 +105,47 @@ is expanded to all function arguments of the function or the tracepoint. ::
   # cat dynamic_events
   f:fprobes/myprobe vfs_read file=file buf=buf count=count pos=pos
 
-BTF also affects the ``$retval``. If user doesn't set any type, the retval type is
-automatically picked from the BTF. If the function returns ``void``, ``$retval``
-is rejected.
+BTF also affects the ``$retval``. If user doesn't set any type, the retval
+type is automatically picked from the BTF. If the function returns ``void``,
+``$retval`` is rejected.
+
+You can access the data fields of a data structure using arrow operator ``->``
+(for pointer type) and dot operator ``.`` (for data structure type)::
+
+ # echo 't sched_switch preempt prev_pid=prev->pid next_pid=next->pid' >> dynamic_events
+
+The field access operators, ``->`` and ``.`` can be combined for accessing deeper
+members and other structure members pointed by the member. e.g. ``foo->bar.baz->qux``.
+If there is an anonymous union member, you can access it directly, as C code does.
+For example::
+
+ struct {
+     union {
+         int a;
+         int b;
+     };
+ } *foo;
+
+To access ``a`` and ``b``, use ``foo->a`` and ``foo->b`` in this case.
+
+This data field access is available for the return value via ``$retval``,
+e.g. ``$retval->name``.
+
+For these BTF arguments and fields, ``:string`` and ``:ustring`` change the
+behavior. If these are used for BTF argument or field, it checks whether
+the BTF type of the argument or the data field is ``char *`` or ``char []``,
+or not. If not, it rejects applying the string types. Also, with the BTF
+support, you don't need a memory dereference operator (``+0(PTR)``) for
+accessing the string pointed by a ``PTR``. It automatically adds the memory
+dereference operator according to the BTF type. e.g. ::
+
+ # echo 't sched_switch prev->comm:string' >> dynamic_events
+ # echo 'f getname_flags%return $retval->name:string' >> dynamic_events
+
+The ``prev->comm`` is an embedded char array in the data structure, and
+``$retval->name`` is a char pointer in the data structure. But in both
+cases, you can use ``:string`` type to get the string.
+
 
 Usage examples
 --------------
@@ -161,10 +199,10 @@ parameters. This means you can access any field values in the task
 structure pointed by the ``prev`` and ``next`` arguments.
 
 For example, usually ``task_struct::start_time`` is not traced, but with this
-traceprobe event, you can trace it as below.
+traceprobe event, you can trace that field as below.
 ::
 
- # echo 't sched_switch comm=+1896(next):string start_time=+1728(next):u64' > dynamic_events
+ # echo 't sched_switch comm=next->comm:string next->start_time' > dynamic_events
   # head -n 20 trace | tail
   # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
   # | | | ||||| | |
@@ -176,13 +214,3 @@ traceprobe event, you can trace it as below.
   <idle>-0 [000] d..3. 5606.690317: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000
   kworker/0:1-14 [000] d..3. 5606.690339: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="swapper/0" usage=2 start_time=0
   <idle>-0 [000] d..3. 5606.692368: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000
-
-Currently, to find the offset of a specific field in the data structure,
-you need to build kernel with debuginfo and run `perf probe` command with
-`-D` option. e.g.
-::
-
- # perf probe -D "__probestub_sched_switch next->comm:string next->start_time"
- p:probe/__probestub_sched_switch __probestub_sched_switch+0 comm=+1896(%cx):string start_time=+1728(%cx):u64
-
-And replace the ``%cx`` with the ``next``.

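The documentation change above replaces hand-computed offsets such as ``+1728(next):u64`` with named field access such as ``next->start_time``. A minimal userspace sketch of that equivalence follows. The ``toy_task`` and ``toy_se`` structures and both helper functions are hypothetical stand-ins, not kernel definitions; here the compiler's ``offsetof`` plays the role that BTF type information plays for the probe parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of a task structure; the real layout differs. */
struct toy_se {
	uint64_t vruntime;
};

struct toy_task {
	int pid;
	struct toy_se se;
	char comm[16];
};

/* Old style: read a u64 at a caller-supplied byte offset from base. */
static uint64_t fetch_u64(const void *base, size_t offset)
{
	uint64_t val;

	assert(base);
	memcpy(&val, (const char *)base + offset, sizeof(val));
	return val;
}

/* New style: the offset is derived from the type layout, not typed by hand. */
static uint64_t task_vruntime(const struct toy_task *next)
{
	return fetch_u64(next, offsetof(struct toy_task, se.vruntime));
}
```

Calling ``task_vruntime(&t)`` returns the same value as ``t.se.vruntime``, which is the property the ``next->se.vruntime`` syntax relies on.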
include/linux/btf.h

Lines changed: 1 addition & 0 deletions
@@ -209,6 +209,7 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec);
 bool btf_type_is_void(const struct btf_type *t);
 s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind);
+s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p);
 const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
 					       u32 id, u32 *res_id);
 const struct btf_type *btf_type_resolve_ptr(const struct btf *btf,

kernel/bpf/btf.c

Lines changed: 1 addition & 1 deletion
@@ -553,7 +553,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
 	return -ENOENT;
 }
 
-static s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
+s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
 {
 	struct btf *btf;
 	s32 ret;

kernel/kprobes.c

Lines changed: 2 additions & 4 deletions
@@ -2232,8 +2232,7 @@ int register_kretprobe(struct kretprobe *rp)
 		return -ENOMEM;
 
 	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
+		inst = kzalloc(struct_size(inst, data, rp->data_size), GFP_KERNEL);
 		if (inst == NULL) {
 			rethook_free(rp->rh);
 			rp->rh = NULL;
@@ -2256,8 +2255,7 @@ int register_kretprobe(struct kretprobe *rp)
 
 	rp->rph->rp = rp;
 	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
+		inst = kzalloc(struct_size(inst, data, rp->data_size), GFP_KERNEL);
 		if (inst == NULL) {
 			refcount_set(&rp->rph->ref, i);
 			free_rp_inst(rp);

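The kprobes.c hunks swap an open-coded ``sizeof(...) + rp->data_size`` for ``struct_size()``. The sketch below is a userspace approximation of that idea, not the kernel macro itself: ``struct instance``, ``instance_size()`` and ``instance_alloc()`` are hypothetical names, and the saturate-to-``SIZE_MAX`` behaviour is modelled by hand on the assumption that an oversized request should make the allocation fail rather than wrap around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a header followed by a flexible array. */
struct instance {
	void *owner;
	char data[];	/* trailing data, sized at allocation time */
};

/*
 * Size of one instance with data_size trailing bytes.
 * Saturates to SIZE_MAX instead of wrapping on overflow.
 */
static size_t instance_size(size_t data_size)
{
	if (data_size > SIZE_MAX - sizeof(struct instance))
		return SIZE_MAX;
	return sizeof(struct instance) + data_size;
}

/* Allocate one zeroed instance, or return NULL if the size is unrepresentable. */
static struct instance *instance_alloc(size_t data_size)
{
	size_t size = instance_size(data_size);

	if (size == SIZE_MAX)
		return NULL;
	assert(size >= sizeof(struct instance));
	return calloc(1, size);
}
```

The point of routing the computation through one helper is that the overflow check cannot be forgotten at individual call sites, which is the motivation usually given for ``struct_size()`` as well.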
kernel/trace/Makefile

Lines changed: 1 addition & 0 deletions
@@ -99,6 +99,7 @@ obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
 endif
 obj-$(CONFIG_DYNAMIC_EVENTS) += trace_dynevent.o
 obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
+obj-$(CONFIG_PROBE_EVENTS_BTF_ARGS) += trace_btf.o
 obj-$(CONFIG_UPROBE_EVENTS) += trace_uprobe.o
 obj-$(CONFIG_BOOTTIME_TRACING) += trace_boot.o
 obj-$(CONFIG_FTRACE_RECORD_RECURSION) += trace_recursion_record.o

kernel/trace/trace.c

Lines changed: 2 additions & 1 deletion
@@ -5711,7 +5711,8 @@ static const char readme_msg[] =
 	"\t fetcharg: (%<register>|$<efield>), @<address>, @<symbol>[+|-<offset>],\n"
 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
 #ifdef CONFIG_PROBE_EVENTS_BTF_ARGS
-	"\t $stack<index>, $stack, $retval, $comm, $arg<N>, <argname>\n"
+	"\t $stack<index>, $stack, $retval, $comm, $arg<N>,\n"
+	"\t <argname>[->field[->field|.field...]],\n"
 #else
 	"\t $stack<index>, $stack, $retval, $comm, $arg<N>,\n"
 #endif

kernel/trace/trace_btf.c

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/btf.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+
+#include "trace_btf.h"
+
+/*
+ * Find a function proto type by name, and return the btf_type with its btf
+ * in *@btf_p. Return NULL if not found.
+ * Note that caller has to call btf_put(*@btf_p) after using the btf_type.
+ */
+const struct btf_type *btf_find_func_proto(const char *func_name, struct btf **btf_p)
+{
+	const struct btf_type *t;
+	s32 id;
+
+	id = bpf_find_btf_id(func_name, BTF_KIND_FUNC, btf_p);
+	if (id < 0)
+		return NULL;
+
+	/* Get BTF_KIND_FUNC type */
+	t = btf_type_by_id(*btf_p, id);
+	if (!t || !btf_type_is_func(t))
+		goto err;
+
+	/* The type of BTF_KIND_FUNC is BTF_KIND_FUNC_PROTO */
+	t = btf_type_by_id(*btf_p, t->type);
+	if (!t || !btf_type_is_func_proto(t))
+		goto err;
+
+	return t;
+err:
+	btf_put(*btf_p);
+	return NULL;
+}
+
+/*
+ * Get function parameter with the number of parameters.
+ * This can return NULL if the function has no parameters.
+ * It can return -EINVAL if the @func_proto is not a function proto type.
+ */
+const struct btf_param *btf_get_func_param(const struct btf_type *func_proto, s32 *nr)
+{
+	if (!btf_type_is_func_proto(func_proto))
+		return ERR_PTR(-EINVAL);
+
+	*nr = btf_type_vlen(func_proto);
+	if (*nr > 0)
+		return (const struct btf_param *)(func_proto + 1);
+	else
+		return NULL;
+}
+
+#define BTF_ANON_STACK_MAX	16
+
+struct btf_anon_stack {
+	u32 tid;
+	u32 offset;
+};
+
+/*
+ * Find a member of data structure/union by name and return it.
+ * Return NULL if not found, or -EINVAL if parameter is invalid.
+ * If the member is a member of anonymous union/structure, the offset
+ * of that anonymous union/structure is stored into @anon_offset. Caller
+ * can calculate the correct offset from the root data structure by
+ * adding anon_offset to the member's offset.
+ */
+const struct btf_member *btf_find_struct_member(struct btf *btf,
+						const struct btf_type *type,
+						const char *member_name,
+						u32 *anon_offset)
+{
+	struct btf_anon_stack *anon_stack;
+	const struct btf_member *member;
+	u32 tid, cur_offset = 0;
+	const char *name;
+	int i, top = 0;
+
+	anon_stack = kcalloc(BTF_ANON_STACK_MAX, sizeof(*anon_stack), GFP_KERNEL);
+	if (!anon_stack)
+		return ERR_PTR(-ENOMEM);
+
+retry:
+	if (!btf_type_is_struct(type)) {
+		member = ERR_PTR(-EINVAL);
+		goto out;
+	}
+
+	for_each_member(i, type, member) {
+		if (!member->name_off) {
+			/* Anonymous union/struct: push it for later use */
+			type = btf_type_skip_modifiers(btf, member->type, &tid);
+			if (type && top < BTF_ANON_STACK_MAX) {
+				anon_stack[top].tid = tid;
+				anon_stack[top++].offset =
+					cur_offset + member->offset;
+			}
+		} else {
+			name = btf_name_by_offset(btf, member->name_off);
+			if (name && !strcmp(member_name, name)) {
+				if (anon_offset)
+					*anon_offset = cur_offset;
+				goto out;
+			}
+		}
+	}
+	if (top > 0) {
+		/* Pop from the anonymous stack and retry */
+		tid = anon_stack[--top].tid;
+		cur_offset = anon_stack[top].offset;
+		type = btf_type_by_id(btf, tid);
+		goto retry;
+	}
+	member = NULL;
+
+out:
+	kfree(anon_stack);
+	return member;
+}
+

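The member search in trace_btf.c scans the named members of a type first and defers anonymous struct/union members to a fixed-size stack, so that ``foo->a`` can resolve through an unnamed union. The following userspace sketch shows that search strategy on a toy type model. Everything in it (``toy_type``, ``toy_member``, ``toy_find_member()``, byte-based offsets) is a hypothetical simplification and not the BTF layout or API. The sketch also keeps the type being iterated and the pushed anonymous type in separate variables, so the loop bound never changes mid-scan.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ANON_STACK_MAX 16

struct toy_type;

/* Hypothetical member record: name is NULL for an anonymous container. */
struct toy_member {
	const char *name;
	unsigned int offset;		/* byte offset inside the parent */
	const struct toy_type *type;	/* nested type, set only when anonymous */
};

struct toy_type {
	unsigned int nr_members;
	const struct toy_member *members;
};

struct anon_entry {
	const struct toy_type *type;
	unsigned int offset;
};

/*
 * Find member_name in type, descending into anonymous members.
 * On success, *anon_offset receives the offset of the enclosing
 * anonymous container relative to the root (0 for a direct member).
 */
static const struct toy_member *toy_find_member(const struct toy_type *type,
						const char *member_name,
						unsigned int *anon_offset)
{
	struct anon_entry stack[ANON_STACK_MAX];
	unsigned int cur_offset = 0;
	int top = 0;

	assert(type && member_name);
	while (type) {
		for (unsigned int i = 0; i < type->nr_members; i++) {
			const struct toy_member *m = &type->members[i];

			if (!m->name) {
				/* Anonymous container: push it for later. */
				if (m->type && top < ANON_STACK_MAX) {
					stack[top].type = m->type;
					stack[top++].offset = cur_offset + m->offset;
				}
			} else if (!strcmp(m->name, member_name)) {
				if (anon_offset)
					*anon_offset = cur_offset;
				return m;
			}
		}
		/* Nothing at this level: pop a pending anonymous container. */
		if (top == 0)
			break;
		type = stack[--top].type;
		cur_offset = stack[top].offset;
	}
	return NULL;
}
```

A caller would add the returned ``*anon_offset`` to the member's own offset to get the position relative to the root structure, matching the contract described in the comment on ``btf_find_struct_member()``.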
kernel/trace/trace_btf.h

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/btf.h>
+
+const struct btf_type *btf_find_func_proto(const char *func_name,
+					   struct btf **btf_p);
+const struct btf_param *btf_get_func_param(const struct btf_type *func_proto,
+					   s32 *nr);
+const struct btf_member *btf_find_struct_member(struct btf *btf,
+						const struct btf_type *type,
+						const char *member_name,
+						u32 *anon_offset);

kernel/trace/trace_eprobe.c

Lines changed: 10 additions & 12 deletions
@@ -41,6 +41,10 @@ struct eprobe_data {
 	struct trace_eprobe *ep;
 };
 
+
+#define for_each_trace_eprobe_tp(ep, _tp) \
+	list_for_each_entry(ep, trace_probe_probe_list(_tp), tp.list)
+
 static int __trace_eprobe_create(int argc, const char *argv[]);
 
 static void trace_event_probe_cleanup(struct trace_eprobe *ep)
@@ -640,7 +644,7 @@ static int disable_eprobe(struct trace_eprobe *ep,
 static int enable_trace_eprobe(struct trace_event_call *call,
 			       struct trace_event_file *file)
 {
-	struct trace_probe *pos, *tp;
+	struct trace_probe *tp;
 	struct trace_eprobe *ep;
 	bool enabled;
 	int ret = 0;
@@ -662,8 +666,7 @@ static int enable_trace_eprobe(struct trace_event_call *call,
 	if (enabled)
 		return 0;
 
-	list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
-		ep = container_of(pos, struct trace_eprobe, tp);
+	for_each_trace_eprobe_tp(ep, tp) {
 		ret = enable_eprobe(ep, file);
 		if (ret)
 			break;
@@ -680,8 +683,7 @@ static int enable_trace_eprobe(struct trace_event_call *call,
 		 */
 		WARN_ON_ONCE(ret != -ENOMEM);
 
-		list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
-			ep = container_of(pos, struct trace_eprobe, tp);
+		for_each_trace_eprobe_tp(ep, tp) {
 			disable_eprobe(ep, file->tr);
 			if (!--cnt)
 				break;
@@ -699,7 +701,7 @@ static int enable_trace_eprobe(struct trace_event_call *call,
 static int disable_trace_eprobe(struct trace_event_call *call,
 				struct trace_event_file *file)
 {
-	struct trace_probe *pos, *tp;
+	struct trace_probe *tp;
 	struct trace_eprobe *ep;
 
 	tp = trace_probe_primary_from_call(call);
@@ -716,10 +718,8 @@ static int disable_trace_eprobe(struct trace_event_call *call,
 		trace_probe_clear_flag(tp, TP_FLAG_PROFILE);
 
 	if (!trace_probe_is_enabled(tp)) {
-		list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
-			ep = container_of(pos, struct trace_eprobe, tp);
+		for_each_trace_eprobe_tp(ep, tp)
 			disable_eprobe(ep, file->tr);
-		}
 	}
 
  out:
@@ -807,13 +807,11 @@ static int trace_eprobe_tp_update_arg(struct trace_eprobe *ep, const char *argv[
 	int ret;
 
 	ret = traceprobe_parse_probe_arg(&ep->tp, i, argv[i], &ctx);
-	if (ret)
-		return ret;
-
 	/* Handle symbols "@" */
 	if (!ret)
 		ret = traceprobe_update_arg(&ep->tp.args[i]);
 
+	traceprobe_finish_parse(&ctx);
 	return ret;
 }

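The eprobe change iterates the outer structure directly by naming the nested list member (``tp.list``) instead of iterating the inner structure and calling ``container_of()`` on each step. A self-contained userspace sketch of that pattern follows. The list helpers are simplified re-implementations written for this example, ``struct probe`` and ``struct eprobe`` are hypothetical stand-ins, and ``for_each_eprobe()`` takes the list head directly, whereas the macro in the hunk above derives it from ``_tp``. It relies on the GNU ``__typeof__`` extension.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified list_for_each_entry(); member may be a nested path. */
#define list_for_each_entry(pos, head, member)				\
	for (pos = container_of((head)->next, __typeof__(*pos), member);	\
	     &pos->member != (head);					\
	     pos = container_of(pos->member.next, __typeof__(*pos), member))

/* Hypothetical stand-ins for trace_probe and trace_eprobe. */
struct probe {
	struct list_head list;
	int id;
};

struct eprobe {
	const char *event;
	struct probe tp;
};

/* Iterate eprobes directly through the list node embedded in ep->tp. */
#define for_each_eprobe(ep, head) \
	list_for_each_entry(ep, head, tp.list)

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	assert(entry && head);
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Example traversal: sum the ids of every eprobe on the list. */
static int sum_probe_ids(struct list_head *head)
{
	struct eprobe *ep;
	int sum = 0;

	for_each_eprobe(ep, head)
		sum += ep->tp.id;
	return sum;
}
```

Because the loop variable is already the outer type, the separate ``pos`` cursor and the per-iteration ``container_of()`` call both disappear from the call sites.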