Skip to content

Commit 88221ac

Browse files
committed
Merge tag 'trace-latency-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull latency tracing updates from Steven Rostedt: - Add some trace events to osnoise and timerlat sample generation This adds more information to the osnoise and timerlat tracers as well as allows BPF programs to be attached to these locations to extract even more data. - Fix to DECLARE_TRACE_CONDITION() macro It wasn't used but now will be and it happened to be broken causing the build to fail. - Add scheduler specification monitors to runtime verifier (RV) This is a continuation of Daniel Bristot's work. RV allows monitors to run and react concurrently. Running the cumulative model is equivalent to running single components using the same reactors, with the advantage that it's easier to point out which specification failed in case of error. This update introduces nested monitors to RV, in short, the sysfs monitor folder will contain a monitor named sched, which is nothing but an empty container for other monitors. Controlling the sched monitor (enable, disable, set reactors) controls all nested monitors. The following scheduling monitors are added: - sco: scheduling context operations Monitor to ensure sched_set_state happens only in thread context - tss: task switch while scheduling Monitor to ensure sched_switch happens only in scheduling context - snroc: set non runnable on its own context Monitor to ensure set_state happens only in the respective task's context - scpd: schedule called with preemption disabled Monitor to ensure schedule is called with preemption disabled - snep: schedule does not enable preempt Monitor to ensure schedule does not enable preempt - sncid: schedule not called with interrupt disabled Monitor to ensure schedule is not called with interrupt disabled * tag 'trace-latency-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tools/rv: Allow rv list to filter for container Documentation/rv: Add docs for the sched monitors verification/dot2k: Add support for nested monitors tools/rv: Add support for nested monitors rv: Add scpd, snep and sncid per-cpu monitors rv: Add snroc per-task monitor rv: Add sco and tss per-cpu monitors rv: Add option for nested monitors and include sched sched: Add sched tracepoints for RV task model rv: Add license identifiers to monitor files tracing: Fix DECLARE_TRACE_CONDITION trace/osnoise: Add trace events for samples
2 parents 31eb415 + 4ffef95 commit 88221ac

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

64 files changed

+2135
-166
lines changed
Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,69 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
============
4+
rv-mon-sched
5+
============
6+
-----------------------------
7+
Scheduler monitors collection
8+
-----------------------------
9+
10+
:Manual section: 1
11+
12+
SYNOPSIS
13+
========
14+
15+
**rv mon sched** [*OPTIONS*]
16+
17+
**rv mon <NESTED_MONITOR>** [*OPTIONS*]
18+
19+
**rv mon sched:<NESTED_MONITOR>** [*OPTIONS*]
20+
21+
DESCRIPTION
22+
===========
23+
24+
The scheduler monitor collection is a container for several monitors to model
25+
the behaviour of the scheduler. Each monitor describes a specification that
26+
the scheduler should follow.
27+
28+
As a monitor container, it will enable all nested monitors and set them
29+
according to OPTIONS.
30+
Nevertheless nested monitors can also be activated independently both by name
31+
and by specifying sched: , e.g. to enable only monitor tss you can do any of:
32+
33+
# rv mon sched:tss
34+
35+
# rv mon tss
36+
37+
See kernel documentation for further information about this monitor:
38+
<https://docs.kernel.org/trace/rv/monitor_sched.html>
39+
40+
OPTIONS
41+
=======
42+
43+
.. include:: common_ikm.rst
44+
45+
NESTED MONITOR
46+
==============
47+
48+
The available nested monitors are:
49+
* scpd: schedule called with preemption disabled
50+
* snep: schedule does not enable preempt
51+
* sncid: schedule not called with interrupt disabled
52+
* snroc: set non runnable on its own context
53+
* sco: scheduling context operations
54+
* tss: task switch while scheduling
55+
56+
SEE ALSO
57+
========
58+
59+
**rv**\(1), **rv-mon**\(1)
60+
61+
Linux kernel *RV* documentation:
62+
<https://www.kernel.org/doc/html/latest/trace/rv/index.html>
63+
64+
AUTHOR
65+
======
66+
67+
Written by Gabriele Monaco <gmonaco@redhat.com>
68+
69+
.. include:: common_appendix.rst
Lines changed: 171 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,171 @@
1+
Scheduler monitors
2+
==================
3+
4+
- Name: sched
5+
- Type: container for multiple monitors
6+
- Author: Gabriele Monaco <gmonaco@redhat.com>, Daniel Bristot de Oliveira <bristot@kernel.org>
7+
8+
Description
9+
-----------
10+
11+
Monitors describing complex systems, such as the scheduler, can easily grow to
12+
the point where they are just hard to understand because of the many possible
13+
state transitions.
14+
Often it is possible to break such descriptions into smaller monitors,
15+
sharing some or all events. Enabling those smaller monitors concurrently is,
16+
in fact, testing the system as if we had one single larger monitor.
17+
Splitting models into multiple specification is not only easier to
18+
understand, but gives some more clues when we see errors.
19+
20+
The sched monitor is a set of specifications to describe the scheduler behaviour.
21+
It includes several per-cpu and per-task monitors that work independently to verify
22+
different specifications the scheduler should follow.
23+
24+
To make this system as straightforward as possible, sched specifications are *nested*
25+
monitors, whereas sched itself is a *container*.
26+
From the interface perspective, sched includes other monitors as sub-directories,
27+
enabling/disabling or setting reactors to sched, propagates the change to all monitors,
28+
however single monitors can be used independently as well.
29+
30+
It is important that future modules are built after their container (sched, in
31+
this case), otherwise the linker would not respect the order and the nesting
32+
wouldn't work as expected.
33+
To do so, simply add them after sched in the Makefile.
34+
35+
Specifications
36+
--------------
37+
38+
The specifications included in sched are currently a work in progress, adapting the ones
39+
defined in by Daniel Bristot in [1].
40+
41+
Currently we included the following:
42+
43+
Monitor tss
44+
~~~~~~~~~~~
45+
46+
The task switch while scheduling (tss) monitor ensures a task switch happens
47+
only in scheduling context, that is inside a call to `__schedule`::
48+
49+
|
50+
|
51+
v
52+
+-----------------+
53+
| thread | <+
54+
+-----------------+ |
55+
| |
56+
| schedule_entry | schedule_exit
57+
v |
58+
sched_switch |
59+
+--------------- |
60+
| sched |
61+
+--------------> -+
62+
63+
Monitor sco
64+
~~~~~~~~~~~
65+
66+
The scheduling context operations (sco) monitor ensures changes in a task state
67+
happen only in thread context::
68+
69+
70+
|
71+
|
72+
v
73+
sched_set_state +------------------+
74+
+------------------ | |
75+
| | thread_context |
76+
+-----------------> | | <+
77+
+------------------+ |
78+
| |
79+
| schedule_entry | schedule_exit
80+
v |
81+
|
82+
scheduling_context -+
83+
84+
Monitor snroc
85+
~~~~~~~~~~~~~
86+
87+
The set non runnable on its own context (snroc) monitor ensures changes in a
88+
task state happens only in the respective task's context. This is a per-task
89+
monitor::
90+
91+
|
92+
|
93+
v
94+
+------------------+
95+
| other_context | <+
96+
+------------------+ |
97+
| |
98+
| sched_switch_in | sched_switch_out
99+
v |
100+
sched_set_state |
101+
+------------------ |
102+
| own_context |
103+
+-----------------> -+
104+
105+
Monitor scpd
106+
~~~~~~~~~~~~
107+
108+
The schedule called with preemption disabled (scpd) monitor ensures schedule is
109+
called with preemption disabled::
110+
111+
|
112+
|
113+
v
114+
+------------------+
115+
| cant_sched | <+
116+
+------------------+ |
117+
| |
118+
| preempt_disable | preempt_enable
119+
v |
120+
schedule_entry |
121+
schedule_exit |
122+
+----------------- can_sched |
123+
| |
124+
+----------------> -+
125+
126+
Monitor snep
127+
~~~~~~~~~~~~
128+
129+
The schedule does not enable preempt (snep) monitor ensures a schedule call
130+
does not enable preemption::
131+
132+
|
133+
|
134+
v
135+
preempt_disable +------------------------+
136+
preempt_enable | |
137+
+------------------ | non_scheduling_context |
138+
| | |
139+
+-----------------> | | <+
140+
+------------------------+ |
141+
| |
142+
| schedule_entry | schedule_exit
143+
v |
144+
|
145+
scheduling_contex -+
146+
147+
Monitor sncid
148+
~~~~~~~~~~~~~
149+
150+
The schedule not called with interrupt disabled (sncid) monitor ensures
151+
schedule is not called with interrupt disabled::
152+
153+
|
154+
|
155+
v
156+
schedule_entry +--------------+
157+
schedule_exit | |
158+
+----------------- | can_sched |
159+
| | |
160+
+----------------> | | <+
161+
+--------------+ |
162+
| |
163+
| irq_disable | irq_enable
164+
v |
165+
|
166+
cant_sched -+
167+
168+
References
169+
----------
170+
171+
[1] - https://bristot.me/linux-task-model

include/linux/rv.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77
#ifndef _LINUX_RV_H
88
#define _LINUX_RV_H
99

10-
#define MAX_DA_NAME_LEN 24
10+
#define MAX_DA_NAME_LEN 32
1111

1212
#ifdef CONFIG_RV
1313
/*
@@ -56,7 +56,7 @@ struct rv_monitor {
5656

5757
bool rv_monitoring_on(void);
5858
int rv_unregister_monitor(struct rv_monitor *monitor);
59-
int rv_register_monitor(struct rv_monitor *monitor);
59+
int rv_register_monitor(struct rv_monitor *monitor, struct rv_monitor *parent);
6060
int rv_get_task_monitor_slot(void);
6161
void rv_put_task_monitor_slot(int slot);
6262

include/linux/sched.h

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -46,6 +46,7 @@
4646
#include <linux/rv.h>
4747
#include <linux/livepatch_sched.h>
4848
#include <linux/uidgid_types.h>
49+
#include <linux/tracepoint-defs.h>
4950
#include <asm/kmap_size.h>
5051

5152
/* task_struct member predeclarations (sorted alphabetically): */
@@ -187,6 +188,12 @@ struct user_event_mm;
187188
# define debug_rtlock_wait_restore_state() do { } while (0)
188189
#endif
189190

191+
#define trace_set_current_state(state_value) \
192+
do { \
193+
if (tracepoint_enabled(sched_set_state_tp)) \
194+
__trace_set_current_state(state_value); \
195+
} while (0)
196+
190197
/*
191198
* set_current_state() includes a barrier so that the write of current->__state
192199
* is correctly serialised wrt the caller's subsequent test of whether to
@@ -227,12 +234,14 @@ struct user_event_mm;
227234
#define __set_current_state(state_value) \
228235
do { \
229236
debug_normal_state_change((state_value)); \
237+
trace_set_current_state(state_value); \
230238
WRITE_ONCE(current->__state, (state_value)); \
231239
} while (0)
232240

233241
#define set_current_state(state_value) \
234242
do { \
235243
debug_normal_state_change((state_value)); \
244+
trace_set_current_state(state_value); \
236245
smp_store_mb(current->__state, (state_value)); \
237246
} while (0)
238247

@@ -248,6 +257,7 @@ struct user_event_mm;
248257
\
249258
raw_spin_lock_irqsave(&current->pi_lock, flags); \
250259
debug_special_state_change((state_value)); \
260+
trace_set_current_state(state_value); \
251261
WRITE_ONCE(current->__state, (state_value)); \
252262
raw_spin_unlock_irqrestore(&current->pi_lock, flags); \
253263
} while (0)
@@ -283,6 +293,7 @@ struct user_event_mm;
283293
raw_spin_lock(&current->pi_lock); \
284294
current->saved_state = current->__state; \
285295
debug_rtlock_wait_set_state(); \
296+
trace_set_current_state(TASK_RTLOCK_WAIT); \
286297
WRITE_ONCE(current->__state, TASK_RTLOCK_WAIT); \
287298
raw_spin_unlock(&current->pi_lock); \
288299
} while (0);
@@ -292,6 +303,7 @@ struct user_event_mm;
292303
lockdep_assert_irqs_disabled(); \
293304
raw_spin_lock(&current->pi_lock); \
294305
debug_rtlock_wait_restore_state(); \
306+
trace_set_current_state(current->saved_state); \
295307
WRITE_ONCE(current->__state, current->saved_state); \
296308
current->saved_state = TASK_RUNNING; \
297309
raw_spin_unlock(&current->pi_lock); \
@@ -328,6 +340,10 @@ extern void io_schedule_finish(int token);
328340
extern long io_schedule_timeout(long timeout);
329341
extern void io_schedule(void);
330342

343+
/* wrapper function to trace from this header file */
344+
DECLARE_TRACEPOINT(sched_set_state_tp);
345+
extern void __trace_set_current_state(int state_value);
346+
331347
/**
332348
* struct prev_cputime - snapshot of system and user cputime
333349
* @utime: time spent in user mode

include/trace/define_trace.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -76,6 +76,10 @@
7676
#define DECLARE_TRACE(name, proto, args) \
7777
DEFINE_TRACE(name, PARAMS(proto), PARAMS(args))
7878

79+
#undef DECLARE_TRACE_CONDITION
80+
#define DECLARE_TRACE_CONDITION(name, proto, args, cond) \
81+
DEFINE_TRACE(name, PARAMS(proto), PARAMS(args))
82+
7983
/* If requested, create helpers for calling these tracepoints from Rust. */
8084
#ifdef CREATE_RUST_TRACE_POINTS
8185
#undef DEFINE_RUST_DO_TRACE
@@ -108,6 +112,8 @@
108112
/* Make all open coded DECLARE_TRACE nops */
109113
#undef DECLARE_TRACE
110114
#define DECLARE_TRACE(name, proto, args)
115+
#undef DECLARE_TRACE_CONDITION
116+
#define DECLARE_TRACE_CONDITION(name, proto, args, cond)
111117

112118
#ifdef TRACEPOINTS_ENABLED
113119
#include <trace/trace_events.h>
@@ -129,6 +135,7 @@
129135
#undef DEFINE_EVENT_CONDITION
130136
#undef TRACE_HEADER_MULTI_READ
131137
#undef DECLARE_TRACE
138+
#undef DECLARE_TRACE_CONDITION
132139

133140
/* Only undef what we defined in this file */
134141
#ifdef UNDEF_TRACE_INCLUDE_FILE

0 commit comments

Comments
 (0)