Commit 53683e4

Merge tag 'trace-ringbuffer-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing ring buffer updates from Steven Rostedt:
 "Add ring_buffer memory mappings.

  The tracing ring buffer was created based on being mostly used with
  the splice system call. It is broken up into page ordered sub-buffers
  and the reader swaps a new sub-buffer with an existing sub-buffer
  that's part of the write buffer. It then has total access to the
  swapped out sub-buffer and can do copyless movements of the memory
  into other mediums (file system, network, etc).

  The buffer is great for passing around the ring buffer contents in
  the kernel, but is not so good for when the consumer is the user
  space task itself.

  A new interface is added that allows user space to memory map the
  ring buffer. It will get all the write sub-buffers as well as reader
  sub-buffer (that is not written to). It can send an ioctl to change
  which sub-buffer is the new reader sub-buffer.

  The ring buffer is read only to user space. It only needs to call the
  ioctl when it is finished with a sub-buffer and needs a new
  sub-buffer that the writer will not write over.

  A self test program was also created for testing and can be used as
  an example for the interface to user space. The libtracefs (external
  to the kernel) also has code that interacts with this, although it is
  disabled until the interface is in an official release. It can be
  enabled by compiling the library with a special flag. This was used
  for testing applications that perform better with the buffer being
  mapped.

  Memory mapped buffers have limitations. The main one is that it can
  not be used with the snapshot logic. If the buffer is mapped,
  snapshots will be disabled. If any logic is set to trigger snapshots
  on a buffer, that buffer will not be allowed to be mapped"

* tag 'trace-ringbuffer-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Add cast to unsigned long addr passed to virt_to_page()
  ring-buffer: Have mmapped ring buffer keep track of missed events
  ring-buffer/selftest: Add ring-buffer mapping test
  Documentation: tracing: Add ring-buffer mapping
  tracing: Allow user-space mapping of the ring-buffer
  ring-buffer: Introducing ring-buffer mapping functions
  ring-buffer: Allocate sub-buffers with __GFP_COMP

2 parents 594d281 + b9c6820 commit 53683e4

11 files changed, +1026 -16 lines changed

Documentation/trace/index.rst

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ Linux Tracing Technologies
    timerlat-tracer
    intel_th
    ring-buffer-design
+   ring-buffer-map
    stm
    sys-t
    coresight/index

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+Tracefs ring-buffer memory mapping
+==================================
+
+:Author: Vincent Donnefort <vdonnefort@google.com>
+
+Overview
+========
+Tracefs ring-buffer memory map provides an efficient method to stream data
+as no memory copy is necessary. The application mapping the ring-buffer then
+becomes a consumer for that ring-buffer, in a similar fashion to trace_pipe.
+
+Memory mapping setup
+====================
+The mapping works with a mmap() of the trace_pipe_raw interface.
+
+The first system page of the mapping contains ring-buffer statistics and
+description. It is referred to as the meta-page. One of the most important
+fields of the meta-page is the reader. It contains the sub-buffer ID which can
+be safely read by the mapper (see ring-buffer-design.rst).
+
+The meta-page is followed by all the sub-buffers, ordered by ascending ID. It is
+therefore straightforward to compute where the reader starts in the mapping:
+
+.. code-block:: c
+
+	reader_id = meta->reader.id;
+	reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;
+
+When the application is done with the current reader, it can get a new one using
+the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also updates
+the meta-page fields.
+
+Limitations
+===========
+When a mapping is in place on a Tracefs ring-buffer, it is not possible to
+resize it (either by increasing the entire size of the ring-buffer or the size
+of each subbuf). It is also not possible to use snapshot, and splice copies
+the ring buffer data instead of using the copyless swap from the ring buffer.
+
+Concurrent readers (either another application mapping that ring-buffer or the
+kernel with trace_pipe) are allowed but not recommended. They will compete for
+the ring-buffer and the output is unpredictable, just like concurrent readers on
+trace_pipe would be.
+
+Example
+=======
+
+.. code-block:: c
+
+	#include <fcntl.h>
+	#include <stdio.h>
+	#include <stdlib.h>
+	#include <unistd.h>
+
+	#include <linux/trace_mmap.h>
+
+	#include <sys/mman.h>
+	#include <sys/ioctl.h>
+
+	#define TRACE_PIPE_RAW "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
+
+	int main(void)
+	{
+		int page_size = getpagesize(), fd, reader_id;
+		unsigned long meta_len, data_len;
+		struct trace_buffer_meta *meta;
+		void *map, *reader, *data;
+
+		fd = open(TRACE_PIPE_RAW, O_RDONLY | O_NONBLOCK);
+		if (fd < 0)
+			exit(EXIT_FAILURE);
+
+		map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
+		if (map == MAP_FAILED)
+			exit(EXIT_FAILURE);
+
+		meta = (struct trace_buffer_meta *)map;
+		meta_len = meta->meta_page_size;
+
+		printf("entries: %llu\n", meta->entries);
+		printf("overrun: %llu\n", meta->overrun);
+		printf("read: %llu\n", meta->read);
+		printf("nr_subbufs: %u\n", meta->nr_subbufs);
+
+		data_len = meta->subbuf_size * meta->nr_subbufs;
+		data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, meta_len);
+		if (data == MAP_FAILED)
+			exit(EXIT_FAILURE);
+
+		if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
+			exit(EXIT_FAILURE);
+
+		reader_id = meta->reader.id;
+		reader = (char *)data + meta->subbuf_size * reader_id;
+
+		printf("Current reader address: %p\n", reader);
+
+		munmap(data, data_len);
+		munmap(meta, meta_len);
+		close(fd);
+
+		return 0;
+	}
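
The offset arithmetic described under "Memory mapping setup" can be sketched as a small bounds-checked helper. This is an illustrative sketch and not part of the commit: `struct meta_view` and `reader_offset()` are hypothetical names, and the struct is only a stand-in for the few `struct trace_buffer_meta` fields involved, so that the snippet builds without kernel headers. Real code would include `<linux/trace_mmap.h>` and read `meta->reader.id` directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the struct trace_buffer_meta fields used in
 * the offset calculation (reader_id corresponds to meta->reader.id).
 */
struct meta_view {
	uint32_t meta_page_size;
	uint32_t subbuf_size;
	uint32_t nr_subbufs;
	uint32_t reader_id;
};

/*
 * Return the byte offset of the reader sub-buffer, measured from the
 * start of a mapping that begins with the meta-page. Returns (size_t)-1
 * when reader_id falls outside the documented range [0, nr_subbufs - 1].
 */
static size_t reader_offset(const struct meta_view *m)
{
	assert(m != NULL);

	if (m->reader_id >= m->nr_subbufs)
		return (size_t)-1;

	/* The meta-page comes first, then the sub-buffers by ascending ID. */
	return (size_t)m->meta_page_size +
	       (size_t)m->reader_id * m->subbuf_size;
}
```

With a 4096-byte meta-page and 4096-byte sub-buffers, reader ID 3 resolves to offset 16384. When the sub-buffers are mapped separately at file offset `meta_page_size`, as in the example program, the `meta_page_size` term drops out and the reader sits at `subbuf_size * reader_id` within that second mapping.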

include/linux/ring_buffer.h

Lines changed: 6 additions & 0 deletions
@@ -6,6 +6,8 @@
 #include <linux/seq_file.h>
 #include <linux/poll.h>
 
+#include <uapi/linux/trace_mmap.h>
+
 struct trace_buffer;
 struct ring_buffer_iter;
 
@@ -223,4 +225,8 @@ int trace_rb_cpu_prepare(unsigned int cpu, struct hlist_node *node);
 #define trace_rb_cpu_prepare NULL
 #endif
 
+int ring_buffer_map(struct trace_buffer *buffer, int cpu,
+		    struct vm_area_struct *vma);
+int ring_buffer_unmap(struct trace_buffer *buffer, int cpu);
+int ring_buffer_map_get_reader(struct trace_buffer *buffer, int cpu);
 #endif /* _LINUX_RING_BUFFER_H */

include/uapi/linux/trace_mmap.h

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _TRACE_MMAP_H_
+#define _TRACE_MMAP_H_
+
+#include <linux/types.h>
+
+/**
+ * struct trace_buffer_meta - Ring-buffer Meta-page description
+ * @meta_page_size:	Size of this meta-page.
+ * @meta_struct_len:	Size of this structure.
+ * @subbuf_size:	Size of each sub-buffer.
+ * @nr_subbufs:		Number of subbufs in the ring-buffer, including the reader.
+ * @reader.lost_events:	Number of events lost at the time of the reader swap.
+ * @reader.id:		subbuf ID of the current reader. ID range [0 : @nr_subbufs - 1]
+ * @reader.read:	Number of bytes read on the reader subbuf.
+ * @flags:		Placeholder for now, 0 until new features are supported.
+ * @entries:		Number of entries in the ring-buffer.
+ * @overrun:		Number of entries lost in the ring-buffer.
+ * @read:		Number of entries that have been read.
+ * @Reserved1:		Internal use only.
+ * @Reserved2:		Internal use only.
+ */
+struct trace_buffer_meta {
+	__u32		meta_page_size;
+	__u32		meta_struct_len;
+
+	__u32		subbuf_size;
+	__u32		nr_subbufs;
+
+	struct {
+		__u64	lost_events;
+		__u32	id;
+		__u32	read;
+	} reader;
+
+	__u64	flags;
+
+	__u64	entries;
+	__u64	overrun;
+	__u64	read;
+
+	__u64	Reserved1;
+	__u64	Reserved2;
+};
+
+#define TRACE_MMAP_IOCTL_GET_READER		_IO('T', 0x1)
+
+#endif /* _TRACE_MMAP_H_ */
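
A consumer may want to sanity-check the meta-page before using its fields to index into the mapping. The sketch below is not part of the commit: `struct trace_buffer_meta_mirror` and `meta_is_sane()` are hypothetical names, and the mirror uses `<stdint.h>` types on the assumption that `__u32` and `__u64` are 32-bit and 64-bit unsigned integers. Only the `reader.id` range check comes from the header's documentation; the other checks are assumptions about what a cautious reader might verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct trace_buffer_meta (assumed fixed-width types). */
struct trace_buffer_meta_mirror {
	uint32_t meta_page_size;
	uint32_t meta_struct_len;
	uint32_t subbuf_size;
	uint32_t nr_subbufs;
	struct {
		uint64_t lost_events;
		uint32_t id;
		uint32_t read;
	} reader;
	uint64_t flags;
	uint64_t entries;
	uint64_t overrun;
	uint64_t read;
	uint64_t Reserved1;
	uint64_t Reserved2;
};

/* Four __u32, the 16-byte reader block, then six __u64: 80 bytes. */
static_assert(sizeof(struct trace_buffer_meta_mirror) == 80,
	      "unexpected meta-page structure size");

/* Return 1 if the meta-page fields look usable, 0 otherwise. */
static int meta_is_sane(const struct trace_buffer_meta_mirror *m)
{
	/* Assumption: the kernel reports at least the layout known here. */
	if (m->meta_struct_len < sizeof(*m))
		return 0;
	/* nr_subbufs includes the reader, so it cannot be zero. */
	if (m->nr_subbufs == 0 || m->subbuf_size == 0)
		return 0;
	/* Documented range: [0 : nr_subbufs - 1]. */
	if (m->reader.id >= m->nr_subbufs)
		return 0;
	/* Assumption: bytes read cannot exceed the sub-buffer size. */
	if (m->reader.read > m->subbuf_size)
		return 0;
	return 1;
}
```

Checking `meta_struct_len` against the size the program was built with is one possible way to notice a layout mismatch; whether a larger value should be accepted is a policy choice left to the caller.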
