Commit 8efd0d9

Merge tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core
  ----

   - Defer freeing TCP skbs to the BH handler, whenever possible, or at
     least perform the freeing outside of the socket lock section to
     decrease cross-CPU allocator work and improve latency.

   - Add netdevice refcount tracking to locate sources of netdevice and
     net namespace refcount leaks.

   - Make Tx watchdog less intrusive - avoid pausing Tx and restarting
     all queues from a single CPU removing latency spikes.

   - Various small optimizations throughout the stack from Eric Dumazet.

   - Make netdev->dev_addr[] constant, force modifications to go via
     appropriate helpers to allow us to keep addresses in ordered data
     structures.

   - Replace unix_table_lock with per-hash locks, improving performance
     of bind() calls.

   - Extend skb drop tracepoint with a drop reason.

   - Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW.

  BPF
  ---

   - New helpers:
      - bpf_find_vma(), find and inspect VMAs for profiling use cases
      - bpf_loop(), runtime-bounded loop helper trading some execution
        time for much faster (if at all converging) verification
      - bpf_strncmp(), improve performance, avoid compiler flakiness
      - bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt()
        for tracing programs, all inlined by the verifier

   - Support BPF relocations (CO-RE) in the kernel loader.

   - Further the support for BTF_TYPE_TAG annotations.

   - Allow access to local storage in sleepable helpers.

   - Convert verifier argument types to a composable form with different
     attributes which can be shared across types (ro, maybe-null).

   - Prepare libbpf for upcoming v1.0 release by cleaning up APIs,
     creating new, extensible ones where missing and deprecating those
     to be removed.

  Protocols
  ---------

   - WiFi (mac80211/cfg80211):
      - notify user space about long "come back in N" AP responses,
        allow it to react to such temporary rejections
      - allow non-standard VHT MCS 10/11 rates
      - use coarse time in airtime fairness code to save CPU cycles

   - Bluetooth:
      - rework of HCI command execution serialization to use a common
        queue and work struct, and improve handling errors reported in
        the middle of a batch of commands
      - rework HCI event handling to use skb_pull_data, avoiding packet
        parsing pitfalls
      - support AOSP Bluetooth Quality Report

   - SMC:
      - support net namespaces, following the RDMA model
      - improve connection establishment latency by pre-clearing buffers
      - introduce TCP ULP for automatic redirection to SMC

   - Multi-Path TCP:
      - support ioctls: SIOCINQ, OUTQ, and OUTQNSD
      - support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT,
        IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY
      - support cmsgs: TCP_INQ
      - improvements in the data scheduler (assigning data to subflows)
      - support fastclose option (quick shutdown of the full MPTCP
        connection, similar to TCP RST in regular TCP)

   - MCTP (Management Component Transport) over serial, as defined by
     DMTF spec DSP0253 - "MCTP Serial Transport Binding".

  Driver API
  ----------

   - Support timestamping on bond interfaces in active/passive mode.

   - Introduce generic phylink link mode validation for drivers which
     don't have any quirks and where MAC capability bits fully express
     what's supported. Allow PCS layer to participate in the validation.
     Convert a number of drivers.

   - Add support to set/get size of buffers on the Rx rings and size of
     the tx copybreak buffer via ethtool.

   - Support offloading TC actions as first-class citizens rather than
     only as attributes of filters, improve sharing and device resource
     utilization.

   - WiFi (mac80211/cfg80211):
      - support forwarding offload (ndo_fill_forward_path)
      - support for background radar detection hardware
      - SA Query Procedures offload on the AP side

  New hardware / drivers
  ----------------------

   - tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with
     real-time requirements for isochronous communication with protocols
     like OPC UA Pub/Sub.

   - Qualcomm BAM-DMUX WWAN - driver for data channels of modems
     integrated into many older Qualcomm SoCs, e.g. MSM8916 or MSM8974
     (qcom_bam_dmux).

   - Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch driver
     with support for bridging, VLANs and multicast forwarding
     (lan966x).

   - iwlmei driver for co-operating between Intel's WiFi driver and
     Intel's Active Management Technology (AMT) devices.

   - mse102x - Vertexcom MSE102x Homeplug GreenPHY chips

   - Bluetooth:
      - MediaTek MT7921 SDIO devices
      - Foxconn MT7922A
      - Realtek RTL8852AE

  Drivers
  -------

   - Significantly improve performance in the datapaths of: lan78xx,
     ax88179_178a, lantiq_xrx200, bnxt.

   - Intel Ethernet NICs:
      - igb: support PTP/time PEROUT and EXTTS SDP functions on
        82580/i354/i350 adapters
      - ixgbevf: new PF -> VF mailbox API which avoids the risk of
        mailbox corruption with ESXi
      - iavf: support configuration of VLAN features of finer
        granularity, stacked tags and filtering
      - ice: PTP support for new E822 devices with sub-ns precision
      - ice: support firmware activation without reboot

   - Mellanox Ethernet NICs (mlx5):
      - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
      - support TC forwarding when tunnel encap and decap happen between
        two ports of the same NIC
      - dynamically size and allow disabling various features to save
        resources for running in embedded / SmartNIC scenarios

   - Broadcom Ethernet NICs (bnxt):
      - use page frag allocator to improve Rx performance
      - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool

   - Other Ethernet NICs:
      - amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support

   - Microsoft cloud/virtual NIC (mana):
      - add XDP support (PASS, DROP, TX)

   - Mellanox Ethernet switches (mlxsw):
      - initial support for Spectrum-4 ASICs
      - VxLAN with IPv6 underlay

   - Marvell Ethernet switches (prestera):
      - support flower flow templates
      - add basic IP forwarding support

   - NXP embedded Ethernet switches (ocelot & felix):
      - support Per-Stream Filtering and Policing (PSFP)
      - enable cut-through forwarding between ports by default
      - support FDMA to improve packet Rx/Tx to CPU

   - Other embedded switches:
      - hellcreek: improve trapping management (STP and PTP) packets
      - qca8k: support link aggregation and port mirroring

   - Qualcomm 802.11ax WiFi (ath11k):
      - qca6390, wcn6855: enable 802.11 power save mode in station mode
      - BSS color change support
      - WCN6855 hw2.1 support
      - 11d scan offload support
      - scan MAC address randomization support
      - full monitor mode, only supported on QCN9074
      - qca6390/wcn6855: report signal and tx bitrate
      - qca6390: rfkill support
      - qca6390/wcn6855: regdb.bin support

   - Intel WiFi (iwlwifi):
      - support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS)
        in cooperation with the BIOS
      - support for Optimized Connectivity Experience (OCE) scan
      - support firmware API version 68
      - lots of preparatory work for the upcoming Bz device family

   - MediaTek WiFi (mt76):
      - Specific Absorption Rate (SAR) support
      - mt7921: 160 MHz channel support

   - RealTek WiFi (rtw88):
      - Specific Absorption Rate (SAR) support
      - scan offload

   - Other WiFi NICs
      - ath10k: support fetching (pre-)calibration data from nvmem
      - brcmfmac: configure keep-alive packet on suspend
      - wcn36xx: beacon filter support"

* tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2048 commits)
  tcp: tcp_send_challenge_ack delete useless param `skb`
  net/qla3xxx: Remove useless DMA-32 fallback configuration
  rocker: Remove useless DMA-32 fallback configuration
  hinic: Remove useless DMA-32 fallback configuration
  lan743x: Remove useless DMA-32 fallback configuration
  net: enetc: Remove useless DMA-32 fallback configuration
  cxgb4vf: Remove useless DMA-32 fallback configuration
  cxgb4: Remove useless DMA-32 fallback configuration
  cxgb3: Remove useless DMA-32 fallback configuration
  bnx2x: Remove useless DMA-32 fallback configuration
  et131x: Remove useless DMA-32 fallback configuration
  be2net: Remove useless DMA-32 fallback configuration
  vmxnet3: Remove useless DMA-32 fallback configuration
  bna: Simplify DMA setting
  net: alteon: Simplify DMA setting
  myri10ge: Simplify DMA setting
  qlcnic: Simplify DMA setting
  net: allwinner: Fix print format
  page_pool: remove spinlock in page_pool_refill_alloc_cache()
  amt: fix wrong return type of amt_send_membership_update()
  ...
2 parents 9bcbf89 + 8aaaf2f commit 8efd0d9

2,006 files changed, +111849 -44860 lines changed

Documentation/bpf/btf.rst

Lines changed: 34 additions & 23 deletions
@@ -3,7 +3,7 @@ BPF Type Format (BTF)
 =====================

 1. Introduction
-***************
+===============

 BTF (BPF Type Format) is the metadata format which encodes the debug info
 related to BPF program/map. The name BTF was used initially to describe data
@@ -30,7 +30,7 @@ sections are discussed in details in :ref:`BTF_Type_String`.
 .. _BTF_Type_String:

 2. BTF Type and String Encoding
-*******************************
+===============================

 The file ``include/uapi/linux/btf.h`` provides high-level definition of how
 types/strings are encoded.
@@ -57,13 +57,13 @@ little-endian target. The ``btf_header`` is designed to be extensible with
 generated.

 2.1 String Encoding
-===================
+-------------------

 The first string in the string section must be a null string. The rest of
 string table is a concatenation of other null-terminated strings.

 2.2 Type Encoding
-=================
+-----------------

 The type id ``0`` is reserved for ``void`` type. The type section is parsed
 sequentially and type id is assigned to each recognized type starting from id
@@ -86,6 +86,7 @@ sequentially and type id is assigned to each recognized type starting from id
 #define BTF_KIND_DATASEC 15 /* Section */
 #define BTF_KIND_FLOAT 16 /* Floating point */
 #define BTF_KIND_DECL_TAG 17 /* Decl Tag */
+#define BTF_KIND_TYPE_TAG 18 /* Type Tag */

 Note that the type section encodes debug info, not just pure types.
 ``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.
@@ -107,7 +108,7 @@ Each type contains the following common data::
 * "size" tells the size of the type it is describing.
 *
 * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
-* FUNC, FUNC_PROTO and DECL_TAG.
+* FUNC, FUNC_PROTO, DECL_TAG and TYPE_TAG.
 * "type" is a type_id referring to another type.
 */
 union {
@@ -492,8 +493,18 @@ the attribute is applied to a ``struct``/``union`` member or
 a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
 valid index (starting from 0) pointing to a member or an argument.

+2.2.17 BTF_KIND_TYPE_TAG
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+* ``name_off``: offset to a non-empty string
+* ``info.kind_flag``: 0
+* ``info.kind``: BTF_KIND_TYPE_TAG
+* ``info.vlen``: 0
+* ``type``: the type with ``btf_type_tag`` attribute
+
 3. BTF Kernel API
-*****************
+=================

 The following bpf syscall command involves BTF:
 * BPF_BTF_LOAD: load a blob of BTF data into kernel
@@ -536,14 +547,14 @@ The workflow typically looks like:


 3.1 BPF_BTF_LOAD
-================
+----------------

 Load a blob of BTF data into kernel. A blob of data, described in
 :ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
 is returned to a userspace.

 3.2 BPF_MAP_CREATE
-==================
+------------------

 A map can be created with ``btf_fd`` and specified key/value type id.::

@@ -570,7 +581,7 @@ automatically.
 .. _BPF_Prog_Load:

 3.3 BPF_PROG_LOAD
-=================
+-----------------

 During prog_load, func_info and line_info can be passed to kernel with proper
 values for the following attributes:
@@ -620,7 +631,7 @@ For line_info, the line number and column number are defined as below:
 #define BPF_LINE_INFO_LINE_COL(line_col) ((line_col) & 0x3ff)

 3.4 BPF_{PROG,MAP}_GET_NEXT_ID
-==============================
+------------------------------

 In kernel, every loaded program, map or btf has a unique id. The id won't
 change during the lifetime of a program, map, or btf.
@@ -630,13 +641,13 @@ each command, to user space, for bpf program or maps, respectively, so an
 inspection tool can inspect all programs and maps.

 3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
-===============================
+-------------------------------

 An introspection tool cannot use id to get details about program or maps.
 A file descriptor needs to be obtained first for reference-counting purpose.

 3.6 BPF_OBJ_GET_INFO_BY_FD
-==========================
+--------------------------

 Once a program/map fd is acquired, an introspection tool can get the detailed
 information from kernel about this fd, some of which are BTF-related. For
@@ -645,7 +656,7 @@ example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
 bpf byte codes, and jited_line_info.

 3.7 BPF_BTF_GET_FD_BY_ID
-========================
+------------------------

 With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
 syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
@@ -657,18 +668,18 @@ tool has full btf knowledge and is able to pretty print map key/values, dump
 func signatures and line info, along with byte/jit codes.

 4. ELF File Format Interface
-****************************
+============================

 4.1 .BTF section
-================
+----------------

 The .BTF section contains type and string data. The format of this section is
 same as the one describe in :ref:`BTF_Type_String`.

 .. _BTF_Ext_Section:

 4.2 .BTF.ext section
-====================
+--------------------

 The .BTF.ext section encodes func_info and line_info which needs loader
 manipulation before loading into the kernel.
@@ -732,7 +743,7 @@ bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
 beginning of section (``btf_ext_info_sec->sec_name_off``).

 4.2 .BTF_ids section
-====================
+--------------------

 The .BTF_ids section encodes BTF ID values that are used within the kernel.

@@ -793,10 +804,10 @@ All the BTF ID lists and sets are compiled in the .BTF_ids section and
 resolved during the linking phase of kernel build by ``resolve_btfids`` tool.

 5. Using BTF
-************
+============

 5.1 bpftool map pretty print
-============================
+----------------------------

 With BTF, the map key/value can be printed based on fields rather than simply
 raw bytes. This is especially valuable for large structure or if your data
@@ -838,7 +849,7 @@ bpftool is able to pretty print like below:
 ]

 5.2 bpftool prog dump
-=====================
+---------------------

 The following is an example showing how func_info and line_info can help prog
 dump with better kernel symbol names, function prototypes and line
@@ -872,7 +883,7 @@ information.::
 [...]

 5.3 Verifier Log
-================
+----------------

 The following is an example of how line_info can help debugging verification
 failure.::
@@ -898,7 +909,7 @@ failure.::
 R2 offset is outside of the packet

 6. BTF Generation
-*****************
+=================

 You need latest pahole

@@ -1005,6 +1016,6 @@ format.::
 .long 8206 # Line 8 Col 14

 7. Testing
-**********
+==========

 Kernel bpf selftest `test_btf.c` provides extensive set of BTF-related tests.
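
The section 2.2.17 added by this diff lists the field requirements for a ``BTF_KIND_TYPE_TAG`` record. As a rough illustration only, the Python sketch below packs one such ``struct btf_type`` record. It assumes the ``info`` bit layout as I understand it from ``include/uapi/linux/btf.h`` (``vlen`` in bits 0-15, ``kind`` in bits 24-28, ``kind_flag`` in bit 31) and a little-endian target. The helper names ``btf_info`` and ``encode_type_tag`` are hypothetical and not part of any kernel or libbpf API.

```python
import struct

BTF_KIND_TYPE_TAG = 18  # value added to the kind list by this diff


def btf_info(kind, vlen=0, kind_flag=0):
    """Pack the 32-bit btf_type.info word (assumed layout, see lead-in)."""
    if not (0 <= vlen <= 0xFFFF and 0 <= kind <= 0x1F and kind_flag in (0, 1)):
        raise ValueError("field out of range")
    return (kind_flag << 31) | (kind << 24) | vlen


def encode_type_tag(name_off, target_type_id):
    """Encode one struct btf_type record (three u32 fields) for a type tag.

    name_off must be non-zero: offset 0 is the mandatory null string, and
    the doc requires the tag name to be a non-empty string.
    """
    if name_off == 0:
        raise ValueError("name_off must point at a non-empty string")
    # kind_flag and vlen are both 0 for BTF_KIND_TYPE_TAG.
    info = btf_info(BTF_KIND_TYPE_TAG)
    return struct.pack("<III", name_off, info, target_type_id)


# Hypothetical values: tag name at string offset 1, tagging type id 2.
record = encode_type_tag(name_off=1, target_type_id=2)
name_off, info, type_id = struct.unpack("<III", record)
print(hex(info))  # prints 0x12000000
```

The ``type`` field reuses the union member that the updated header comment now lists for ``TYPE_TAG``, so the record carries no trailing data after the three common words.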

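The context lines ``BPF_LINE_INFO_LINE_COL(line_col) ((line_col) & 0x3ff)`` and ``.long 8206 # Line 8 Col 14`` in the diff above use the same packed ``line_col`` value. A minimal Python sketch of that packing follows. It assumes the line number occupies the bits above the low 10, which matches the companion ``BPF_LINE_INFO_LINE_NUM`` macro as I recall it but is not shown in these hunks. The function names are hypothetical.

```python
def split_line_col(line_col):
    """Split a packed line_col value into (line, column).

    The column is the low 10 bits (mask 0x3ff, as in the diff's macro);
    the line number is assumed to be the remaining upper bits.
    """
    return line_col >> 10, line_col & 0x3FF


def pack_line_col(line, col):
    """Inverse of split_line_col; col must fit in 10 bits."""
    if not 0 <= col <= 0x3FF:
        raise ValueError("column does not fit in 10 bits")
    return (line << 10) | col


print(split_line_col(8206))  # prints (8, 14)
```

This agrees with the ``# Line 8 Col 14`` annotation on the ``.long 8206`` line: 8 * 1024 + 14 = 8206.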