Commit 641fdea

Gregory Price authored and davejiang committed
cxl: docs/linux/memory-hotplug

Add documentation on how the CXL driver surfaces memory through the
DAX driver and memory-hotplug.

Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-13-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

1 parent 36e9f71 commit 641fdea

2 files changed

+79
-0
lines changed

Documentation/driver-api/cxl/index.rst

Lines changed: 1 addition & 0 deletions

@@ -37,6 +37,7 @@ that have impacts on each other. The docs here break up configurations steps.
 linux/early-boot
 linux/cxl-driver
 linux/dax-driver
+linux/memory-hotplug
 linux/access-coordinates

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
.. SPDX-License-Identifier: GPL-2.0

==============
Memory Hotplug
==============
The final phase of surfacing CXL memory to the kernel page allocator is for
the `DAX` driver to surface a `Driver Managed` memory region via the
memory-hotplug component.

There are four major configurations to consider:

1) Default Online Behavior (on/off and zone)
2) Hotplug Memory Block size
3) Memory Map Resource location
4) Driver-Managed Memory Designation

Default Online Behavior
=======================
The default-online behavior of hotplug memory is dictated by the following
settings, listed from lowest to highest precedence (each one overrides those
listed before it):

- :code:`CONFIG_MHP_DEFAULT_ONLINE_TYPE` Build Configuration
- :code:`memhp_default_state` Boot parameter
- :code:`/sys/devices/system/memory/auto_online_blocks` value

These dictate which of three states hotplugged memory blocks arrive in:

1) Offline
2) Online in :code:`ZONE_NORMAL`
3) Online in :code:`ZONE_MOVABLE`

:code:`ZONE_NORMAL` implies this capacity may be used for almost any allocation,
while :code:`ZONE_MOVABLE` implies this capacity should only be used for
migratable allocations.

:code:`ZONE_MOVABLE` attempts to retain the hotplug-ability of a memory block
so that the entire region may be hot-unplugged at a later time. Any capacity
onlined into :code:`ZONE_NORMAL` should be considered permanently attached to
the page allocator.
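
As a rough illustration of the states above, the following Python sketch
maps the contents of :code:`auto_online_blocks` onto them. The value strings
(:code:`offline`, :code:`online`, :code:`online_kernel`,
:code:`online_movable`) are taken from the kernel's general memory-hotplug
admin guide rather than from this document, and the helper names are
hypothetical.

```python
from pathlib import Path

AUTO_ONLINE_PATH = "/sys/devices/system/memory/auto_online_blocks"

# Assumption: these are the value strings the kernel accepts and reports.
ONLINE_POLICIES = {
    "offline": "blocks arrive offline",
    "online": "blocks are onlined; the kernel selects the zone",
    "online_kernel": "blocks are onlined into a kernel zone (ZONE_NORMAL)",
    "online_movable": "blocks are onlined into ZONE_MOVABLE",
}


def classify_online_policy(raw):
    """Return (value, description) for the text read from auto_online_blocks."""
    value = raw.strip()
    return value, ONLINE_POLICIES.get(value, "unrecognized value")


def read_online_policy(path=AUTO_ONLINE_PATH):
    """Read and classify the current default-online policy from sysfs."""
    return classify_online_policy(Path(path).read_text())
```

Calling :code:`read_online_policy()` on a running system reports the policy
currently in effect, which may differ from the build-time default if the boot
parameter or the sysfs file has been changed.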

Hotplug Memory Block Size
=========================
By default, on most architectures, the Hotplug Memory Block Size is either
128MB or 256MB. On x86, the block size increases up to 2GB as total memory
capacity exceeds 64GB. As of v6.15, Linux does not take into account the
size and alignment of the ACPI CEDT CFMWS regions (see Early Boot docs) when
deciding the Hotplug Memory Block Size.
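
To see why the alignment of a window matters, the sketch below computes how
much of an address range falls on whole memory blocks. It assumes that only
whole, block-aligned ranges can be hotplugged and that
:code:`/sys/devices/system/memory/block_size_bytes` reports the block size in
hexadecimal. The window base and size in the usage note are made-up numbers,
not values from a real platform.

```python
GIB = 1 << 30


def parse_block_size(text):
    """Parse block_size_bytes contents (assumed hexadecimal, no 0x prefix)."""
    return int(text.strip(), 16)


def aligned_span(base, size, block_size):
    """Return (aligned_start, usable_bytes, unusable_bytes) for [base, base + size).

    The start is rounded up and the end rounded down to the block size, so
    any partial block at either edge is counted as unusable.
    """
    start = -(-base // block_size) * block_size        # round up
    end = ((base + size) // block_size) * block_size   # round down
    usable = max(0, end - start)
    return start, usable, size - usable
```

For a hypothetical 4 GiB window starting at 65 GiB with a 2 GiB block size,
:code:`aligned_span(65 * GIB, 4 * GIB, 2 * GIB)` reports that only 2 GiB of
the window lands on whole blocks.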

Memory Map
==========
The location of :code:`struct folio` allocations to represent the hotplugged
memory capacity is dictated by the following system settings:

- :code:`/sys/module/memory_hotplug/parameters/memmap_on_memory`
- :code:`/sys/bus/dax/devices/daxN.Y/memmap_on_memory`

If both of these parameters are set to true, :code:`struct folio` for this
capacity will be carved out of the memory block being onlined. This has
performance implications if the memory is particularly high-latency and
its :code:`struct folio` becomes hotly contended.

If either parameter is set to false, :code:`struct folio` for this capacity
will be allocated from the local node of the processor running the hotplug
procedure. This capacity will be allocated from :code:`ZONE_NORMAL` on
that node, as it is a :code:`GFP_KERNEL` allocation.
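
The both-must-be-true rule can be sketched as a small Python helper. It
assumes the module parameter reads back as :code:`Y` or :code:`N` and the
DAX device attribute as :code:`1` or :code:`0`; treating :code:`force` as
enabled is a further assumption. The function names are hypothetical.

```python
def param_enabled(text):
    """Interpret a sysfs boolean-like value.

    Assumption: 'Y' and '1' mean enabled; 'force' is also treated as enabled.
    """
    return text.strip().lower() in {"y", "1", "force"}


def memmap_location(module_param, dax_param):
    """Report where the memory map for hotplugged capacity would be placed.

    module_param: contents of the memory_hotplug module parameter
    dax_param:    contents of the DAX device's memmap_on_memory attribute
    """
    if param_enabled(module_param) and param_enabled(dax_param):
        return "hotplugged block"
    return "local ZONE_NORMAL"
```

Passing the contents of the two files listed above, for example
:code:`memmap_location("Y", "0")`, shows that disabling either setting moves
the memory map back to local memory.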

Systems with extremely large amounts of :code:`ZONE_MOVABLE` memory (e.g.
CXL memory pools) must ensure that there is sufficient local
:code:`ZONE_NORMAL` capacity to host the memory map for the hotplugged capacity.
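
A back-of-the-envelope estimate of the memory map size helps with this
sizing. The sketch below assumes 4 KiB base pages and 64 bytes of memory map
per base page, which is typical for 64-bit x86 builds but is not stated in
this document. Both values are parameters so they can be adjusted.

```python
PAGE_SIZE = 4096        # assumption: 4 KiB base pages
MEMMAP_ENTRY_SIZE = 64  # assumption: 64 bytes of memory map per base page


def memmap_bytes(capacity_bytes, page_size=PAGE_SIZE, entry_size=MEMMAP_ENTRY_SIZE):
    """Estimate the memory map size needed to describe capacity_bytes."""
    return (capacity_bytes // page_size) * entry_size
```

Under these assumptions the memory map costs 1/64 of the hotplugged
capacity, so 1 TiB of CXL memory would need about 16 GiB of local
:code:`ZONE_NORMAL` when the memory map is not placed on the hotplugged
memory itself.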

Driver Managed Memory
=====================
The DAX driver surfaces this memory to memory-hotplug as "Driver Managed". This
is not a configurable setting, but it's important to note that driver managed
memory is explicitly excluded from use during kexec. This is required to ensure
any reset or out-of-band operations that the CXL device may be subject to during
a functional system-reboot (such as a reset-on-probe) will not cause portions of
the kexec kernel to be overwritten.
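
One way to check which ranges are driver managed is to look at
:code:`/proc/iomem`. The sketch below assumes that the DAX kmem driver labels
its ranges :code:`System RAM (kmem)`, which distinguishes them from plain
:code:`System RAM` entries. That label is an assumption to verify against
the running kernel, and the function name is hypothetical.

```python
KMEM_LABEL = "System RAM (kmem)"  # assumption: resource name used by dax/kmem


def kmem_ranges(iomem_text):
    """Return (start, end) tuples for driver-managed ranges in /proc/iomem text."""
    ranges = []
    for line in iomem_text.splitlines():
        span, sep, name = line.strip().partition(" : ")
        if not sep or name != KMEM_LABEL:
            continue
        start, _, end = span.partition("-")
        ranges.append((int(start, 16), int(end, 16)))
    return ranges
```

The function takes the file contents as a string, for example
:code:`kmem_ranges(open("/proc/iomem").read())`. Unprivileged reads of
:code:`/proc/iomem` typically show zeroed addresses, so this is most useful
when run as root.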