Commit d1ba364

Gregory Price authored and davejiang committed
cxl: docs/platform/acpi reference documentation
Add basic ACPI table information needed to understand the CXL driver probe process.

Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-6-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
1 parent e4528b9 commit d1ba364

7 files changed: +264 -0 lines changed

Documentation/driver-api/cxl/index.rst

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ that have impacts on each other. The docs here break up configurations steps.
    :caption: Platform Configuration
 
    platform/bios-and-efi
+   platform/acpi
 
 .. toctree::
    :maxdepth: 1
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========
+ACPI Tables
+===========
+
+ACPI is the "Advanced Configuration and Power Interface", which is a standard
+that defines how platforms and OSes manage power and configure computer hardware.
+For the purpose of this theory of operation, when referring to "ACPI" we will
+usually mean "ACPI Tables", which are the way a platform (BIOS/EFI)
+communicates static configuration information to the operating system.
+
+The following ACPI tables contain *static* configuration and performance data
+about CXL devices.
+
+.. toctree::
+   :maxdepth: 1
+
+   acpi/cedt.rst
+   acpi/srat.rst
+   acpi/hmat.rst
+   acpi/slit.rst
+   acpi/dsdt.rst
+
+The SRAT table may also contain generic port/initiator content that is intended
+to describe the generic port, but not information about the rest of the path to
+the endpoint.
+
+Linux uses these tables to configure kernel resources for statically configured
+(by BIOS/EFI) CXL devices, such as:
+
+- NUMA nodes
+- Memory Tiers
+- NUMA Abstract Distances
+- SystemRAM Memory Regions
+- Weighted Interleave Node Weights
+
+ACPI Debugging
+==============
+
+The :code:`acpidump -b` command dumps the ACPI tables to binary files.
+
+The :code:`iasl -d` command disassembles the files into human-readable format.
+
+Example :code:`acpidump -b && iasl -d cedt.dat` ::
+
+   [000h 0000 4] Signature : "CEDT" [CXL Early Discovery Table]
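Before disassembling a dumped table it can help to sanity-check the binary itself. The sketch below parses the standard 36-byte ACPI table header and verifies the table checksum. It is a minimal illustration, not part of the kernel or of the ACPICA tools; ``parse_acpi_header`` and ``make_fake_table`` are hypothetical names, and the synthetic table stands in for a real ``cedt.dat``.

```python
import struct
from typing import NamedTuple

# Standard ACPI table header: signature, length, revision, checksum,
# OEM ID, OEM table ID, OEM revision, creator ID, creator revision.
_HEADER = struct.Struct("<4sIBB6s8sI4sI")  # 36 bytes, little-endian


class AcpiHeader(NamedTuple):
    signature: str
    length: int
    revision: int
    checksum_ok: bool


def parse_acpi_header(blob: bytes) -> AcpiHeader:
    """Parse the table header and verify the whole-table checksum."""
    if len(blob) < _HEADER.size:
        raise ValueError("blob is shorter than an ACPI table header")
    sig, length, revision, *_ = _HEADER.unpack_from(blob)
    # All bytes of a valid table sum to zero modulo 256.
    checksum_ok = length <= len(blob) and sum(blob[:length]) % 256 == 0
    return AcpiHeader(sig.decode("ascii"), length, revision, checksum_ok)


def make_fake_table(signature: bytes, payload: bytes = b"") -> bytes:
    """Build a synthetic table (hypothetical helper, for demonstration only)."""
    length = _HEADER.size + len(payload)
    raw = bytearray(
        _HEADER.pack(signature, length, 1, 0, b"OEMID ", b"OEMTABLE", 1, b"TEST", 1)
        + payload
    )
    raw[9] = (-sum(raw)) % 256  # the checksum byte lives at offset 9
    return bytes(raw)


if __name__ == "__main__":
    print(parse_acpi_header(make_fake_table(b"CEDT", b"\x00" * 8)))
```

On a real system the same function could be fed the bytes of a file produced by :code:`acpidump -b`, for example ``open("cedt.dat", "rb").read()``.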
+
+Common Issues
+-------------
+Most failures described here result in a failure of the driver to surface
+memory as a DAX device and/or kmem.
+
+* CEDT CFMWS targets list UIDs do not match CEDT CHBS UIDs.
+* CEDT CFMWS targets list UIDs do not match DSDT CXL Host Bridge UIDs.
+* CEDT CFMWS Restriction Bits are not correct.
+* CEDT CFMWS Memory regions are poorly aligned.
+* CEDT CFMWS Memory regions span a platform memory hole.
+* CEDT CHBS UIDs do not match DSDT CXL Host Bridge UIDs.
+* CEDT CHBS Specification version is incorrect.
+* SRAT is missing regions described in CEDT CFMWS.
+
+  * Result: failure to create a NUMA node for the region, or
+    the region is placed in the wrong node.
+
+* HMAT is missing data for regions described in CEDT CFMWS.
+
+  * Result: NUMA node being placed in the wrong memory tier.
+
+* SLIT has bad data.
+
+  * Result: Lots of performance mechanisms in the kernel will be very unhappy.
+
+All of these issues will appear to users as if the driver is failing to
+support CXL - when in reality they are all the failure of a platform to
+configure the ACPI tables correctly.
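The three UID mismatches in the list above can be checked by hand once the tables are disassembled. The following sketch shows one way to automate that comparison. It assumes the UIDs have already been extracted into plain lists of integers; ``check_uid_consistency`` is a hypothetical helper, not a kernel or ACPICA interface, and the sample values are made up.

```python
def check_uid_consistency(chbs_uids, cfmws_targets, dsdt_uids):
    """Report host bridge UIDs that are not consistent across tables.

    chbs_uids:     UIDs from CEDT CHBS "Associated host bridge" fields
    cfmws_targets: UIDs from CEDT CFMWS target lists
    dsdt_uids:     _UID values of DSDT CXL host bridge devices
    """
    chbs, targets, dsdt = set(chbs_uids), set(cfmws_targets), set(dsdt_uids)
    problems = []
    for uid in sorted(targets - chbs):
        problems.append(f"CFMWS target {uid:#x} has no matching CHBS")
    for uid in sorted(targets - dsdt):
        problems.append(f"CFMWS target {uid:#x} has no matching DSDT host bridge")
    for uid in sorted(chbs - dsdt):
        problems.append(f"CHBS UID {uid:#x} has no matching DSDT host bridge")
    return problems


if __name__ == "__main__":
    # Made-up values: the CFMWS targets 0x6 and 0x7, but only 0x7 has a CHBS.
    for line in check_uid_consistency([0x7], [0x7, 0x6], [0x7, 0x6]):
        print(line)
```

An empty result only says the UIDs line up; it says nothing about restriction bits, alignment, or the SRAT/HMAT/SLIT contents.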
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================================
+CEDT - CXL Early Discovery Table
+================================
+
+The CXL Early Discovery Table is generated by BIOS to describe the CXL memory
+regions configured at boot by the BIOS.
+
+CHBS
+====
+The CXL Host Bridge Structure describes CXL host bridges. Besides describing
+device register information, it reports the specific host bridge UID for this
+host bridge. These host bridge IDs will be referenced in other tables.
+
+Example ::
+
+   Subtable Type          : 00 [CXL Host Bridge Structure]
+   Reserved               : 00
+   Length                 : 0020
+   Associated host bridge : 00000007 <- Host bridge _UID
+   Specification version  : 00000001
+   Reserved               : 00000000
+   Register base          : 0000010370400000
+   Register length        : 0000000000010000
+
+CFMWS
+=====
+The CXL Fixed Memory Window structure describes a memory region associated
+with one or more CXL host bridges (as described by the CHBS). It additionally
+describes any inter-host-bridge interleave configuration that may have been
+programmed by BIOS.
+
+Example ::
+
+   Subtable Type            : 01 [CXL Fixed Memory Window Structure]
+   Reserved                 : 00
+   Length                   : 002C
+   Reserved                 : 00000000
+   Window base address      : 000000C050000000 <- Memory Region
+   Window size              : 0000003CA0000000
+   Interleave Members (2^n) : 01 <- Interleave configuration
+   Interleave Arithmetic    : 00
+   Reserved                 : 0000
+   Granularity              : 00000000
+   Restrictions             : 0006
+   QtgId                    : 0001
+   First Target             : 00000007 <- Host Bridge _UID
+   Next Target              : 00000006 <- Host Bridge _UID
+
+The restriction field dictates what this SPA range may be used for (memory type,
+volatile vs persistent, etc). One or more bits may be set. ::
+
+   Bit[0]: CXL Type 2 Memory
+   Bit[1]: CXL Type 3 Memory
+   Bit[2]: Volatile Memory
+   Bit[3]: Persistent Memory
+   Bit[4]: Fixed Config (HPA cannot be re-used)
+
+INTRA-host-bridge interleave (multiple devices on one host bridge) is NOT
+reported in this structure, and is solely defined via CXL device decoder
+programming (host bridge and endpoint decoders).
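The CFMWS restriction bits and window range can be decoded mechanically. The sketch below maps the five bits listed above onto a Python ``IntFlag`` and computes the last address of the window. The class and function names are illustrative choices, not kernel identifiers, and only the fields shown in the example dump are handled.

```python
from enum import IntFlag


class CfmwsRestriction(IntFlag):
    """Bit assignments as listed for the CFMWS restriction field."""
    TYPE2 = 1 << 0          # CXL Type 2 Memory
    TYPE3 = 1 << 1          # CXL Type 3 Memory
    VOLATILE = 1 << 2       # Volatile Memory
    PERSISTENT = 1 << 3     # Persistent Memory
    FIXED_CONFIG = 1 << 4   # Fixed Config (HPA cannot be re-used)


def describe_window(base: int, size: int, restrictions: int) -> dict:
    """Summarize a CFMWS window: inclusive address range plus set bits."""
    flags = CfmwsRestriction(restrictions)
    names = [bit.name for bit in CfmwsRestriction if bit in flags]
    return {"start": base, "end": base + size - 1, "restrictions": names}


if __name__ == "__main__":
    # Values taken from the CFMWS example dump above.
    info = describe_window(0x000000C050000000, 0x0000003CA0000000, 0x0006)
    print(f"{info['start']:#x}-{info['end']:#x} {info['restrictions']}")
```

For the example dump, the restriction value ``0006`` decodes to Type 3 plus Volatile, which is what one would expect for a window backing a volatile CXL memory expander.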
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================================
+DSDT - Differentiated System Description Table
+==============================================
+
+This table describes what peripherals a machine has.
+
+This table's UIDs for CXL devices, specifically host bridges, must be
+consistent with the contents of the CEDT; otherwise the CXL driver will
+fail to probe correctly.
+
+Example Compute Express Link Host Bridge ::
+
+   Scope (_SB)
+   {
+       Device (S0D0)
+       {
+           Name (_HID, "ACPI0016" /* Compute Express Link Host Bridge */)  // _HID: Hardware ID
+           Name (_CID, Package (0x02)  // _CID: Compatible ID
+           {
+               EisaId ("PNP0A08") /* PCI Express Bus */,
+               EisaId ("PNP0A03") /* PCI Bus */
+           })
+           ...
+           Name (_UID, 0x05)  // _UID: Unique ID
+           ...
+       }
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================================
+HMAT - Heterogeneous Memory Attribute Table
+===========================================
+
+The Heterogeneous Memory Attribute Table contains information such as cache
+attributes and bandwidth and latency details for memory proximity domains.
+For the purpose of this document, we will only discuss the SLLBI entry.
+
+SLLBI
+=====
+The System Locality Latency and Bandwidth Information records latency and
+bandwidth information for proximity domains.
+
+This table is used by Linux to configure interleave weights and memory tiers.
+
+Example (Heavily truncated for brevity) ::
+
+   Structure Type               : 0001 [SLLBI]
+   Data Type                    : 00 <- Latency
+   Target Proximity Domain List : 00000000
+   Target Proximity Domain List : 00000001
+   Entry                        : 0080 <- DRAM LTC
+   Entry                        : 0100 <- CXL LTC
+
+   Structure Type               : 0001 [SLLBI]
+   Data Type                    : 03 <- Bandwidth
+   Target Proximity Domain List : 00000000
+   Target Proximity Domain List : 00000001
+   Entry                        : 1200 <- DRAM BW
+   Entry                        : 0200 <- CXL BW
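To get a feel for how bandwidth entries could translate into interleave weights, the sketch below reduces per-domain bandwidth values to their smallest whole-number ratio. This is plain arithmetic for illustration only and is not claimed to be the algorithm Linux uses. It also ignores the Entry Base Unit scaling, which the truncated example does not show; ``bandwidth_ratio`` is a hypothetical helper.

```python
from functools import reduce
from math import gcd


def bandwidth_ratio(bandwidths: dict) -> dict:
    """Reduce {proximity_domain: raw_bandwidth_entry} to a minimal ratio."""
    if not bandwidths or any(bw <= 0 for bw in bandwidths.values()):
        raise ValueError("need at least one positive bandwidth entry")
    divisor = reduce(gcd, bandwidths.values())
    return {domain: bw // divisor for domain, bw in bandwidths.items()}


if __name__ == "__main__":
    # Raw entries from the bandwidth SLLBI example above (hex as dumped).
    print(bandwidth_ratio({0: 0x1200, 1: 0x0200}))
```

With the example entries, domain 0 (DRAM) comes out at nine units of bandwidth for every one unit on domain 1 (CXL), so a bandwidth-proportional interleave would place roughly nine pages on DRAM for each page on CXL.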
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================================
+SLIT - System Locality Information Table
+========================================
+
+The system locality information table provides "abstract distances" between
+accessor and memory nodes. Nodes without initiators (CPUs) are infinitely (FF)
+distant from all other nodes.
+
+The abstract distance described in this table does not describe any real
+latency or bandwidth information.
+
+Example ::
+
+   Signature  : "SLIT" [System Locality Information Table]
+   Localities : 0000000000000004
+   Locality 0 : 10 20 20 30
+   Locality 1 : 20 10 30 20
+   Locality 2 : FF FF 0A FF
+   Locality 3 : FF FF FF 0A
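The distance matrix is easier to reason about once parsed. The sketch below turns the ``Locality N`` rows of a disassembled SLIT into a square matrix and lists every entry marked ``FF``. It assumes each row is a string of space-separated hexadecimal bytes, as in the example above; the function names are illustrative.

```python
UNREACHABLE = 0xFF  # the "infinitely distant" marker described above


def parse_slit_rows(rows):
    """Parse rows of space-separated hex bytes into a square matrix."""
    matrix = [[int(token, 16) for token in row.split()] for row in rows]
    if any(len(row) != len(matrix) for row in matrix):
        raise ValueError("SLIT distance matrix must be square")
    return matrix


def unreachable_pairs(matrix):
    """Return (from_locality, to_locality) pairs whose distance is FF."""
    return [
        (src, dst)
        for src, row in enumerate(matrix)
        for dst, distance in enumerate(row)
        if distance == UNREACHABLE
    ]


if __name__ == "__main__":
    rows = ["10 20 20 30", "20 10 30 20", "FF FF 0A FF", "FF FF FF 0A"]
    print(unreachable_pairs(parse_slit_rows(rows)))
```

In the example, localities 2 and 3 are the ones without initiators: every distance measured from them to another locality is ``FF``.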
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
+SRAT - Static Resource Affinity Table
+=====================================
+
+The System/Static Resource Affinity Table describes resource (CPU, Memory)
+affinity to "Proximity Domains". This table is technically optional, but for
+performance information (see "HMAT") to be enumerated by Linux it must be
+present.
+
+There is a careful dance between the CEDT and SRAT tables and how NUMA nodes are
+created. If things don't look quite the way you expect, check the SRAT Memory
+Affinity entries and CEDT CFMWS to determine what your platform actually
+supports in terms of flexible topologies.
+
+The SRAT may statically assign portions of a CFMWS SPA range to specific
+proximity domains. See Linux NUMA creation for more information about how
+this presents in the NUMA topology.
+
+Proximity Domain
+================
+A proximity domain is ROUGHLY equivalent to "NUMA Node" - though a 1-to-1
+mapping is not guaranteed. There are scenarios where "Proximity Domain 4" may
+map to "NUMA Node 3", for example. (See "NUMA Node Creation")
+
+Memory Affinity
+===============
+Generally speaking, if a host does any amount of CXL fabric (decoder)
+programming in BIOS, an SRAT entry for that memory needs to be present.
+
+Example ::
+
+   Subtable Type         : 01 [Memory Affinity]
+   Length                : 28
+   Proximity Domain      : 00000001 <- NUMA Node 1
+   Reserved1             : 0000
+   Base Address          : 000000C050000000 <- Physical Memory Region
+   Address Length        : 0000003CA0000000
+   Reserved2             : 00000000
+   Flags (decoded below) : 0000000B
+       Enabled           : 1
+       Hot Pluggable     : 1
+       Non-Volatile      : 0
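One quick consistency check is whether the SRAT Memory Affinity entries fully cover a CFMWS window. The sketch below reports any part of a window that no SRAT range describes. It assumes the base/length pairs have already been extracted from the disassembled tables; ``uncovered_ranges`` is a hypothetical helper and does not reflect how the kernel performs this comparison.

```python
def uncovered_ranges(window, srat_ranges):
    """Return (base, length) gaps of `window` not covered by SRAT entries.

    window:      (base, size) of a CEDT CFMWS window
    srat_ranges: iterable of (base, length) from SRAT Memory Affinity entries
    """
    start, end = window[0], window[0] + window[1]
    gaps = []
    cursor = start  # everything in [start, cursor) is known to be covered
    for base, length in sorted(srat_ranges):
        lo, hi = max(base, start), min(base + length, end)
        if lo >= hi:
            continue  # this SRAT entry does not overlap the window
        if lo > cursor:
            gaps.append((cursor, lo - cursor))
        cursor = max(cursor, hi)
    if cursor < end:
        gaps.append((cursor, end - cursor))
    return gaps


if __name__ == "__main__":
    # Window and SRAT entry taken from the CEDT and SRAT example dumps.
    window = (0x000000C050000000, 0x0000003CA0000000)
    srat = [(0x000000C050000000, 0x0000003CA0000000)]
    print(uncovered_ranges(window, srat))
```

An empty list means every byte of the window has an SRAT entry; any returned gap is a region that may end up without a NUMA node or in an unexpected one.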
