@@ -19,15 +19,15 @@ Linux kernel. The new mechanism is based on Collaborative Processor
Performance Control (CPPC) which provides finer grain frequency management
than legacy ACPI hardware P-States. Current AMD CPU/APU platforms are using
the ACPI P-states driver to manage CPU frequency and clocks with switching
- only in 3 P-states. CPPC replaces the ACPI P-states controls, allows a
+ only in 3 P-states. CPPC replaces the ACPI P-states controls and allows a
flexible, low-latency interface for the Linux kernel to directly
communicate the performance hints to hardware.

``amd-pstate`` leverages the Linux kernel governors such as ``schedutil``,
``ondemand``, etc. to manage the performance hints which are provided by
CPPC hardware functionality that internally follows the hardware
specification (for details refer to AMD64 Architecture Programmer's Manual
- Volume 2: System Programming [1]_). Currently ``amd-pstate`` supports basic
+ Volume 2: System Programming [1]_). Currently, ``amd-pstate`` supports basic
frequency control function according to kernel governors on some of the
Zen2 and Zen3 processors, and we will implement more AMD specific functions
in future after we verify them on the hardware and SBIOS.
@@ -41,9 +41,9 @@ continuous, abstract, and unit-less performance value in a scale that is
not tied to a specific performance state / frequency. This is an ACPI
standard [2]_ which software can specify application performance goals and
hints as a relative target to the infrastructure limits. AMD processors
- provides the low latency register model (MSR) instead of AML code
+ provide the low latency register model (MSR) instead of an AML code
interpreter for performance adjustments. ``amd-pstate`` will initialize a
- ``struct cpufreq_driver`` instance ``amd_pstate_driver`` with the callbacks
+ ``struct cpufreq_driver`` instance, ``amd_pstate_driver``, with the callbacks
to manage each performance update behavior. ::

Highest Perf ------>+-----------------------+ +-----------------------+
@@ -91,26 +91,26 @@ AMD CPPC Performance Capability
Highest Performance (RO)
.........................

- It is the absolute maximum performance an individual processor may reach,
+ This is the absolute maximum performance an individual processor may reach,
assuming ideal conditions. This performance level may not be sustainable
for long durations and may only be achievable if other platform components
- are in a specific state; for example, it may require other processors be in
+ are in a specific state; for example, it may require other processors to be in
an idle state. This would be equivalent to the highest frequencies
supported by the processor.

Nominal (Guaranteed) Performance (RO)
......................................

- It is the maximum sustained performance level of the processor, assuming
- ideal operating conditions. In absence of an external constraint (power,
- thermal, etc.) this is the performance level the processor is expected to
+ This is the maximum sustained performance level of the processor, assuming
+ ideal operating conditions. In the absence of an external constraint (power,
+ thermal, etc.), this is the performance level the processor is expected to
be able to maintain continuously. All cores/processors are expected to be
able to sustain their nominal performance state simultaneously.

Lowest non-linear Performance (RO)
...................................

- It is the lowest performance level at which nonlinear power savings are
+ This is the lowest performance level at which nonlinear power savings are
achieved, for example, due to the combined effects of voltage and frequency
scaling. Above this threshold, lower performance levels should be generally
more energy efficient than higher performance levels. This register
@@ -119,7 +119,7 @@ effectively conveys the most efficient performance level to ``amd-pstate``.
Lowest Performance (RO)
........................

- It is the absolute lowest performance level of the processor. Selecting a
+ This is the absolute lowest performance level of the processor. Selecting a
performance level lower than the lowest nonlinear performance level may
cause an efficiency penalty but should reduce the instantaneous power
consumption of the processor.
@@ -149,14 +149,14 @@ a relative number. This can be expressed as percentage of nominal
performance (infrastructure max). Below the nominal sustained performance
level, desired performance expresses the average performance level of the
processor subject to hardware. Above the nominal performance level,
- processor must provide at least nominal performance requested and go higher
+ the processor must provide at least nominal performance requested and go higher
if current operating conditions allow.

Energy Performance Preference (EPP) (RW)
.........................................

- Provides a hint to the hardware if software wants to bias toward performance
- (0x0) or energy efficiency (0xff).
+ This attribute provides a hint to the hardware if software wants to bias
+ toward performance (0x0) or energy efficiency (0xff).


Key Governors Support
@@ -173,35 +173,34 @@ operating frequencies supported by the hardware. Users can check the
``amd-pstate`` mainly supports ``schedutil`` and ``ondemand`` for dynamic
frequency control. It is to fine tune the processor configuration on
``amd-pstate`` to the ``schedutil`` with CPU CFS scheduler. ``amd-pstate``
- registers adjust_perf callback to implement the CPPC similar performance
- update behavior. It is initialized by ``sugov_start`` and then populate the
- CPU's update_util_data pointer to assign ``sugov_update_single_perf`` as
- the utilization update callback function in CPU scheduler. CPU scheduler
- will call ``cpufreq_update_util`` and assign the target performance
- according to the ``struct sugov_cpu`` that utilization update belongs to.
- Then ``amd-pstate`` updates the desired performance according to the CPU
+ registers the adjust_perf callback to implement performance update behavior
+ similar to CPPC. It is initialized by ``sugov_start`` and then populates the
+ CPU's update_util_data pointer to assign ``sugov_update_single_perf`` as the
+ utilization update callback function in the CPU scheduler. The CPU scheduler
+ will call ``cpufreq_update_util`` and assigns the target performance according
+ to the ``struct sugov_cpu`` that the utilization update belongs to.
+ Then, ``amd-pstate`` updates the desired performance according to the CPU
scheduler assigned.


Processor Support
=======================

- The ``amd-pstate`` initialization will fail if the _CPC in ACPI SBIOS is
- not existed at the detected processor, and it uses ``acpi_cpc_valid`` to
- check the _CPC existence. All Zen based processors support legacy ACPI
- hardware P-States function, so while the ``amd-pstate`` fails to be
- initialized, the kernel will fall back to initialize ``acpi-cpufreq``
- driver.
+ The ``amd-pstate`` initialization will fail if the ``_CPC`` entry in the ACPI
+ SBIOS does not exist in the detected processor. It uses ``acpi_cpc_valid``
+ to check the existence of ``_CPC``. All Zen based processors support the legacy
+ ACPI hardware P-States function, so when ``amd-pstate`` fails initialization,
+ the kernel will fall back to initialize the ``acpi-cpufreq`` driver.

There are two types of hardware implementations for ``amd-pstate``: one is
`Full MSR Support <perf_cap_>`_ and another is `Shared Memory Support
- <perf_cap_>`_. It can use :c:macro:`X86_FEATURE_CPPC` feature flag (for
- details refer to Processor Programming Reference (PPR) for AMD Family
- 19h Model 51h, Revision A1 Processors [3]_) to indicate the different
- types. ``amd-pstate`` is to register different ``static_call`` instances
- for different hardware implementations.
+ <perf_cap_>`_. It can use the :c:macro:`X86_FEATURE_CPPC` feature flag to
+ indicate the different types. (For details, refer to the Processor Programming
+ Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors [3]_.)
+ ``amd-pstate`` is to register different ``static_call`` instances for different
+ hardware implementations.

- Currently, some of Zen2 and Zen3 processors support ``amd-pstate``. In the
+ Currently, some of the Zen2 and Zen3 processors support ``amd-pstate``. In the
future, it will be supported on more and more AMD processors.

Full MSR Support
@@ -210,18 +209,18 @@ Full MSR Support
Some new Zen3 processors such as Cezanne provide the MSR registers directly
while the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is set.
``amd-pstate`` can handle the MSR register to implement the fast switch
- function in ``CPUFreq`` that can shrink latency of frequency control on the
- interrupt context. The functions with ``pstate_xxx`` prefix represent the
- operations of MSR registers.
+ function in ``CPUFreq`` that can reduce the latency of frequency control in
+ interrupt context. The functions with a ``pstate_xxx`` prefix represent the
+ operations on MSR registers.

Shared Memory Support
----------------------

- If :c:macro:`X86_FEATURE_CPPC` CPU feature flag is not set, that means the
- processor supports shared memory solution. In this case, ``amd-pstate``
+ If the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is not set, the
+ processor supports the shared memory solution. In this case, ``amd-pstate``
uses the ``cppc_acpi`` helper methods to implement the callback functions
- that defined on ``static_call``. The functions with ``cppc_xxx`` prefix
- represent the operations of acpi cppc helpers for shared memory solution.
+ that are defined on ``static_call``. The functions with the ``cppc_xxx`` prefix
+ represent the operations of ACPI CPPC helpers for the shared memory solution.


AMD P-States and ACPI hardware P-States always can be supported in one
@@ -234,7 +233,7 @@ User Space Interface in ``sysfs``
==================================

``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
- control its functionality at the system level. They located in the
+ control its functionality at the system level. They are located in the
``/sys/devices/system/cpu/cpufreq/policyX/`` directory and affect all CPUs. ::

root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd*
@@ -246,38 +245,38 @@ control its functionality at the system level. They located in the
``amd_pstate_highest_perf / amd_pstate_max_freq``

Maximum CPPC performance and CPU frequency that the driver is allowed to
- set in percent of the maximum supported CPPC performance level (the highest
+ set, in percent of the maximum supported CPPC performance level (the highest
performance supported in `AMD CPPC Performance Capability <perf_cap_>`_).
- In some of ASICs, the highest CPPC performance is not the one in the _CPC
- table, so we need to expose it to sysfs. If boost is not active but
- supported, this maximum frequency will be larger than the one in
+ In some ASICs, the highest CPPC performance is not the one in the ``_CPC``
+ table, so we need to expose it to sysfs. If boost is not active, but
+ still supported, this maximum frequency will be larger than the one in
``cpuinfo``.
This attribute is read-only.

``amd_pstate_lowest_nonlinear_freq``

- The lowest non-linear CPPC CPU frequency that the driver is allowed to set
- in percent of the maximum supported CPPC performance level (Please see the
+ The lowest non-linear CPPC CPU frequency that the driver is allowed to set,
+ in percent of the maximum supported CPPC performance level. (Please see the
lowest non-linear performance in `AMD CPPC Performance Capability
- <perf_cap_>`_).
+ <perf_cap_>`_.)
This attribute is read-only.

- For other performance and frequency values, we can read them back from
+ Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.


``amd-pstate`` vs ``acpi-cpufreq``
======================================

- On majority of AMD platforms supported by ``acpi-cpufreq``, the ACPI tables
- provided by the platform firmware used for CPU performance scaling, but
- only provides 3 P-states on AMD processors.
- However, on modern AMD APU and CPU series, it provides the collaborative
- processor performance control according to ACPI protocol and customize this
- for AMD platforms. That is fine-grain and continuous frequency range
+ On the majority of AMD platforms supported by ``acpi-cpufreq``, the ACPI tables
+ provided by the platform firmware are used for CPU performance scaling, but
+ only provide 3 P-states on AMD processors.
+ However, on modern AMD APU and CPU series, hardware provides the Collaborative
+ Processor Performance Control according to the ACPI protocol and customizes this
+ for AMD platforms. That is, fine-grained and continuous frequency ranges
instead of the legacy hardware P-states. ``amd-pstate`` is the kernel
- module which supports the new AMD P-States mechanism on most of future AMD
- platforms. The AMD P-States mechanism will be the more performance and energy
+ module which supports the new AMD P-States mechanism on most of the future AMD
+ platforms. The AMD P-States mechanism is the more performance and energy
efficiency frequency management method on AMD processors.

Kernel Module Options for ``amd-pstate``
@@ -287,25 +286,25 @@ Kernel Module Options for ``amd-pstate``
Use a module param (shared_mem) to enable related processors manually with
**amd_pstate.shared_mem=1**.
Due to the performance issue on the processors with `Shared Memory Support
- <perf_cap_>`_, so we disable it for the moment and will enable this by default
- once we address performance issue on this solution.
+ <perf_cap_>`_, we disable it presently and will re-enable this by default
+ once we address performance issue with this solution.

- The way to check whether current processor is `Full MSR Support <perf_cap_>`_
+ To check whether the current processor is using `Full MSR Support <perf_cap_>`_
or `Shared Memory Support <perf_cap_>`_ : ::

ray@hr-test1:~$ lscpu | grep cppc
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm

- If CPU Flags have cppc, then this processor supports `Full MSR Support
- <perf_cap_>`_. Otherwise it supports `Shared Memory Support <perf_cap_>`_.
+ If the CPU flags have ``cppc``, then this processor supports `Full MSR Support
+ <perf_cap_>`_. Otherwise, it supports `Shared Memory Support <perf_cap_>`_.


``cpupower`` tool support for ``amd-pstate``
===============================================

- ``amd-pstate`` is supported on ``cpupower`` tool that can be used to dump the frequency
- information. And it is in progress to support more and more operations for new
- ``amd-pstate`` module with this tool. ::
+ ``amd-pstate`` is supported by the ``cpupower`` tool, which can be used to dump
+ frequency information. Development is in progress to support more and more
+ operations for the new ``amd-pstate`` module with this tool. ::

root@hr-test1:/home/ray# cpupower frequency-info
analyzing CPU 0:
@@ -336,10 +335,10 @@ Trace Events
--------------

There are two static trace events that can be used for ``amd-pstate``
- diagnostics. One of them is the cpu_frequency trace event generally used
+ diagnostics. One of them is the ``cpu_frequency`` trace event generally used
by ``CPUFreq``, and the other one is the ``amd_pstate_perf`` trace event
specific to ``amd-pstate``. The following sequence of shell commands can
- be used to enable them and see their output (if the kernel is generally
+ be used to enable them and see their output (if the kernel is
configured to support event tracing). ::

root@hr-test1:/home/ray# cd /sys/kernel/tracing/
@@ -364,7 +363,7 @@ configured to support event tracing). ::
<idle>-0 [003] d.s.. 4995.980971: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=3 changed=false fast_switch=true
<idle>-0 [011] d.s.. 4995.980996: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=11 changed=false fast_switch=true

- The cpu_frequency trace event will be triggered either by the ``schedutil`` scaling
+ The ``cpu_frequency`` trace event will be triggered either by the ``schedutil`` scaling
governor (for the policies it is attached to), or by the ``CPUFreq`` core (for the
policies with other scaling governors).
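
Note (not part of the patch): the ``lscpu | grep cppc`` check described in the "Kernel Module Options" hunk could be scripted roughly as follows. This is a minimal sketch, not kernel or driver code. The helper name `cppc_support_type` is hypothetical, and it assumes the input is `lscpu`- or `/proc/cpuinfo`-style text containing a `Flags:` / `flags :` line. It only mirrors the rule stated in the documentation (flag present means Full MSR Support, otherwise Shared Memory Support) and does not verify that the processor is supported by ``amd-pstate`` at all.

```python
def cppc_support_type(cpu_text: str) -> str:
    """Classify the amd-pstate hardware type from CPU flag text.

    Hypothetical helper: looks for the first "flags" line and checks whether
    "cppc" appears as a whole, whitespace-separated flag.
    """
    for line in cpu_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            flags = value.split()
            # Whole-token match, so e.g. "cppc_x" would not count as "cppc".
            if "cppc" in flags:
                return "Full MSR Support"
            return "Shared Memory Support"
    raise ValueError("no flags line found in input")


# Example with a shortened, made-up flags line.
sample = "Flags: fpu vme msr hw_pstate cppc arat"
print(cppc_support_type(sample))
```

On a real system the input would typically be the contents of `/proc/cpuinfo` or the output of `lscpu`; the sample string above is illustrative only.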