@@ -6,6 +6,7 @@ The Linux Microcode Loader

:Authors: - Fenghua Yu <fenghua.yu@intel.com>
          - Borislav Petkov <bp@suse.de>
+         - Ashok Raj <ashok.raj@intel.com>

The kernel has a x86 microcode loading facility which is supposed to
provide microcode loading methods in the OS. Potential use cases are
@@ -92,15 +93,8 @@ vendor's site.

Late loading
============

- There are two legacy user space interfaces to load microcode, either through
- /dev/cpu/microcode or through the /sys/devices/system/cpu/microcode/reload
- file in sysfs.
-
- The /dev/cpu/microcode method is deprecated because it needs a special
- userspace tool for that.
-
- The easier method is simply installing the microcode packages your distro
- supplies and running::
+ You simply install the microcode packages your distro supplies and
+ run::

  # echo 1 > /sys/devices/system/cpu/microcode/reload
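After triggering a reload as above, the result can be checked from userspace. A minimal sketch, assuming an x86 Linux system where /proc/cpuinfo exposes a "microcode" field per logical CPU:

```shell
# Print the microcode revision reported by each logical CPU,
# de-duplicated; after a successful reload all CPUs should report
# the same, newer revision.
awk -F': *' '/^microcode/ { print $2 }' /proc/cpuinfo | sort -u
```

On non-x86 systems the field is absent and the command simply prints nothing.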
@@ -110,6 +104,110 @@ The loading mechanism looks for microcode blobs in
/lib/firmware/{intel-ucode,amd-ucode}. The default distro installation
packages already put them there.
+ Since kernel 5.19, late loading is not enabled by default.
+
+ The /dev/cpu/microcode method has been removed in 5.19.
+
+ Why is late loading dangerous?
+ ==============================
+
+ Synchronizing all CPUs
+ ----------------------
+
+ The microcode engine which receives the microcode update is shared
+ between the two logical threads in a SMT system. Therefore, when
+ the update is executed on one SMT thread of the core, the sibling
+ "automatically" gets the update.
+
+ Since the microcode can "simulate" MSRs too, while the microcode update
+ is in progress, those simulated MSRs transiently cease to exist. This
+ can result in unpredictable behavior if the SMT sibling thread happens
+ to be in the middle of an access to such an MSR. The usual observation
+ is that such MSR accesses cause #GPs to be raised to signal that the
+ former are not present.
+
+ The disappearing MSRs are just one common issue which is being observed.
+ Any other instruction that's being patched and gets concurrently
+ executed by the other SMT sibling can also result in similar,
+ unpredictable behavior.
+
+ To eliminate this case, a stop_machine()-based CPU synchronization was
+ introduced as a way to guarantee that all logical CPUs will not execute
+ any code but just wait in a spin loop, polling an atomic variable.
+
+ While this takes care of device and external interrupts, and of IPIs
+ including LVT ones such as CMCI, it cannot address other special
+ interrupts that can't be shut off. Those are Machine Check (#MC),
+ System Management (#SMI) and Non-Maskable interrupts (#NMI).
+
+ Machine Checks
+ --------------
+
+ Machine Checks (#MC) are non-maskable. There are two kinds of MCEs:
+ fatal un-recoverable MCEs and recoverable MCEs. While un-recoverable
+ errors are always fatal, recoverable errors that happen in kernel
+ context are also treated as fatal by the kernel.
+
+ On certain Intel machines, MCEs are also broadcast to all threads in a
+ system. If one thread is in the middle of executing WRMSR, an MCE will
+ be taken at the end of the flow. Either way, they will wait for the
+ thread performing the wrmsr(0x79) to rendezvous in the MCE handler and
+ shutdown eventually if any of the threads in the system fail to check
+ in to the MCE rendezvous.
+
+ To be paranoid and get predictable behavior, the OS can choose to set
+ MCG_STATUS.MCIP. Since at most one MCE can be in progress in a system
+ at any time, if an MCE is signaled while MCIP is set, the machine is
+ promoted to a system reset automatically. The OS can turn MCIP off
+ again at the end of the update for that core.
+
+ System Management Interrupt
+ ---------------------------
+
+ SMIs are also broadcast to all CPUs in the platform. The microcode
+ update requests exclusive access to the core before writing to MSR
+ 0x79. So if one thread happens to be in the WRMSR flow while a second
+ thread takes an SMI, that second thread will be stopped at the first
+ instruction of the SMI handler.
+
+ Since the secondary thread is stopped at the first instruction in the
+ SMI handler, there is very little chance that it would be in the middle
+ of executing an instruction being patched. Besides, the OS has no way
+ to stop SMIs from happening.
+
+ Non-Maskable Interrupts
+ -----------------------
+
+ When thread0 of a core is doing the microcode update, if thread1 is
+ pulled into an NMI, that can cause unpredictable behavior due to the
+ reasons above.
+
+ The OS can choose from a variety of methods to avoid running into this
+ situation.
+
+
+ Is the microcode suitable for late loading?
+ -------------------------------------------
+
+ Late loading is done when the system is fully operational and running
+ real workloads. Late loading behavior depends on what the base patch on
+ the CPU is before upgrading to the new patch.
+
+ This is true for Intel CPUs.
+
+ Consider, for example, a CPU which has patch level 1 and the update is
+ to patch level 3.
+
+ Between patch1 and patch3, patch2 might have deprecated a
+ software-visible feature.
+
+ This is unacceptable if software is even potentially using that
+ feature. For instance, if MSR_X is no longer available after an update,
+ accessing that MSR will cause a #GP fault.
+
+ Basically there is no way to declare a new microcode update suitable
+ for late-loading. This is another one of the problems that caused late
+ loading to be not enabled by default.
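Because late loading became opt-in, userspace can probe whether the running kernel offers it at all before attempting a reload. A sketch, assuming the standard sysfs path: the reload file only exists when the kernel was built with late-loading support.

```shell
# Probe for the late-loading interface. Presence of the sysfs reload
# file indicates the running kernel was built with late loading
# enabled (an opt-in option since 5.19); its absence means late
# loading is unavailable on this kernel.
if [ -e /sys/devices/system/cpu/microcode/reload ]; then
    echo "late loading interface present"
else
    echo "late loading interface absent"
fi
```

Note that actually writing to the reload file still requires root, so the sketch tests only for existence, not writability.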
+
Builtin microcode
=================