| 1 | ++++ |
| 2 | +title = "Multi-version drivers" |
| 3 | ++++ |
| 4 | + |
| 5 | +Linux loads device drivers on boot and every device driver exists in one |
| 6 | +version. XAPI extends this scheme so that device drivers may exist in |
| 7 | +multiple variants, and adds a mechanism to select the variant loaded on |
| 8 | +boot. Such a driver is called a multi-version driver and we expect only |
| 9 | +a small subset of drivers, built and distributed by XenServer, to have |
| 10 | +this property. The following covers the background, API, and CLI for |
| 11 | +multi-version drivers in XAPI. |
| 12 | + |
| 13 | +## Variant vs. Version |
| 14 | + |
| 15 | +A driver comes in several variants, each of which has a version. A |
| 16 | +variant may be updated to a later version while retaining its identity. |
| 17 | +This makes variants and versions somewhat synonymous and is admittedly |
| 18 | +confusing. |
| 19 | + |
| 20 | +## Device Drivers in Linux and XAPI |
| 21 | + |
| 22 | +Drivers that are not compiled into the kernel are loaded dynamically |
| 23 | +from the file system. They are loaded from the hierarchy |
| 24 | + |
| 25 | +* `/lib/modules/<kernel-version>/` |
| 26 | + |
| 27 | +and we are particularly interested in the hierarchy |
| 28 | + |
| 29 | +* `/lib/modules/<kernel-version>/updates/` |
| 30 | + |
| 31 | +where vendor-supplied ("driver disk") drivers are located and where we |
| 32 | +want to support multiple versions. A driver typically has the file |
| 33 | +extension `.ko` (kernel object). |
| 34 | + |
| 35 | +The presence of a driver in the file system does not mean that it is |
| 36 | +loaded, as loading happens only on demand. The drivers actually loaded |
| 37 | +(or modules, in Linux parlance) can be observed in |
| 38 | + |
| 39 | +* `/proc/modules` |
| 40 | + |
| 41 | +``` |
| 42 | +netlink_diag 16384 0 - Live 0x0000000000000000 |
| 43 | +udp_diag 16384 0 - Live 0x0000000000000000 |
| 44 | +tcp_diag 16384 0 - Live 0x0000000000000000 |
| 45 | +``` |
| 46 | + |
| 47 | +which includes dependencies between modules (the `-` means no dependencies). |
| 48 | + |
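As an illustration, the listing above can be read programmatically. This is a minimal sketch and not part of XAPI; the names `LoadedModule` and `parse_proc_modules` are hypothetical. It assumes the usual column order of `/proc/modules`: name, size, reference count, the list of modules in the fourth column (`-` when empty), and state.

```python
from typing import List, NamedTuple, Optional


class LoadedModule(NamedTuple):
    name: str
    size: int
    refcount: Optional[int]  # None if the kernel prints "-" here
    used_by: List[str]       # modules listed in the fourth column
    state: str               # e.g. "Live"


def parse_proc_modules(text: str) -> List[LoadedModule]:
    """Parse the text of /proc/modules into a list of records."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        name, size, refcount, deps, state = fields[:5]
        # "-" means the list is empty; otherwise it is comma-separated
        # with a trailing comma.
        used_by = [] if deps == "-" else [d for d in deps.split(",") if d]
        count = int(refcount) if refcount.isdigit() else None
        modules.append(LoadedModule(name, int(size), count, used_by, state))
    return modules


if __name__ == "__main__":
    sample = "netlink_diag 16384 0 - Live 0x0000000000000000\n"
    for module in parse_proc_modules(sample):
        print(module.name, module.state)
```

On a live system the input would be the contents of `/proc/modules`; the sketch takes text so it can be tried without one.
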
| 49 | +## Driver Properties |
| 50 | + |
| 51 | +* A driver name is unique and a driver can be loaded only once. The fact |
| 52 | + that kernel object files are located in a file system hierarchy means |
| 53 | + that a driver may exist multiple times and in different versions in the |
| 54 | + file system. From the kernel's perspective a driver has a unique name |
| 55 | + and is loaded at most once. We thus can talk about a driver using its |
| 56 | + name and acknowledge it may exist in different versions in the file |
| 57 | + system. |
| 58 | + |
| 59 | +* A driver that is loaded by the kernel we call *active*. |
| 60 | + |
| 61 | +* A driver file (`name.ko`) that is in a hierarchy searched by the |
| 62 | + kernel is called *selected*. If the kernel needs the driver of that |
| 63 | + name, it would load this object file. |
| 64 | + |
| 65 | +For a driver (`name.ko`), selection and activation are independent |
| 66 | +properties: |
| 67 | + |
| 68 | +* *inactive*, *deselected*: not loaded now and won't be loaded on next |
| 69 | + boot. |
| 70 | +* *active*, *deselected*: currently loaded but won't be loaded on next |
| 71 | + boot. |
| 72 | +* *inactive*, *selected*: not loaded now but will be loaded on demand. |
| 73 | +* *active*, *selected*: currently loaded and will be loaded on demand |
| 74 | + after a reboot. |
| 75 | + |
| 76 | +For a driver to be selected it needs to be in the hierarchy searched by |
| 77 | +the kernel. By removing a driver from the hierarchy it can be |
| 78 | +de-selected. This is possible even for drivers that are already loaded. |
| 79 | +Hence, activation and selection are independent. |
| 80 | + |
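The two independent properties can be sketched as two separate checks. This is an illustrative sketch under assumptions, not how XAPI or its helper tool is implemented: `is_active`, `is_selected`, and `describe_state` are hypothetical names, activation is read from `/proc/modules` text, and selection is approximated as "a loadable `<name>.ko` exists in the given `updates/` directory".

```python
import os


def is_active(name: str, proc_modules_text: str) -> bool:
    """True if a module of this name appears in /proc/modules text."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return name in loaded


def is_selected(name: str, updates_dir: str) -> bool:
    """True if <updates_dir>/<name>.ko exists.

    os.path.exists follows symbolic links, so a dangling link
    does not count as selected.
    """
    return os.path.exists(os.path.join(updates_dir, name + ".ko"))


def describe_state(active: bool, selected: bool) -> str:
    """Spell out one of the four combinations listed above."""
    now = "currently loaded" if active else "not loaded now"
    later = (
        "will be loaded on demand"
        if selected
        else "won't be loaded on next boot"
    )
    return f"{now}; {later}"
```

Because neither check consults the other, all four combinations can occur, which is the point made above.
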
| 81 | +## Multi-Version Drivers |
| 82 | + |
| 83 | +To support multi-version drivers, XenServer introduces a new |
| 84 | +hierarchy in Dom0. This is mostly technical background because a |
| 85 | +lower-level tool deals with this and not XAPI directly. |
| 86 | + |
| 87 | +* `/lib/modules/<kernel-version>/updates/` is searched by the kernel for |
| 88 | + drivers. |
| 89 | +* The hierarchy is expected to contain symbolic links to the file |
| 90 | + actually containing the driver: |
| 91 | + `/lib/modules/<kernel-version>/xenserver/<driver>/<version>/<name>.ko` |
| 92 | + |
| 93 | +The `xenserver` hierarchy provides drivers in several versions. To |
| 94 | +select a particular version, we expect a symbolic link from |
| 95 | +`updates/<name>.ko` to `<driver>/<version>/<name>.ko`. At the next boot, |
| 96 | +the kernel will search the `updates/` entries and load the linked |
| 97 | +driver, which will become active. |
| 98 | + |
| 99 | +Example filesystem hierarchy: |
| 100 | +``` |
| 101 | +/lib/ |
| 102 | +└── modules |
| 103 | + └── 4.19.0+1 -> |
| 104 | + ├── updates |
| 105 | + │ ├── aacraid.ko |
| 106 | + │ ├── bnx2fc.ko -> ../xenserver/bnx2fc/2.12.13/bnx2fc.ko |
| 107 | + │ ├── bnx2i.ko |
| 108 | + │ ├── cxgb4i.ko |
| 109 | + │ ├── cxgb4.ko |
| 110 | + │ ├── dell_laptop.ko -> ../xenserver/dell_laptop/1.2.3/dell_laptop.ko |
| 111 | + │ ├── e1000e.ko |
| 112 | + │ ├── i40e.ko |
| 113 | + │ ├── ice.ko -> ../xenserver/intel-ice/1.11.17.1/ice.ko |
| 114 | + │ ├── igb.ko |
| 115 | + │ ├── smartpqi.ko |
| 116 | + │ └── tcm_qla2xxx.ko |
| 117 | + └── xenserver |
| 118 | + ├── bnx2fc |
| 119 | + │ ├── 2.12.13 |
| 120 | + │ │ └── bnx2fc.ko |
| 121 | + │ └── 2.12.20-dell |
| 122 | + │ └── bnx2fc.ko |
| 123 | + ├── dell_laptop |
| 124 | + │ └── 1.2.3 |
| 125 | + │ └── dell_laptop.ko |
| 126 | + └── intel-ice |
| 127 | + ├── 1.11.17.1 |
| 128 | + │ └── ice.ko |
| 129 | + └── 1.6.4 |
| 130 | + └── ice.ko |
| 131 | + |
| 132 | +``` |
| 133 | + |
| 134 | +Selection of a driver is synonymous with creating a symbolic link to the |
| 135 | +desired version. |
| 136 | + |
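Since selection is expressed as a symbolic link, the selected version can be recovered by reading the link target. The following is a minimal read-only sketch, assuming the layout shown in the example hierarchy; `selected_version` is a hypothetical helper and not the lower-level tool mentioned above.

```python
import os
from typing import Optional, Tuple


def selected_version(updates_dir: str, name: str) -> Optional[Tuple[str, str]]:
    """Return (driver, version) for updates/<name>.ko, or None.

    None is returned when the entry is not a symbolic link or when
    the target does not look like
    .../xenserver/<driver>/<version>/<name>.ko
    """
    link = os.path.join(updates_dir, name + ".ko")
    if not os.path.islink(link):
        return None
    parts = os.readlink(link).split("/")
    if len(parts) < 4:
        return None
    if parts[-4] != "xenserver" or parts[-1] != name + ".ko":
        return None
    return parts[-3], parts[-2]


if __name__ == "__main__":
    # Path assumes the kernel version from the example hierarchy.
    print(selected_version("/lib/modules/4.19.0+1/updates", "ice"))
```

For the hierarchy shown above, `ice.ko` would map to driver `intel-ice` and version `1.11.17.1`, while a plain file such as `igb.ko` would yield `None`.
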
| 137 | +## Versions |
| 138 | + |
| 139 | +The version of a driver is encoded in the path to its object file but |
| 140 | +not in the name itself: for `xenserver/intel-ice/1.11.17.1/ice.ko` the |
| 141 | +driver name is `ice` and only its location hints at the version. |
| 142 | + |
| 143 | +The kernel does not reveal the location from where it loaded an active |
| 144 | +driver. Hence the name is not sufficient to observe the currently active |
| 145 | +version. For this, we use [ELF notes]. |
| 146 | + |
| 147 | +The driver file (`name.ko`) is in ELF linker format and may contain |
| 148 | +custom [ELF notes]. These are binary annotations that can be compiled |
| 149 | +into the file. The kernel reveals these details for loaded drivers |
| 150 | +(i.e., modules) in: |
| 151 | + |
| 152 | +* `/sys/module/<name>/notes/` |
| 153 | + |
| 154 | +The directory contains files like |
| 155 | + |
| 156 | +* `/sys/module/xfs/notes/.note.gnu.build-id` |
| 157 | + |
| 158 | +with a specific name (`.note.xenserver`) for our purpose. Such a file contains |
| 159 | +a binary-encoded sequence of records, each containing: |
| 160 | + |
| 161 | +* A null-terminated name (string) |
| 162 | +* A type (integer) |
| 163 | +* A desc (see below) |
| 164 | + |
| 165 | +The format of the description (`desc`) is vendor-specific; here it holds |
| 166 | +the version as a null-terminated string. The name is fixed to |
| 167 | +"XenServer". The exact format is described in [ELF notes]. |
| 168 | + |
| 169 | +A note with the name "XenServer" and a particular type then has the version |
| 170 | +as a null-terminated string in the `desc` field. Additional "XenServer" notes |
| 171 | +of a different type may be present. |
| 172 | + |
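To make the record layout concrete, here is a sketch of a parser for the standard ELF note encoding: three 32-bit words (name size, desc size, type) followed by the name and the desc, each padded to a 4-byte boundary. Several things are assumptions: little-endian byte order, the helper names `parse_notes` and `xenserver_version`, and the note type, which is not stated here and is therefore passed in as a parameter.

```python
import struct
from typing import Iterator, Optional, Tuple


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def parse_notes(data: bytes, byteorder: str = "<") -> Iterator[Tuple[str, int, bytes]]:
    """Yield (name, type, desc) for each note record in data."""
    header = struct.Struct(byteorder + "III")  # namesz, descsz, type
    offset = 0
    while offset + header.size <= len(data):
        namesz, descsz, ntype = header.unpack_from(data, offset)
        offset += header.size
        raw_name = data[offset:offset + namesz]
        offset += _align4(namesz)
        desc = data[offset:offset + descsz]
        offset += _align4(descsz)
        yield raw_name.rstrip(b"\0").decode("ascii", "replace"), ntype, desc


def xenserver_version(data: bytes, version_type: int) -> Optional[str]:
    """Return the version from the first matching "XenServer" note.

    version_type is a placeholder: the real type value is an
    assumption the caller has to supply.
    """
    for name, ntype, desc in parse_notes(data):
        if name == "XenServer" and ntype == version_type:
            return desc.split(b"\0", 1)[0].decode("ascii", "replace")
    return None
```

The input would be the raw bytes of `/sys/module/<name>/notes/.note.xenserver`; notes of other types are skipped, in line with the remark that additional "XenServer" notes may be present.
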
| 173 | +[ELF notes]: https://www.netbsd.org/docs/kernel/elf-notes.html |
| 174 | + |
| 175 | +## API |
| 176 | + |
| 177 | +XAPI has capabilities to inspect and select multi-version drivers. |
| 178 | + |
| 179 | +The API uses the terminology introduced above: |
| 180 | + |
| 181 | +* A driver is specific to a host. |
| 182 | +* A driver has a unique name; however, for API purposes a driver is |
| 183 | + identified by a UUID (on the CLI) and reference (programmatically). |
| 184 | +* A driver has multiple variants; each variant has a version. |
| 185 | + Programmatically, variants are represented as objects (referenced by |
| 186 | + UUID and a reference) but this is mostly hidden in the CLI for |
| 187 | + convenience. |
| 188 | +* A driver variant is active if it is currently used by the kernel |
| 189 | + (loaded). |
| 190 | +* A driver variant is selected if it will be considered by the kernel |
| 191 | + (on next boot or when loading on demand). |
| 192 | +* Only one variant can be active, and only one variant can be selected. |
| 193 | + |
| 194 | +Inspection and selection of drivers are facilitated by a tool |
| 195 | +("drivertool") that is called by XAPI. Hence, XAPI does not by itself |
| 196 | +manipulate the file system that implements driver selection. |
| 197 | + |
| 198 | +An example interaction with the API through xe: |
| 199 | + |
| 200 | +``` |
| 201 | +[root@lcy2-dt110 log]# xe hostdriver-list uuid=c0fe459d-5f8a-3fb1-3fe5-3c602fafecc0 params=all |
| 202 | +uuid ( RO) : c0fe459d-5f8a-3fb1-3fe5-3c602fafecc0 |
| 203 | + name ( RO): cisco-fnic |
| 204 | + type ( RO): network |
| 205 | + description ( RO): cisco-fnic |
| 206 | + info ( RO): cisco-fnic |
| 207 | + host-uuid ( RO): 6de288e7-0f82-4563-b071-bcdc083b0ffd |
| 208 | + active-variant ( RO): <none> |
| 209 | + selected-variant ( RO): <none> |
| 210 | + variants ( RO): generic/1.2 |
| 211 | + variants-dev-status ( RO): generic=beta |
| 212 | + variants-uuid ( RO): generic=abf5997b-f2ad-c0ef-b27f-3f8a37bf58a6 |
| 213 | + variants-hw-present ( RO): |
| 214 | +``` |
| 215 | + |
| 216 | +A variant is selected by name (which is unique per driver); this |
| 217 | +variant would become active after a reboot. |
| 218 | + |
| 219 | +``` |
| 220 | +[root@lcy2-dt110 log]# xe hostdriver-select variant-name=generic uuid=c0fe459d-5f8a-3fb1-3fe5-3c602fafecc0 |
| 221 | +[root@lcy2-dt110 log]# xe hostdriver-list uuid=c0fe459d-5f8a-3fb1-3fe5-3c602fafecc0 params=all |
| 222 | +uuid ( RO) : c0fe459d-5f8a-3fb1-3fe5-3c602fafecc0 |
| 223 | + name ( RO): cisco-fnic |
| 224 | + type ( RO): network |
| 225 | + description ( RO): cisco-fnic |
| 226 | + info ( RO): cisco-fnic |
| 227 | + host-uuid ( RO): 6de288e7-0f82-4563-b071-bcdc083b0ffd |
| 228 | + active-variant ( RO): <none> |
| 229 | + selected-variant ( RO): generic |
| 230 | + variants ( RO): generic/1.2 |
| 231 | + variants-dev-status ( RO): generic=beta |
| 232 | + variants-uuid ( RO): generic=abf5997b-f2ad-c0ef-b27f-3f8a37bf58a6 |
| 233 | + variants-hw-present ( RO): |
| 234 | +``` |
| 235 | + |
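For scripting, the textual record printed by `xe` can be turned into a dictionary. This is a small sketch that assumes the `key ( RO): value` layout shown in the listings above and a single record per call; `parse_xe_record` is a hypothetical helper, not part of the `xe` CLI.

```python
import re
from typing import Dict

# key, access mode in parentheses, colon, value (possibly empty)
_FIELD = re.compile(r"^\s*([\w-]+)\s*\(\s*\w+\s*\)\s*:\s*(.*)$")


def parse_xe_record(text: str) -> Dict[str, str]:
    """Parse one record of `xe ... params=all` output into a dict."""
    record = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            record[match.group(1)] = match.group(2).strip()
    return record


if __name__ == "__main__":
    sample = (
        "uuid ( RO) : c0fe459d-5f8a-3fb1-3fe5-3c602fafecc0\n"
        " name ( RO): cisco-fnic\n"
        " selected-variant ( RO): generic\n"
    )
    print(parse_xe_record(sample)["selected-variant"])
```

Output with several records separated by blank lines would need to be split first; the sketch does not handle that case.
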
| 236 | +The variant can be inspected, too, using its UUID. |
| 237 | + |
| 238 | +``` |
| 239 | +[root@lcy2-dt110 log]# xe hostdriver-variant-list uuid=abf5997b-f2ad-c0ef-b27f-3f8a37bf58a6 |
| 240 | +uuid ( RO) : abf5997b-f2ad-c0ef-b27f-3f8a37bf58a6 |
| 241 | + name ( RO): generic |
| 242 | + version ( RO): 1.2 |
| 243 | + status ( RO): beta |
| 244 | + active ( RO): false |
| 245 | + selected ( RO): true |
| 246 | + driver-uuid ( RO): c0fe459d-5f8a-3fb1-3fe5-3c602fafecc0 |
| 247 | + driver-name ( RO): cisco-fnic |
| 248 | + host-uuid ( RO): 6de288e7-0f82-4563-b071-bcdc083b0ffd |
| 249 | + hw-present ( RO): false |
| 250 | +``` |
| 251 | + |
| 252 | +## Class `Host_driver` |
| 253 | + |
| 254 | +Class `Host_driver` represents an instance of a multi-version driver on |
| 255 | +a host. It references `Driver_variant` objects for the details of the |
| 256 | +available and active variants. A variant has a version. |
| 257 | + |
| 258 | +### Fields |
| 259 | + |
| 260 | +All fields are read-only and can't be set directly. Be aware that names |
| 261 | +in the CLI and the API may differ. |
| 262 | + |
| 263 | +* `host`: reference to the host where the driver is installed. |
| 264 | +* `name`: string; name of the driver without ".ko" extension. |
| 265 | +* `variants`: string set; set of variants available on the host for this |
| 266 | + driver. The name of each variant of a driver is unique and used in |
| 267 | + the CLI for selecting it. |
| 268 | +* `selected_variant`: variant, possibly empty. Variant that is selected, |
| 269 | + i.e. the variant of the driver that will be considered by the kernel |
| 270 | + when loading the driver the next time. May be null when none is |
| 271 | + selected. |
| 272 | +* `active_variant`: variant, possibly empty. Variant that is currently |
| 273 | + loaded by the kernel. |
| 274 | +* `type`, `info`, `description`: strings providing background |
| 275 | + information. |
| 276 | + |
| 277 | +The CLI uses `hostdriver` and a dash instead of an underscore. The CLI |
| 278 | +also offers convenience fields. Whenever the selected and active |
| 279 | +variants differ, a reboot is required to activate the selected |
| 280 | +driver/variant combination. |
| 281 | + |
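The reboot rule stated above amounts to a simple comparison. A minimal sketch, assuming the CLI prints `<none>` for an empty variant as in the listings; `reboot_required` and `variant_or_none` are hypothetical helpers.

```python
from typing import Optional


def variant_or_none(cli_value: str) -> Optional[str]:
    """Map the CLI's "<none>" (or an empty string) to None."""
    value = cli_value.strip()
    return None if value in ("", "<none>") else value


def reboot_required(active: Optional[str], selected: Optional[str]) -> bool:
    """True when the selected and active variants are not the same."""
    return active != selected


if __name__ == "__main__":
    # Values taken from the listing after `xe hostdriver-select`.
    active = variant_or_none("<none>")
    selected = variant_or_none("generic")
    print(reboot_required(active, selected))
```
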
| 282 | +(We are not using `host-driver` in the CLI to avoid the impression that |
| 283 | +this is part of a host object.) |
| 284 | + |
| 285 | +### Methods |
| 286 | + |
| 287 | +* All method invocations require `Pool_Operator` rights. "The Pool |
| 288 | + Operator role manages host- and pool-wide resources, including setting |
| 289 | + up storage, creating resource pools and managing patches, high |
| 290 | + availability (HA) and workload balancing (WLB)" |
| 291 | + |
| 292 | +* `select (self, variant)`: select `variant` (a reference) of the |
| 293 | + existing driver `self`. |
| 294 | + |
| 295 | +* `deselect(self)`: this driver can't be loaded next time the kernel is |
| 296 | + looking for a driver. This is a potentially dangerous operation, so it's |
| 297 | + protected in the CLI with a `--force` flag. |
| 298 | + |
| 299 | +* `rescan (host)`: scan the host and update its driver information. |
| 300 | + Called on toolstack restart and may be invoked from the CLI for |
| 301 | + development. |
| 302 | + |
| 303 | +## Class `Driver_variant` |
| 304 | + |
| 305 | +An object of this class represents a variant of a driver on a host, |
| 306 | +i.e., it is specific to both. |
| 307 | + |
| 308 | +* `name`: unique name |
| 309 | +* `driver`: what host driver this belongs to |
| 310 | +* `version`: string; a driver variant has a version |
| 311 | +* `status`: string; development status, like "beta" |
| 312 | +* `hardware_present`: boolean; true if the host has hardware installed |
| 313 | + that is supported by this driver |
| 314 | + |
| 315 | +The only method available is `select(self)` to select a variant. It has |
| 316 | +the same effect as the `select` method on the `Host_driver` class. |
| 317 | + |
| 318 | +The CLI comes with corresponding `xe hostdriver-variant-*` commands to |
| 319 | +list and select a variant. |
| 320 | + |
| 321 | +``` |
| 322 | +[root@lcy2-dt110 log]# xe hostdriver-variant-list uuid=abf5997b-f2ad-c0ef-b27f-3f8a37bf58a6 |
| 323 | +uuid ( RO) : abf5997b-f2ad-c0ef-b27f-3f8a37bf58a6 |
| 324 | + name ( RO): generic |
| 325 | + version ( RO): 1.2 |
| 326 | + status ( RO): beta |
| 327 | + active ( RO): false |
| 328 | + selected ( RO): true |
| 329 | + driver-uuid ( RO): c0fe459d-5f8a-3fb1-3fe5-3c602fafecc0 |
| 330 | + driver-name ( RO): cisco-fnic |
| 331 | + host-uuid ( RO): 6de288e7-0f82-4563-b071-bcdc083b0ffd |
| 332 | + hw-present ( RO): false |
| 333 | +``` |
| 334 | + |
| 335 | +### Database |
| 336 | + |
| 337 | +Each `Host_driver` and `Driver_variant` object is represented in the |
| 338 | +database and data is persisted over reboots. This means this data will |
| 339 | +be part of data collected in a `xen-bugtool` invocation. |
| 340 | + |
| 341 | +### Scan and Rescan |
| 342 | + |
| 343 | +On XAPI start-up, XAPI updates the `Host_driver` objects belonging to the |
| 344 | +host to reflect the actual situation. This can be initiated from the |
| 345 | +CLI, too, mostly for development. |
| 346 | + |
| 347 | + |