
Commit adef6fa

committed
Fixing #1009 Doc upgrade/downgrade policy
[skip ci]
1 parent a4def13 commit adef6fa


4 files changed

+363
-2
lines changed

doc/boot.md

+3-2
@@ -349,5 +349,6 @@ can funtion reasonably well without a persistent `/var`, loosing
349349
If `var` is not available, Infix will still persist `/var/lib` using
350350
`cfg` as the backing storage.
351351

352-
[^1]: See [CLI Upgrade](cli/upgrade.md) for information on upgrading
353-
via CLI.
352+
[^1]: See [Upgrading procedures and boot
353+
order](system.md#upgrade-procedures-and-boot-order) for
354+
information on upgrading via CLI.

doc/cli/upgrade.md

+4
@@ -39,7 +39,11 @@ The secondary partition (`rootfs.1`) has now been upgraded and will be used as
3939
the *active* partition on the next boot. Leaving the primary partition, with
4040
the version we are currently running, intact in case of trouble.
4141

42+
See [upgrading procedures and boot order][2] for more information on
43+
upgrading.
44+
4245
[^1]: It is not possible to upgrade the partition we booted from. Thankfully
4346
the underlying "rauc" subsystem keeps track of this. Hence, to upgrade
4447
both partitions you must reboot to the new version (to verify it works)
4548
and then repeat the same command.
49+
[2]: system.md#upgrade-procedures-and-boot-order

doc/management.md

+8
@@ -193,3 +193,11 @@ admin@example:/config/web/> edit restconf
193193
admin@example:/config/web/restconf/> no enabled
194194
admin@example:/config/web/restconf/>
195195
```
196+
197+
# System Upgrade
198+
199+
See [upgrading procedures and boot order][1] for information on
200+
upgrading.
201+
202+
203+
[1]: system.md#upgrade-procedures-and-boot-order

doc/system.md

+348
@@ -323,7 +323,355 @@ reference ID, stratum, time offsets, frequency, and root delay.
323323
> The system uses `chronyd` Network Time Protocol (NTP) daemon. The
324324
> output shown here is best explained in the [Chrony documentation][4].
325325
326+
## Upgrade procedures and boot order
327+
328+
For resilience purposes, Infix maintains two software
329+
images referred to as the _primary_ and _secondary_ partition image.
330+
In addition, some bootloaders support [netbooting][6].
331+
332+
The _boot order_ defines which image is tried first, and is listed
333+
with the CLI `show software` command. It also shows the Infix version
334+
installed per partition, and which image was used when booting (`STATE
335+
booted`).
336+
337+
```
338+
admin@example:/> show software
339+
BOOT ORDER
340+
primary secondary net
341+
342+
NAME STATE VERSION DATE
343+
primary booted v25.01.0 2025-04-25T10:15:00+00:00
344+
secondary inactive v25.01.0 2025-04-25T10:07:20+00:00
345+
admin@example:/>
346+
```
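
For scripted checks, the listing above can be parsed with a few lines of Python. This is a minimal sketch, assuming the whitespace-separated layout shown in the example; the `Slot` type and `parse_show_software` helper are hypothetical names, not part of Infix.

```python
from typing import NamedTuple


class Slot(NamedTuple):
    name: str
    state: str
    version: str
    date: str


def parse_show_software(text):
    """Return (boot_order, slots) from `show software` output.

    Assumes the layout from the example above: a `BOOT ORDER` heading
    followed by one line of image names, then one four-column row per image.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    boot_order, slots = [], []
    for i, ln in enumerate(lines):
        if ln == "BOOT ORDER" and i + 1 < len(lines):
            boot_order = lines[i + 1].split()
        fields = ln.split()
        # Only rows that start with a known image name are slot rows.
        if len(fields) == 4 and fields[0] in ("primary", "secondary", "net"):
            slots.append(Slot(*fields))
    return boot_order, slots


SAMPLE = """\
BOOT ORDER
primary secondary net

NAME STATE VERSION DATE
primary booted v25.01.0 2025-04-25T10:15:00+00:00
secondary inactive v25.01.0 2025-04-25T10:07:20+00:00
"""

order, slots = parse_show_software(SAMPLE)
booted = [s.name for s in slots if s.state == "booted"]
print(order, booted)
```

The YANG model mentioned below is the better source for machine-readable data; parsing CLI text is only a convenience when nothing else is at hand.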
347+
348+
YANG support for upgrading Infix, inspecting and _modifying_ the
349+
boot-order, is defined in [infix-system-software][5].
350+
351+
352+
### Upgrading Infix
353+
354+
Upgrading Infix is done one partition at a time. If the system has
355+
booted from one partition, an `upgrade` will apply to the other
356+
(inactive) partition.
357+
358+
1. Download and unpack the release to install. Make the image *pkg*
359+
bundle available at some suitable URL (FTP/TFTP/SFTP/HTTP/HTTPS).
360+
1. Assume the unit has booted the `primary` image. Then running the
361+
`upgrade` command installs a new image on the `secondary`
362+
partition.
363+
1. As part of a successful upgrade, the boot-order is implicitly
364+
changed to boot the newly installed image.
365+
1. Reboot the unit.
366+
1. During boot, the unit may [migrate](#configuration-migration) the
367+
startup configuration in line with newer configuration definitions.
368+
1. The unit now runs the new image. To upgrade the remaining partition
369+
(`primary`), run the same upgrade command again, and reboot.
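
The partition handling in the steps above can be modelled as follows. This is an illustrative sketch only: the function names are hypothetical, and the reordering rule (the upgraded partition moves to the front, the rest keep their relative order) is an assumption inferred from the `show software` examples in this section, not taken from the Infix or RAUC sources.

```python
# The two on-disk partitions; each one's upgrade target is the other.
OTHER = {"primary": "secondary", "secondary": "primary"}


def upgrade_target(booted):
    """Return the partition an `upgrade` would write to.

    Raises KeyError for anything other than the two disk partitions,
    e.g. a netbooted system, which this sketch does not model.
    """
    return OTHER[booted]


def boot_order_after_upgrade(boot_order, booted):
    """Move the freshly upgraded partition to the front of the boot order."""
    target = upgrade_target(booted)
    return [target] + [name for name in boot_order if name != target]


print(boot_order_after_upgrade(["primary", "secondary", "net"], "primary"))
```

Running the same function again with `secondary` as the booted partition models the second upgrade round, where `primary` receives the new image.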
370+
371+
The CLI example below shows steps 2-4.
372+
373+
Upgrade: here the image *pkg bundle* was made available via TFTP.
374+
375+
```
376+
admin@example:/> upgrade tftp://198.18.117.1/infix-aarch64-25.03.1.pkg
377+
installing
378+
0% Installing
379+
0% Determining slot states
380+
10% Determining slot states done.
381+
...
382+
98% Copying image to rootfs.1
383+
99% Copying image to rootfs.1
384+
99% Copying image to rootfs.1 done.
385+
99% Updating slots done.
386+
100% Installing done.
387+
Installing `tftp://198.18.117.1/infix-aarch64-25.03.1.pkg` succeeded
388+
admin@example:/>
389+
```
390+
391+
Reboot: The unit will boot on the other partition, with the newly
392+
installed image. The `Loading startup-config` step conducts migration
393+
of startup configuration if applicable.
394+
395+
```
396+
admin@example:/> reboot
397+
[ OK ] Stopping Static routing daemon
398+
[ OK ] Stopping Zebra routing daemon
399+
...
400+
[ OK ] Loading startup-config
401+
[ OK ] Verifying self-signed https certificate
402+
[ OK ] Update DNS configuration
403+
[ OK ] Starting Status daemon
404+
405+
Infix -- a Network Operating System v25.03.1 (ttyS0)
406+
example login: admin
407+
Password:
408+
.-------.
409+
| . . | Infix -- a Network Operating System
410+
|-. v .-| https://kernelkit.org
411+
'-'---'-'
412+
413+
Run the command 'cli' for interactive OAM
414+
415+
admin@example:~$ cli
416+
417+
See the 'help' command for an introduction to the system
418+
419+
admin@example:/> show software
420+
BOOT ORDER
421+
secondary primary net
422+
423+
NAME STATE VERSION DATE
424+
primary inactive v25.01.0 2025-04-25T10:15:00+00:00
425+
secondary booted v25.03.1 2025-04-25T10:24:31+00:00
426+
admin@example:/>
427+
```
428+
429+
As shown, the *boot order* has been updated: the primary and secondary
430+
partitions have swapped places, so that the secondary is now the first choice.
431+
432+
To upgrade the remaining partition (`primary`), run the `upgrade URL`
433+
command again, and reboot.
434+
435+
### Configuration Migration
436+
437+
The example above illustrated an upgrade from Infix v25.01.0 to
438+
v25.03.1. Between these versions, YANG configuration definitions
439+
changed slightly (more details given below).
440+
441+
During boot, Infix inspects the `version` meta information within the
442+
startup configuration file to determine if configuration migration is
443+
needed. In this specific case, the configuration file has version
444+
`1.4` while the booted software expects version `1.5` (the
445+
configuration version numbering differs from the Infix image version
446+
numbering). The startup configuration is migrated to `1.5`
447+
definitions and stored, while a backup of the previous startup configuration
448+
is stored in directory `/cfg/backup/`.
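
The version check described above can be summarised as a small decision function. This is a sketch under stated assumptions: the names are hypothetical, versions are assumed to be dotted integers, and the "newer than expected" branch reflects the downgrade behaviour described later in this section, where the startup configuration may fail to load.

```python
def parse_version(version):
    """Turn a dotted version such as "1.4" into a comparable tuple (1, 4)."""
    return tuple(int(part) for part in version.split("."))


def startup_action(config_version, expected_version):
    """Decide what happens to startup-config at boot (illustrative only)."""
    have = parse_version(config_version)
    want = parse_version(expected_version)
    if have == want:
        return "load"            # versions match, nothing to migrate
    if have < want:
        return "migrate"         # older config is migrated, old file backed up
    return "failure-config"      # config is newer than the software understands


print(startup_action("1.4", "1.5"))
```

Comparing tuples rather than strings keeps the ordering correct once a component reaches two digits, for example `1.10` versus `1.9`.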
449+
450+
```
451+
admin@example:/> dir /cfg/backup/
452+
/cfg/backup/ directory
453+
startup-config-1.4.cfg
454+
455+
admin@example:/>
456+
```
457+
458+
The modifications made to the startup configuration can be viewed by
459+
comparing the files from the *shell*. An example is shown below.
460+
461+
```
462+
admin@example:/> exit
463+
admin@example:~$ diff /cfg/backup/startup-config-1.4.cfg /cfg/startup-config.cfg
464+
--- /cfg/backup/startup-config-1.4.cfg
465+
+++ /cfg/startup-config.cfg
466+
...
467+
- "public-key-format": "ietf-crypto-types:ssh-public-key-format",
468+
+ "public-key-format": "infix-crypto-types:ssh-public-key-format",
469+
...
470+
- "private-key-format": "ietf-crypto-types:rsa-private-key-format",
471+
+ "private-key-format": "infix-crypto-types:rsa-private-key-format",
472+
...
473+
- "version": "1.4"
474+
+ "version": "1.5"
475+
...
476+
admin@example:~$
477+
```
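
Where the two files have been copied off the unit, a similar comparison can be made with Python's standard `difflib`. This is a minimal sketch: the flat dictionaries below are stand-ins built from the keys in the diff above, not the real structure of an Infix configuration file, and `config_diff` is a hypothetical helper.

```python
import difflib
import json


def config_diff(old, new, old_name, new_name):
    """Return unified-diff lines between two JSON-serialisable configs."""
    # Sorting keys gives a stable line order, so only real changes show up.
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(old_lines, new_lines, old_name, new_name, lineterm="")
    )


# Stand-in data using keys from the example; real files are nested.
old_cfg = {
    "public-key-format": "ietf-crypto-types:ssh-public-key-format",
    "version": "1.4",
}
new_cfg = {
    "public-key-format": "infix-crypto-types:ssh-public-key-format",
    "version": "1.5",
}

for line in config_diff(old_cfg, new_cfg, "startup-config-1.4.cfg", "startup-config.cfg"):
    print(line)
```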
478+
479+
### Downgrading Infix
480+
481+
Downgrading to an earlier Infix version is possible. However,
482+
downgrading is **not** guaranteed to work smoothly. In particular,
483+
when the unit boots up with the downgraded version, it may fail to
484+
apply the *startup config*, and instead apply its [failure config][7].
485+
486+
We consider two cases: downgrading with or without applying a backup
487+
startup configuration before rebooting.
488+
489+
In both cases we start out with a unit running Infix v25.03.1, and
490+
wish to downgrade to v25.01.0.
491+
492+
```
493+
admin@example:/> show software
494+
BOOT ORDER
495+
primary secondary net
496+
497+
NAME STATE VERSION DATE
498+
primary booted v25.03.1 2025-04-25T11:36:26+00:00
499+
secondary inactive v25.03.1 2025-04-25T10:24:31+00:00
500+
admin@example:/>
501+
```
502+
503+
#### Downgrading when applying a backup startup configuration
504+
505+
This is the recommended approach to downgrade, given that you have a
506+
backup configuration available. By restoring the backup
507+
configuration, ending up in failure-config after downgrading can be
508+
avoided.
509+
510+
1. Find the backup configuration file.
511+
1. Run `upgrade URL` to install the Infix image to downgrade to.
512+
1. Copy the backup startup configuration to the current startup configuration
513+
(from shell).
514+
1. Reboot.
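
Choosing which backup to restore in the first step can be sketched as below. The `startup-config-<version>.cfg` naming pattern is inferred from the single example file in this section, and `pick_backup` is a hypothetical helper; it picks the newest backup whose version is not newer than what the target software expects.

```python
import re

# Naming pattern inferred from the example `startup-config-1.4.cfg`.
BACKUP_RE = re.compile(r"^startup-config-(\d+(?:\.\d+)*)\.cfg$")


def pick_backup(filenames, expected_version):
    """Return the best backup file name for the target version, or None."""
    want = tuple(int(p) for p in expected_version.split("."))
    candidates = []
    for name in filenames:
        match = BACKUP_RE.match(name)
        if not match:
            continue  # not a versioned backup file
        version = tuple(int(p) for p in match.group(1).split("."))
        if version <= want:
            candidates.append((version, name))
    # max() compares the version tuples first, so the newest usable one wins.
    return max(candidates)[1] if candidates else None


print(pick_backup(["startup-config-1.4.cfg", "notes.txt"], "1.4"))
```

A result of `None` corresponds to the second case in this section, where no suitable backup exists and a factory reset is needed after the downgrade.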
515+
516+
517+
*Find the backup configuration file:*
518+
519+
Assume you have a backup startup config for the version to downgrade
520+
to (here Infix v25.01.0, config `version 1.4`).
521+
522+
```
523+
admin@example:/> dir /cfg/backup/
524+
/cfg/backup/ directory
525+
startup-config-1.4.cfg
526+
527+
admin@example:/>
528+
```
529+
530+
*Use `upgrade` command to downgrade:*
531+
532+
```
533+
admin@example:/> upgrade tftp://198.18.117.1/infix-aarch64-25.01.0.pkg
534+
installing
535+
0% Installing
536+
0% Determining slot states
537+
10% Determining slot states done.
538+
...
539+
99% Copying image to rootfs.1 done.
540+
99% Updating slots done.
541+
100% Installing done.
542+
Installing `tftp://198.18.117.1/infix-aarch64-25.01.0.pkg` succeeded
543+
admin@example:/>
544+
```
545+
546+
*Apply the backup configuration file:*
547+
548+
```
549+
admin@example:/> copy /cfg/backup/startup-config-1.4.cfg /cfg/startup-config.cfg
550+
Overwrite existing file /cfg/startup-config.cfg (y/N)? y
551+
admin@example:/>
552+
```
553+
554+
*Reboot:*
555+
556+
The unit will come up with the applied backup configuration instead of
557+
failure config.
558+
559+
```
560+
admin@example:/> reboot
561+
[ OK ] Saving system clock to file
562+
[ OK ] Stopping Software update service
563+
[ OK ] Stopping Status daemon
564+
...
565+
[ OK ] Bootstrapping YANG datastore
566+
[ OK ] Starting Configuration daemon
567+
[ OK ] Loading startup-config
568+
[ OK ] Update DNS configuration
569+
[ OK ] Verifying self-signed https certificate
570+
[ OK ] Starting Status daemon
571+
572+
Infix -- a Network Operating System v25.01.0 (ttyS0)
573+
example login:
574+
```
575+
576+
#### Downgrading without applying a backup startup configuration
577+
578+
This procedure assumes you have access to the unit's console port and
579+
its default login credentials[^9].
580+
581+
1. Downgrade
582+
1. Reboot
583+
1. Log in with the unit's default credentials
584+
1. Conduct factory reset
585+
1. (Then go on to configure the unit as you wish)
586+
587+
*Use `upgrade` command to downgrade:*
588+
589+
```
590+
admin@example:/> upgrade tftp://198.18.117.1/infix-aarch64-25.01.0.pkg
591+
installing
592+
0% Installing
593+
0% Determining slot states
594+
10% Determining slot states done.
595+
...
596+
99% Copying image to rootfs.1 done.
597+
99% Updating slots done.
598+
100% Installing done.
599+
Installing `tftp://198.18.117.1/infix-aarch64-25.01.0.pkg` succeeded
600+
admin@example:/>
601+
```
602+
603+
*Reboot:*
604+
605+
Conduct a reboot. During boot, the unit fails to apply the existing
606+
startup configuration (config version `1.5` while software expects
607+
version `1.4` or earlier), and instead applies its [failure
608+
config][7]. This is what is seen on the console when this situation
609+
occurs. Note that the login prompt displays `failed` as part of the
610+
*hostname*.
611+
612+
```
613+
admin@example:/> reboot
614+
[ OK ] Saving system clock to file
615+
[ OK ] Stopping Software update service
616+
[ OK ] Stopping Status daemon
617+
...
618+
[ OK ] Verifying SSH host keys
619+
[ OK ] Bootstrapping YANG datastore
620+
[ OK ] Starting Configuration daemon
621+
[FAIL] Loading startup-config
622+
[ OK ] Loading failure-config
623+
[ OK ] Verifying self-signed https certificate
624+
[ OK ] Starting Status daemon
625+
626+
Infix -- a Network Operating System v25.01.0 (ttyS0)
627+
628+
ERROR: Corrupt startup-config, system has reverted to default login credentials
629+
failed-00-00-00 login:
630+
```
631+
632+
To remedy a situation like this, you can log in with the unit's *default
633+
login credentials*, preferably via a [console port][8], and then
634+
conduct a factory reset[^9].
635+
The unit's default credentials are typically printed on a sticker on
636+
the unit.
637+
638+
```
639+
failed-00-00-00 login: admin
640+
Password:
641+
642+
Run the command 'cli' for interactive OAM
643+
644+
admin@failed-00-00-00:~$
645+
admin@failed-00-00-00:~$ factory
646+
Factory reset device (y/N)? y
647+
factory: scheduled factory reset on next boot.
648+
Reboot now to perform reset, (y/N)? y
649+
[ OK ] Saving system time (UTC) to RTC
650+
[ OK ] Stopping mDNS alias advertiser
651+
...
652+
[ OK ] Starting Configuration daemon
653+
[ OK ] Loading startup-config
654+
[ OK ] Update DNS configuration
655+
[ OK ] Verifying self-signed https certificate
656+
[ OK ] Starting Status daemon
657+
[ OK ] Starting Status daemon
658+
659+
660+
Please press Enter to activate this console.
661+
662+
Infix -- a Network Operating System v25.01.0 (ttyS0)
663+
example login:
664+
```
665+
326666
[1]: https://www.rfc-editor.org/rfc/rfc7317
327667
[2]: https://github.com/kernelkit/infix/blob/main/src/confd/yang/infix-system%402024-02-29.yang
328668
[3]: https://www.rfc-editor.org/rfc/rfc8341
329669
[4]: https://chrony-project.org/doc/4.6.1/chronyc.html
670+
[5]: https://github.com/kernelkit/infix/blob/main/src/confd/yang/infix-system-software.yang
671+
[6]: boot.md#system-configuration
672+
[7]: introduction.md#system-boot
673+
[8]: management.md#console-port
674+
[^9]: In failure config, Infix puts all Ethernet ports as individual
675+
interfaces. With direct access, one can connect with e.g., SSH,
676+
using link-local IPv6 addresses. This is an alternative to
677+
connecting via a console port.
