Commit 97e2356

Merge pull request #84216 from aravipra/OSDOCS-12095
OSDOCS-12095: adding autorecover from manual backups
2 parents 853dd9d + 11d1082 commit 97e2356

7 files changed

+354
-0
lines changed

_topic_maps/_topic_map_ms.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -227,6 +227,8 @@ Distros: microshift
227227
Topics:
228228
- Name: Backing up and restoring data
229229
File: microshift-backup-and-restore
230+
- Name: Auto recovery from manual backups
231+
File: microshift-auto-recover-manual-backup
230232
---
231233
Name: Troubleshooting
232234
Dir: microshift_troubleshooting
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
[id="microshift-auto-recover-manual-backup"]
3+
= Auto recovery from manual backups
4+
include::_attributes/attributes-microshift.adoc[]
5+
:context: microshift-auto-recover-manual-backup
6+
7+
toc::[]
8+
9+
You can automatically restore data from manual backups when {microshift-short} fails to start by using the `auto-recovery` feature.
10+
11+
You can use the following options with the existing `backup` and `restore` commands in this feature:
12+
13+
* `--auto-recovery`: Selects the most recent version of the backup, and then restores it. This option treats the `PATH` argument as a path to a directory that holds all the backups for auto-recovery, and not just as a path to a particular backup file.
14+
* `--dont-save-failed`: Disables the backup of failed {microshift-short} data.
15+
16+
[NOTE]
17+
====
18+
* You can use the `--auto-recovery` option with both the `backup` and `restore` commands.
19+
* You can use the `--dont-save-failed` option only with the `restore` command.
20+
====
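The way the two options combine on the `restore` command can be sketched as a small wrapper. This is a minimal illustration, not part of {microshift-short}: the `auto_recovery_restore` function name, its `yes`/`no` second argument, and the `MICROSHIFT_BIN` override are assumptions made for this example. Only the `restore` subcommand and the `--auto-recovery` and `--dont-save-failed` options come from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: auto_recovery_restore <backup_dir> [yes|no]
# MICROSHIFT_BIN is an assumed override used for dry runs; it is not a MicroShift setting.
auto_recovery_restore() {
    local backup_dir="$1"
    local save_failed="${2:-yes}"

    # With --auto-recovery, the PATH argument is the directory that holds all backups,
    # so reject anything that is not an existing directory.
    if [ ! -d "$backup_dir" ]; then
        echo "Backup storage directory not found: $backup_dir" >&2
        return 1
    fi

    if [ "$save_failed" = "no" ]; then
        # Skip the copy of the failed MicroShift data.
        "${MICROSHIFT_BIN:-microshift}" restore --auto-recovery --dont-save-failed "$backup_dir"
    else
        "${MICROSHIFT_BIN:-microshift}" restore --auto-recovery "$backup_dir"
    fi
}
```

Setting `MICROSHIFT_BIN=echo` prints the command line instead of running it, which can help when you review automation before running it as root.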
21+
22+
include::modules/microshift-creating-backups.adoc[leveloffset=+1]
23+
24+
include::modules/microshift-restoring-backups.adoc[leveloffset=+1]
25+
26+
include::modules/microshift-automation-example-rpm-systems.adoc[leveloffset=+1]
27+
28+
include::modules/microshift-automation-example-ostree-systems.adoc[leveloffset=+1]
29+
30+
include::modules/microshift-automation-example-bootc-systems.adoc[leveloffset=+1]
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * microshift/microshift_backup_and_restore/microshift-auto-recover-manual-backup.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="microshift-automation-example-bootc-systems_{context}"]
7+
= Automating the integration process with systemd for bootc systems
8+
9+
[IMPORTANT]
10+
====
11+
You must include the entire `auto-recovery` process for bootc systems that use `systemd` in the container file.
12+
====
13+
14+
The following example shows how to automate the `auto-recovery` process for bootc systems that use systemd.
15+
16+
.Prerequisites
17+
18+
* You have created the `10-auto-recovery.conf` and `microshift-auto-recovery.service` files as explained in the "Automating the integration process with systemd for RPM systems" section.
19+
* You have created the `microshift-auto-recovery` script as explained in the "Automating the integration process with systemd for RPM systems" section.
20+
21+
.Procedure
22+
23+
* Use the following example to update the Containerfile that you use to prepare the bootc image.
24+
+
25+
[source,text]
26+
----
27+
RUN mkdir -p /usr/lib/systemd/system/microshift.service.d
28+
COPY ./auto-rec/10-auto-recovery.conf /usr/lib/systemd/system/microshift.service.d/10-auto-recovery.conf
29+
COPY ./auto-rec/microshift-auto-recovery.service /usr/lib/systemd/system/
30+
COPY ./auto-rec/microshift-auto-recovery /usr/bin/
31+
RUN chmod +x /usr/bin/microshift-auto-recovery && systemctl daemon-reload
32+
----
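The `COPY` instructions in the Containerfile example expect the files to exist in an `auto-rec` directory in the build context. The following sketch shows one way to stage the two unit files there, reusing the contents from the RPM section. The `stage_auto_rec` function name is an assumption made for this example, and the sketch only reminds you about the `microshift-auto-recovery` script instead of generating it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stage_auto_rec [dest_dir]
# Writes the systemd drop-in and the service unit into the build context.
stage_auto_rec() {
    local dest="${1:-./auto-rec}"
    mkdir -p "$dest"

    # Drop-in that tells systemd which unit to start when microshift.service fails.
    cat > "$dest/10-auto-recovery.conf" <<'EOF'
[Unit]
OnFailure=microshift-auto-recovery.service
EOF

    # One-shot unit that runs the recovery script.
    cat > "$dest/microshift-auto-recovery.service" <<'EOF'
[Unit]
Description=MicroShift auto-recovery

[Service]
Type=oneshot
ExecStart=/usr/bin/microshift-auto-recovery

[Install]
WantedBy=multi-user.target
EOF

    # The script itself comes from the RPM section; this sketch does not create it.
    if [ ! -f "$dest/microshift-auto-recovery" ]; then
        echo "Reminder: copy the microshift-auto-recovery script into $dest" >&2
    fi
    return 0
}
```

Run `stage_auto_rec ./auto-rec` from the directory that contains the Containerfile before you build the image.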
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * microshift/microshift_backup_and_restore/microshift-auto-recover-manual-backup.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="microshift-automation-example-ostree-systems_{context}"]
7+
= Automating the integration process with systemd for OSTree systems
8+
9+
[IMPORTANT]
10+
====
11+
You must include the entire `auto-recovery` process for OSTree systems that use `systemd` in the blueprint file.
12+
====
13+
14+
The following example shows how to automate the `auto-recovery` process for OSTree systems that use systemd.
15+
16+
.Procedure
17+
18+
. Use the following example to create your blueprint file:
19+
+
20+
[source,toml]
21+
----
22+
[[customizations.files]]
23+
path = "/usr/lib/systemd/system/microshift.service.d/10-auto-recovery.conf"
24+
data = """
25+
[Unit]
26+
OnFailure=microshift-auto-recovery.service
27+
"""
28+
29+
[[customizations.files]]
30+
path = "/usr/lib/systemd/system/microshift-auto-recovery.service"
31+
data = """
32+
[Unit]
33+
Description=MicroShift auto-recovery
34+
35+
[Service]
36+
Type=oneshot
37+
ExecStart=/usr/bin/microshift-auto-recovery
38+
39+
[Install]
40+
WantedBy=multi-user.target
41+
"""
42+
43+
[[customizations.files]]
44+
path = "/usr/bin/microshift-auto-recovery"
45+
mode = "0755"
46+
data = """
47+
#!/usr/bin/env bash
48+
set -xeuo pipefail
49+
50+
# If greenboot uses a non-default file for clearing boot_counter, use boot_success instead.
51+
if grep -q "/boot/grubenv" /usr/libexec/greenboot/greenboot-grub2-set-success; then
52+
if grub2-editenv - list | grep -q ^boot_success=0; then
53+
echo "Greenboot didn't decide the system is healthy after staging a new deployment."
54+
echo "Quitting to not interfere with the process"
55+
exit 0
56+
fi
57+
else
58+
if grub2-editenv - list | grep -q ^boot_counter=; then
59+
echo "Greenboot didn't decide the system is healthy after staging a new deployment."
60+
echo "Quitting to not interfere with the process"
61+
exit 0
62+
fi
63+
fi
64+
65+
/usr/bin/microshift restore --auto-recovery /var/lib/microshift-auto-recovery
66+
/usr/bin/systemctl reset-failed microshift
67+
/usr/bin/systemctl start microshift
68+
69+
echo "DONE"
70+
"""
71+
----
72+
. For the next steps, see link:https://docs.redhat.com/en/documentation/red_hat_build_of_microshift/{ocp-version}/html/embedding_in_a_rhel_for_edge_image/microshift-embed-in-rpm-ostree#preparing-for-image-building_microshift-embed-in-rpm-ostree[Preparing for image building].
Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * microshift/microshift_backup_and_restore/microshift-auto-recover-manual-backup.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="microshift-automation-example-rpm-systems_{context}"]
7+
= Automating the integration process with systemd for RPM systems
8+
9+
[NOTE]
10+
====
11+
When the `microshift.service` enters a failed state, `systemd` starts the `microshift-auto-recovery.service` unit. This unit executes the `auto-recovery` restore process and restarts {microshift-short}.
12+
====
13+
14+
The following example shows how to automate the `auto-recovery` process for RPM systems that use systemd.
15+
16+
.Procedure
17+
18+
. Create a directory for the `microshift.service` by running the following command:
19+
+
20+
[source,terminal]
21+
----
22+
$ sudo mkdir -p /usr/lib/systemd/system/microshift.service.d
23+
----
24+
. To instruct `systemd` to run `microshift-auto-recovery.service` when the `microshift.service` fails, create the `10-auto-recovery.conf` file by running the following command:
25+
+
26+
[source,terminal]
27+
----
28+
$ sudo tee /usr/lib/systemd/system/microshift.service.d/10-auto-recovery.conf > /dev/null <<'EOF'
29+
[Unit]
30+
OnFailure=microshift-auto-recovery.service
31+
EOF
32+
----
33+
. Create the `microshift-auto-recovery.service` file by running the following command:
34+
+
35+
[source,terminal]
36+
----
37+
$ sudo tee /usr/lib/systemd/system/microshift-auto-recovery.service > /dev/null <<'EOF'
38+
[Unit]
39+
Description=MicroShift auto-recovery
40+
41+
[Service]
42+
Type=oneshot
43+
ExecStart=/usr/bin/microshift-auto-recovery
44+
45+
[Install]
46+
WantedBy=multi-user.target
47+
EOF
48+
----
49+
. Create the `microshift-auto-recovery` script by running the following command:
50+
+
51+
[source,terminal]
52+
----
53+
$ sudo tee /usr/bin/microshift-auto-recovery > /dev/null <<'EOF'
54+
#!/usr/bin/env bash
55+
set -xeuo pipefail
56+
57+
# If greenboot uses a non-default file for clearing boot_counter, use boot_success instead.
58+
if grep -q "/boot/grubenv" /usr/libexec/greenboot/greenboot-grub2-set-success; then
59+
if grub2-editenv - list | grep -q ^boot_success=0; then
60+
echo "Greenboot didn't decide the system is healthy after staging a new deployment."
61+
echo "Quitting to not interfere with the process"
62+
exit 0
63+
fi
64+
else
65+
if grub2-editenv - list | grep -q ^boot_counter=; then
66+
echo "Greenboot didn't decide the system is healthy after staging a new deployment."
67+
echo "Quitting to not interfere with the process"
68+
exit 0
69+
fi
70+
fi
71+
72+
/usr/bin/microshift restore --auto-recovery /var/lib/microshift-auto-recovery
73+
/usr/bin/systemctl reset-failed microshift
74+
/usr/bin/systemctl start microshift
75+
76+
echo "DONE"
77+
EOF
78+
----
79+
. Make the script executable by running the following command:
80+
+
81+
[source,terminal]
82+
----
83+
$ sudo chmod +x /usr/bin/microshift-auto-recovery
84+
----
85+
. Reload the system configuration by running the following command:
86+
+
87+
[source,terminal]
88+
----
89+
$ sudo systemctl daemon-reload
90+
----
91+
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * microshift/microshift_backup_and_restore/microshift-auto-recover-manual-backup.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="microshift-creating-backups_{context}"]
7+
= Creating backups using the auto-recovery feature
8+
9+
Use the following procedure to create backups.
10+
11+
[NOTE]
12+
====
13+
Creating backups requires stopping {microshift-short}, so you must determine the best time to stop {microshift-short}.
14+
====
15+
16+
.Prerequisites
17+
18+
* You have stopped {microshift-short}.
19+
20+
.Procedure
21+
22+
* Create and store backups in the directory you choose by running the following command:
23+
+
24+
[source,terminal]
25+
[subs="+quotes"]
26+
----
27+
$ sudo microshift backup --auto-recovery _<path_of_directory>_ <1>
28+
----
29+
<1> Replace `_<path_of_directory>_` with the path of the directory that stores backups. For example, `/var/lib/microshift-auto-recovery`.
30+
+
31+
[NOTE]
32+
====
33+
The `--auto-recovery` option modifies the interpretation of the `PATH` argument from the final backup path to a directory that holds all the backups for auto-recovery.
34+
====
35+
+
36+
.Example output
37+
+
38+
[source,terminal]
39+
----
40+
??? I1104 09:18:52.100725 8906 system.go:58] "OSTree deployments" deployments=[{"id":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1","booted":true,"staged":false,"pinned":false},{"id":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","booted":false,"staged":false,"pinned":false}]
41+
??? I1104 09:18:52.100895 8906 data_manager.go:83] "Copying data to backup directory" storage="/var/lib/microshift-auto-recovery" name="20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1" data="/var/lib/microshift"
42+
??? I1104 09:18:52.102296 8906 disk_space.go:33] Calculated size of "/var/lib/microshift": 261M - increasing by 10% for safety: 287M
43+
??? I1104 09:18:52.102321 8906 disk_space.go:44] Calculated available disk space for "/var/lib/microshift-auto-recovery": 1658M
44+
??? I1104 09:18:52.105700 8906 atomic_dir_copy.go:66] "Made an intermediate copy" cmd="/bin/cp --verbose --recursive --preserve --reflink=auto /var/lib/microshift /var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1.tmp.99142"
45+
??? I1104 09:18:52.105732 8906 atomic_dir_copy.go:115] "Renamed to final destination" src="/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1.tmp.99142" dest="/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"
46+
??? I1104 09:18:52.105749 8906 data_manager.go:120] "Copied data to backup directory" backup="/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1" data="/var/lib/microshift"
47+
/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1
48+
----
49+
50+
.Verification
51+
52+
* To verify that the backup has been created, view the directory you chose to store backups by running the following command:
53+
+
54+
[source,terminal]
55+
[subs="+quotes"]
56+
----
57+
$ ls -la _<path_of_directory>_ <1>
58+
----
59+
<1> Replace `_<path_of_directory>_` with the path of the directory that stores backups. For example, `/var/lib/microshift-auto-recovery`.
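To go one step further than listing the directory, the following sketch prints the backup with the newest timestamp in its name. It assumes that backup directory names start with a `YYYYMMDDHHMMSS_` prefix, as in the example output above. The `latest_backup` function name is hypothetical. The sketch looks only at names, so it is not the selection logic that `microshift restore --auto-recovery` applies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: latest_backup <storage_dir>
# Prints the backup directory whose name carries the newest timestamp prefix.
latest_backup() {
    local storage="$1"
    local newest=""
    local entry

    # Glob results are sorted by name, so the last match has the newest timestamp.
    # Names that do not start with a digit are never matched.
    for entry in "$storage"/[0-9]*_*; do
        if [ -d "$entry" ]; then
            newest="$entry"
        fi
    done

    # Fail when the directory holds no backups.
    if [ -z "$newest" ]; then
        return 1
    fi
    printf '%s\n' "$newest"
}
```

For example, `latest_backup /var/lib/microshift-auto-recovery` prints the full path of the most recently created backup, or returns a non-zero status if no backup exists.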
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * microshift/microshift_backup_and_restore/microshift-auto-recover-manual-backup.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="microshift-restoring-backups_{context}"]
7+
= Restoring backups using the auto-recovery feature
8+
9+
You can restore backups after system events that remove or damage required data.
10+
11+
Use the following procedure to restore backups.
12+
13+
.Prerequisites
14+
15+
* You have stopped {microshift-short}.
16+
17+
.Procedure
18+
19+
* Restore the backup from the directory in which you have stored the backups by running the following command:
20+
+
21+
[source,terminal]
22+
[subs="+quotes"]
23+
----
24+
$ sudo microshift restore --auto-recovery _<path_of_directory>_ <1>
25+
----
26+
<1> Replace `_<path_of_directory>_` with the path of the directory that stores backups. For example, `/var/lib/microshift-auto-recovery`.
27+
+
28+
[NOTE]
29+
====
30+
The `--auto-recovery` option copies the {microshift-short} data to `/var/lib/microshift-auto-recovery/failed/` for later investigation, selects the most recent backup, and restores it.
31+
The `--dont-save-failed` option disables the backing up of failed {microshift-short} data.
32+
====
33+
+
34+
.Example output
35+
+
36+
[source,terminal]
37+
----
38+
??? I1104 09:19:28.617225 8950 state.go:80] "Read state from the disk" state={"LastBackup":"20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"}
39+
??? I1104 09:19:28.617323 8950 storage.go:78] "Auto-recovery backup storage read and parsed" dirs=["20241022101255_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","20241022101520_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1","restored"] backups=[{"CreationTime":"2024-10-22T10:12:55Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-10-22T10:15:20Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-10-22T10:15:28Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}]
40+
??? I1104 09:19:28.617350 8950 storage.go:40] "Filtered list of backups - removed previously restored backup" removed="20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0" newList=[{"CreationTime":"2024-10-22T10:12:55Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-10-22T10:15:20Z","Version":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"},{"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}]
41+
??? I1104 09:19:28.633237 8950 system.go:58] "OSTree deployments" deployments=[{"id":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1","booted":true,"staged":false,"pinned":false},{"id":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","booted":false,"staged":false,"pinned":false}]
42+
??? I1104 09:19:28.633258 8950 storage.go:49] "Filtered list of backups by version" version="default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1" newList=[{"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}]
43+
??? I1104 09:19:28.633268 8950 restore.go:170] "Potential backups" bz=[{"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}]
44+
??? I1104 09:19:28.633277 8950 restore.go:173] "Candidate backup for restore" b={"CreationTime":"2024-11-04T09:18:52Z","Version":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"}
45+
??? I1104 09:19:28.634007 8950 disk_space.go:33] Calculated size of "/var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1": 261M - increasing by 10% for safety: 287M
46+
??? I1104 09:19:28.634096 8950 disk_space.go:44] Calculated available disk space for "/var/lib": 1658M
47+
??? I1104 09:19:28.634507 8950 disk_space.go:33] Calculated size of "/var/lib/microshift": 261M - increasing by 10% for safety: 287M
48+
??? I1104 09:19:28.634522 8950 disk_space.go:44] Calculated available disk space for "/var/lib/microshift-auto-recovery": 1658M
49+
??? I1104 09:19:28.649719 8950 system.go:58] "OSTree deployments" deployments=[{"id":"default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1","booted":true,"staged":false,"pinned":false},{"id":"default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0","booted":false,"staged":false,"pinned":false}]
50+
??? I1104 09:19:28.653880 8950 atomic_dir_copy.go:66] "Made an intermediate copy" cmd="/bin/cp --verbose --recursive --preserve --reflink=auto /var/lib/microshift /var/lib/microshift-auto-recovery/failed/20241104091928_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1.tmp.22742"
51+
??? I1104 09:19:28.657362 8950 atomic_dir_copy.go:66] "Made an intermediate copy" cmd="/bin/cp --verbose --recursive --preserve --reflink=auto /var/lib/microshift-auto-recovery/20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1 /var/lib/microshift.tmp.482"
52+
??? I1104 09:19:28.657385 8950 state.go:40] "Saving intermediate state" state="{\"LastBackup\":\"20241104091852_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1\"}" path="/var/lib/microshift-auto-recovery/state.json.tmp.41544"
53+
??? I1104 09:19:28.662438 8950 atomic_dir_copy.go:115] "Renamed to final destination" src="/var/lib/microshift.tmp.482" dest="/var/lib/microshift"
54+
??? I1104 09:19:28.662451 8950 state.go:46] "Moving state file to final path" intermediatePath="/var/lib/microshift-auto-recovery/state.json.tmp.41544" finalPath="/var/lib/microshift-auto-recovery/state.json"
55+
??? I1104 09:19:28.662521 8950 atomic_dir_copy.go:115] "Renamed to final destination" src="/var/lib/microshift-auto-recovery/failed/20241104091928_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1.tmp.22742" dest="/var/lib/microshift-auto-recovery/failed/20241104091928_default-b3442053c9ce69310cd54140d8d592234c5306e4c5132de6efe615f79c84300a.1"
56+
??? I1104 09:19:28.662969 8950 atomic_dir_copy.go:115] "Renamed to final destination" src="/var/lib/microshift-auto-recovery/20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0" dest="/var/lib/microshift-auto-recovery/restored/20241022101528_default-a129624b9233fa54fe3574f1aa211bc2d85e1052b52245fe7d83f10c2f6d28e3.0"
57+
??? I1104 09:19:28.662983 8950 restore.go:141] "Auto-recovery restore completed".
58+
----
59+
+
60+
[NOTE]
61+
====
62+
* The `restore` command does not start {microshift-short} after restoration. Run this command only when the {microshift-short} service has already failed or after you have stopped it.
63+
* {microshift-short} does not monitor the disk space of any filesystem. Ensure that your automation removes old backups.
64+
====
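Because old backups are not removed for you, automation typically needs a pruning step. The following is a minimal sketch, assuming that backup directory names start with a `YYYYMMDDHHMMSS_` timestamp as in the example output. The `prune_backups` function name and the default of keeping three backups are assumptions made for this example. The sketch leaves the `failed` and `restored` subdirectories untouched, so clean those up separately if they grow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prune_backups <storage_dir> [keep_count]
# Deletes the oldest timestamped backups until only keep_count remain.
prune_backups() {
    local storage="$1"
    local keep="${2:-3}"
    local total=0
    local entry

    # Count the timestamped backup directories.
    for entry in "$storage"/[0-9]*_*; do
        if [ -d "$entry" ]; then
            total=$((total + 1))
        fi
    done

    # Glob results are sorted by name, so the oldest timestamps come first.
    for entry in "$storage"/[0-9]*_*; do
        [ -d "$entry" ] || continue
        [ "$total" -gt "$keep" ] || break
        rm -rf -- "$entry"
        total=$((total - 1))
    done
    return 0
}
```

For example, `prune_backups /var/lib/microshift-auto-recovery 5` keeps the five newest backups. Run a step like this after each successful backup rather than before a restore, so that a recent known-good backup is always available.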
65+
66+
.Verification
67+
68+
* Verify that {microshift-short} has started successfully.
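Because the `restore` command does not start {microshift-short}, start the service first, for example with `sudo systemctl start microshift`, and then check its state. The following sketch polls `systemctl is-active` until the unit reports `active` or a timeout expires. The `wait_for_active` function name and the 300-second default are assumptions made for this example. An `active` unit state is only a first check, not proof that every workload is healthy.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait_for_active [unit] [timeout_seconds]
# Polls the systemd unit state once per second until it is active.
wait_for_active() {
    local unit="${1:-microshift}"
    local timeout="${2:-300}"
    local waited=0

    while ! systemctl is-active --quiet "$unit"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "$unit is not active after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "$unit is active"
}
```

For example, `wait_for_active microshift 120` returns a non-zero status if the service is still not active after two minutes.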
