Commit 6223825

Merge pull request #71991 from tmalove/etcd-known-issues-29657-tlove
OCPBUGS#29657: Mention known issues while moving etcd to a different disk
2 parents 1c39ca0 + 2fe5873 commit 6223825

1 file changed

+121
-140
lines changed

modules/move-etcd-different-disk.adoc

Lines changed: 121 additions & 140 deletions
@@ -12,18 +12,32 @@ The Machine Config Operator (MCO) is responsible for mounting a secondary disk f
1212

1313
[NOTE]
1414
====
15-
This procedure does not move parts of the root file system, such as `/var/`, to another disk or partition on an installed node.
15+
This encoded script only supports device names for the following device types:
16+
17+
SCSI or SATA:: `/dev/sd*`
18+
Virtual device:: `/dev/vd*`
19+
NVMe:: `/dev/nvme*[0-9]\*n*`
1620
====
1721

22+
.Limitations
23+
24+
* When the new disk is attached to the cluster, the etcd database is part of the root mount. It is not part of the secondary disk or the intended disk when the primary node is recreated. As a result, the primary node will not create a separate `/var/lib/etcd` mount.
25+
1826
.Prerequisites
1927

2028
* You have installed the {oc-first}.
2129
* You have access to the cluster with `cluster-admin` privileges.
30+
* Add additional disks before uploading the machine configuration.
2231
* The `MachineConfigPool` must match `metadata.labels[machineconfiguration.openshift.io/role]`. This applies to a controller, worker, or a custom pool.
2332
33+
[NOTE]
34+
====
35+
This procedure does not move parts of the root file system, such as `/var/`, to another disk or partition on an installed node.
36+
====
37+
2438
.Procedure
2539

26-
. Attach the new disk to the cluster and verify that the disk is detected in the node by using the `lsblk` command in a debug shell:
40+
. Attach the new disk to the cluster and verify that the disk is detected in the node by running the `lsblk` command in a debug shell:
2741
+
2842
[source,terminal]
2943
----
@@ -37,116 +51,29 @@ $ oc debug node/<node_name>
3751
+
3852
Note the device name of the new disk reported by the `lsblk` command.
3953

40-
. Create a `MachineConfig` YAML file named `etcd-mc.yml` with contents such as the following, replacing instances of `<new_disk_name>` with the noted device name:
41-
+
42-
[source,yaml]
43-
----
44-
apiVersion: machineconfiguration.openshift.io/v1
45-
kind: MachineConfig
46-
metadata:
47-
labels:
48-
machineconfiguration.openshift.io/role: master
49-
name: 98-var-lib-etcd
50-
spec:
51-
config:
52-
ignition:
53-
version: 3.4.0
54-
systemd:
55-
units:
56-
- contents: |
57-
[Unit]
58-
Description=Make File System on /dev/<new_disk_name>
59-
DefaultDependencies=no
60-
BindsTo=dev-<new_disk_name>.device
61-
After=dev-<new_disk_name>.device var.mount
62-
Before=systemd-fsck@dev-<new_disk_name>.service
63-
64-
[Service]
65-
Type=oneshot
66-
RemainAfterExit=yes
67-
ExecStart=/usr/lib/systemd/systemd-makfs.xfs -f /dev/<new_disk_name>
68-
TimeoutSec=0
69-
70-
[Install]
71-
WantedBy=var-lib-containers.mount
72-
enabled: true
73-
name: systemd-mkfs@dev-<new_disk_name>.service
74-
- contents: |
75-
[Unit]
76-
Description=Mount /dev/<new_disk_name> to /var/lib/etcd
77-
Before=local-fs.target
78-
Requires=systemd-mkfs@dev-<new_disk_name>.service
79-
After=systemd-mkfs@dev-<new_disk_name>.service var.mount
80-
81-
[Mount]
82-
What=/dev/<new_disk_name>
83-
Where=/var/lib/etcd
84-
Type=xfs
85-
Options=defaults,prjquota
86-
87-
[Install]
88-
WantedBy=local-fs.target
89-
enabled: true
90-
name: var-lib-etcd.mount
91-
- contents: |
92-
[Unit]
93-
Description=Sync etcd data if new mount is empty
94-
DefaultDependencies=no
95-
After=var-lib-etcd.mount var.mount
96-
Before=crio.service
97-
98-
[Service]
99-
Type=oneshot
100-
RemainAfterExit=yes
101-
ExecCondition=/usr/bin/test ! -d /var/lib/etcd/member
102-
ExecStart=semanage fcontext -a -e /sysroot/ostree/deploy/rhcos/var/lib/etcd/ /var/lib/etcd/
103-
ExecStart=/bin/rsync -ar /sysroot/ostree/deploy/rhcos/var/lib/etcd/ /var/lib/etcd/
104-
TimeoutSec=0
105-
106-
[Install]
107-
WantedBy=multi-user.target graphical.target
108-
enabled: true
109-
name: sync-var-lib-etcd-to-etcd.service
110-
- contents: |
111-
[Unit]
112-
Description=Restore recursive SELinux security contexts
113-
DefaultDependencies=no
114-
After=var-lib-etcd.mount
115-
Before=crio.service
116-
117-
[Service]
118-
Type=oneshot
119-
RemainAfterExit=yes
120-
ExecStart=/sbin/restorecon -R /var/lib/etcd/
121-
TimeoutSec=0
122-
123-
[Install]
124-
WantedBy=multi-user.target graphical.target
125-
enabled: true
126-
name: restorecon-var-lib-etcd.service
127-
----
128-
129-
. Log in to the cluster as a user with `cluster-admin` privileges and create the machine configuration:
54+
. Decode and replace the device name in the script according to your environment.
13055
+
131-
[source,terminal]
132-
----
133-
$ oc login -u <username> -p <password>
134-
----
135-
+
136-
[source,terminal]
137-
----
138-
$ oc create -f etcd-mc.yml
139-
----
140-
+
141-
The nodes are updated and rebooted. After the reboot completes, the following events occur:
142-
+
143-
* An XFS file system is created on the specified disk.
144-
* The disk mounts to `/var/lib/etcd`.
145-
* The content from `/sysroot/ostree/deploy/rhcos/var/lib/etcd` syncs to `/var/lib/etcd`.
146-
* A restore of `SELinux` labels is forced for `/var/lib/etcd`.
147-
* The old content is not removed.
148-
149-
. After the nodes are on a separate disk, update the `etcd-mc.yml` file with contents such as the following, replacing instances of `<new_disk_name>` with the noted device name:
56+
[source,bash]
57+
----
58+
#!/bin/bash
59+
set -uo pipefail
60+
61+
for device in <device_type_glob>; do # <1>
62+
/usr/sbin/blkid $device &> /dev/null
63+
if [ $? == 2 ]; then
64+
echo "secondary device found $device"
65+
echo "creating filesystem for etcd mount"
66+
mkfs.xfs -L var-lib-etcd -f $device &> /dev/null
67+
udevadm settle
68+
touch /etc/var-lib-etcd-mount
69+
exit
70+
fi
71+
done
72+
echo "Couldn't find secondary block device!" >&2
73+
exit 77
74+
----
75+
<1> Replace `<device_type_glob>` with a shell glob for your block device type. For SCSI or SATA drives, use `/dev/sd*`; for virtual drives, use `/dev/vd*`; for NVMe drives, use `/dev/nvme*[0-9]\*n*`.
76+
. Create a `MachineConfig` YAML file named `etcd-mc.yml` with contents such as the following:
15077
+
15178
[source,yaml]
15279
----
@@ -159,37 +86,91 @@ metadata:
15986
spec:
16087
config:
16188
ignition:
162-
version: 3.4.0
89+
version: 3.1.0
90+
storage:
91+
files:
92+
- path: /etc/find-secondary-device
93+
mode: 0755
94+
contents:
95+
source: data:text/plain;charset=utf-8;base64,<encoded_etc_find_secondary_device_script> # <1>
16396
systemd:
16497
units:
165-
- contents: |
166-
[Unit]
167-
Description=Mount /dev/<new_disk_name> to /var/lib/etcd
168-
Before=local-fs.target
169-
After=var.mount
170-
171-
[Mount]
172-
What=/dev/<new_disk_name>
173-
Where=/var/lib/etcd
174-
Type=xfs
175-
Options=defaults,prjquota
176-
177-
[Install]
178-
WantedBy=local-fs.target
179-
enabled: true
180-
name: var-lib-etcd.mount
181-
----
182-
183-
. Apply the modified version that removes the logic for creating and syncing the device to prevent the nodes from rebooting:
184-
+
185-
[source,terminal]
186-
----
187-
$ oc replace -f etcd-mc.yml
188-
----
98+
- name: find-secondary-device.service
99+
enabled: true
100+
contents: |
101+
[Unit]
102+
Description=Find secondary device
103+
DefaultDependencies=false
104+
After=systemd-udev-settle.service
105+
Before=local-fs-pre.target
106+
ConditionPathExists=!/etc/var-lib-etcd-mount
107+
108+
[Service]
109+
RemainAfterExit=yes
110+
ExecStart=/etc/find-secondary-device
111+
112+
RestartForceExitStatus=77
113+
114+
[Install]
115+
WantedBy=multi-user.target
116+
- name: var-lib-etcd.mount
117+
enabled: true
118+
contents: |
119+
[Unit]
120+
Before=local-fs.target
121+
122+
[Mount]
123+
What=/dev/disk/by-label/var-lib-etcd
124+
Where=/var/lib/etcd
125+
Type=xfs
126+
TimeoutSec=120s
127+
128+
[Install]
129+
RequiredBy=local-fs.target
130+
- name: sync-var-lib-etcd-to-etcd.service
131+
enabled: true
132+
contents: |
133+
[Unit]
134+
Description=Sync etcd data if new mount is empty
135+
DefaultDependencies=no
136+
After=var-lib-etcd.mount var.mount
137+
Before=crio.service
138+
139+
[Service]
140+
Type=oneshot
141+
RemainAfterExit=yes
142+
ExecCondition=/usr/bin/test ! -d /var/lib/etcd/member
143+
ExecStart=/usr/sbin/setsebool -P rsync_full_access 1
144+
ExecStart=/bin/rsync -ar /sysroot/ostree/deploy/rhcos/var/lib/etcd/ /var/lib/etcd/
145+
ExecStart=/usr/sbin/semanage fcontext -a -t container_var_lib_t '/var/lib/etcd(/.*)?'
146+
ExecStart=/usr/sbin/setsebool -P rsync_full_access 0
147+
TimeoutSec=0
148+
149+
[Install]
150+
WantedBy=multi-user.target graphical.target
151+
- name: restorecon-var-lib-etcd.service
152+
enabled: true
153+
contents: |
154+
[Unit]
155+
Description=Restore recursive SELinux security contexts
156+
DefaultDependencies=no
157+
After=var-lib-etcd.mount
158+
Before=crio.service
159+
160+
[Service]
161+
Type=oneshot
162+
RemainAfterExit=yes
163+
ExecStart=/sbin/restorecon -R /var/lib/etcd/
164+
TimeoutSec=0
165+
166+
[Install]
167+
WantedBy=multi-user.target graphical.target
168+
----
169+
<1> Replace `<encoded_etc_find_secondary_device_script>` with the base64-encoded string of the script that you prepared in the previous step.
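
The `source` field in the `MachineConfig` above takes the script as a base64 data URL. The following is a minimal sketch of producing that string. The temporary file and the stand-in script body are assumptions made for illustration; the source does not name the local script file, so substitute your own edited script.

```shell
#!/bin/bash
set -euo pipefail

# Assumption: the edited script lives in a local file. A temp file is used
# here only so that the sketch is self-contained.
script_file="$(mktemp)"

# Stand-in body for illustration only; use your edited script instead.
cat > "$script_file" <<'EOF'
#!/bin/bash
echo "placeholder for the find-secondary-device script"
EOF

# Encode as a single line; tr removes the line wrapping that some
# base64 implementations add by default.
encoded="$(base64 < "$script_file" | tr -d '\n')"

# Candidate value for spec.config.storage.files[].contents.source.
data_url="data:text/plain;charset=utf-8;base64,${encoded}"
echo "$data_url"

# Round-trip check: decoding should give back the original script text.
decoded="$(printf '%s' "$encoded" | base64 --decode)"
```

Comparing `decoded` against the original file before pasting the string into `etcd-mc.yml` is a cheap way to catch a truncated or wrapped encoding.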
189170
190171
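
The device-detection script in the procedure treats a `blkid` exit status of 2 (no identifiable signature found) as "this device is blank" and then formats it. Because that step is destructive, a dry run can be useful first. The sketch below is one hedged way to do that: `first_unformatted` and `fake_probe` are hypothetical helper names, not part of the commit, and the probe command is passed in so that the logic can be exercised without touching real disks.

```shell
#!/bin/bash
set -uo pipefail

# first_unformatted PROBE DEVICE...
# Prints the first device for which PROBE exits with status 2, the status
# that the procedure's script interprets as "no filesystem on this device".
# Returns non-zero if no device qualifies.
first_unformatted() {
  local probe="$1"; shift
  local device status
  for device in "$@"; do
    "$probe" "$device" &> /dev/null
    status=$?
    if [ "$status" -eq 2 ]; then
      printf '%s\n' "$device"
      return 0
    fi
  done
  return 1
}

# Fake probe for demonstration: pretends that only /dev/sdb is blank.
fake_probe() {
  [ "$1" = "/dev/sdb" ] && return 2
  return 0
}

candidate="$(first_unformatted fake_probe /dev/sda /dev/sdb /dev/sdc)"
echo "would format: ${candidate}"
```

In a debug shell on the node, the same function could be called as `first_unformatted /usr/sbin/blkid /dev/sd*` to see which device the script would pick, without running `mkfs.xfs`.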
.Verification steps
191172
192-
* Run the `grep <new_disk_name> /proc/mounts` command in a debug shell for the node to ensure that the disk mounted:
173+
* Run the `grep /var/lib/etcd /proc/mounts` command in a debug shell for the node to ensure that the disk is mounted:
193174
+
194175
[source,terminal]
195176
----
@@ -198,12 +179,12 @@ $ oc debug node/<node_name>
198179
+
199180
[source,terminal]
200181
----
201-
# grep <new_disk_name> /proc/mounts
182+
# grep -w "/var/lib/etcd" /proc/mounts
202183
----
203184
+
204185
.Example output
205186
+
206187
[source,terminal]
207188
----
208-
/dev/nvme1n1 /var/lib/etcd xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,prjquota 0 0
209-
----
189+
/dev/sdb /var/lib/etcd xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
190+
----
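
The verification step can also be scripted. This is a rough sketch, assuming a hypothetical helper named `mount_source` that reads a table in `/proc/mounts` format. The sample line is the example output from the diff, written to a temporary file so that the snippet runs outside a cluster node.

```shell
#!/bin/bash
set -euo pipefail

# mount_source MOUNTPOINT [MOUNTS_FILE]
# Prints the device mounted at MOUNTPOINT (second field of each entry).
# Returns non-zero if no entry matches. Defaults to /proc/mounts.
mount_source() {
  local mountpoint="$1" mounts_file="${2:-/proc/mounts}"
  awk -v mp="$mountpoint" \
    '$2 == mp { print $1; found = 1 } END { exit !found }' "$mounts_file"
}

# Sample mounts table built from the example output in the diff.
sample="$(mktemp)"
printf '%s\n' \
  '/dev/sdb /var/lib/etcd xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0' \
  > "$sample"

etcd_device="$(mount_source /var/lib/etcd "$sample")"
echo "etcd is mounted from ${etcd_device}"
```

Matching on the mount point field rather than grepping the whole line avoids false positives from paths that merely contain `/var/lib/etcd` as a prefix.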
