Commit 38fa675

OCPBUGS#16663: Remove provisioning with new-master-machine.yaml
New master nodes are now automatically provisioned, making it unnecessary to manually create new machines.
1 parent f06721a commit 38fa675
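
The shortened procedure that this commit documents can be sketched as follows. This is a minimal illustration and not part of the commit. The two `oc` commands in the usage comment are the ones shown in the modified modules, and the machine name is the example used in the diff. The helper name `list_master_machines` is hypothetical, and it assumes the column layout of the example output in the modules (NAME first, PHASE second) and machine names that contain `-master-`.

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): read `oc get machines` output on
# stdin and print NAME and PHASE for each machine whose name contains "-master-".
# Assumes a header row followed by whitespace-separated columns.
list_master_machines() {
  awk 'NR > 1 && $1 ~ /-master-/ { print $1, $2 }'
}

# Usage against a live cluster (requires cluster-admin access):
#   oc delete machine -n openshift-machine-api clustername-8qw5l-master-0
#   oc get machines -n openshift-machine-api -o wide | list_master_machines
```

After the deletion, the replacement machine is expected to appear in this listing without any `oc apply -f new-master-machine.yaml` step. On bare-metal clusters that use `control-plane` in machine names, the match pattern would need to be adjusted.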

3 files changed: 8 additions and 320 deletions

modules/dr-restoring-cluster-state.adoc

Lines changed: 2 additions & 110 deletions
@@ -465,125 +465,17 @@ clustername-8qw5l-worker-us-east-1c-pkg26 Running m4.large us-east-1 us
465465
----
466466
<1> This is the control plane machine for the lost control plane host, `ip-10-0-131-183.ec2.internal`.
467467

468-
.. Save the machine configuration to a file on your file system by running:
469-
+
470-
[source,terminal]
471-
----
472-
$ oc get machine clustername-8qw5l-master-0 \ <1>
473-
-n openshift-machine-api \
474-
-o yaml \
475-
> new-master-machine.yaml
476-
----
477-
<1> Specify the name of the control plane machine for the lost control plane host.
478-
479-
.. Edit the `new-master-machine.yaml` file that was created in the previous step to assign a new name and remove unnecessary fields.
480-
481-
... Remove the entire `status` section by running:
482-
+
483-
[source,terminal]
484-
----
485-
status:
486-
addresses:
487-
- address: 10.0.131.183
488-
type: InternalIP
489-
- address: ip-10-0-131-183.ec2.internal
490-
type: InternalDNS
491-
- address: ip-10-0-131-183.ec2.internal
492-
type: Hostname
493-
lastUpdated: "2020-04-20T17:44:29Z"
494-
nodeRef:
495-
kind: Node
496-
name: ip-10-0-131-183.ec2.internal
497-
uid: acca4411-af0d-4387-b73e-52b2484295ad
498-
phase: Running
499-
providerStatus:
500-
apiVersion: awsproviderconfig.openshift.io/v1beta1
501-
conditions:
502-
- lastProbeTime: "2020-04-20T16:53:50Z"
503-
lastTransitionTime: "2020-04-20T16:53:50Z"
504-
message: machine successfully created
505-
reason: MachineCreationSucceeded
506-
status: "True"
507-
type: MachineCreation
508-
instanceId: i-0fdb85790d76d0c3f
509-
instanceState: stopped
510-
kind: AWSMachineProviderStatus
511-
----
512-
513-
... Change the `metadata.name` field to a new name by running:
514-
+
515-
It is recommended to keep the same base name as the old machine and change the ending number to the next available number. In this example, `clustername-8qw5l-master-0` is changed to `clustername-8qw5l-master-3`:
516-
+
517-
[source,terminal]
518-
----
519-
apiVersion: machine.openshift.io/v1beta1
520-
kind: Machine
521-
metadata:
522-
...
523-
name: clustername-8qw5l-master-3
524-
...
525-
----
526-
527-
... Remove the `spec.providerID` field by running:
528-
+
529-
[source,terminal]
530-
----
531-
providerID: aws:///us-east-1a/i-0fdb85790d76d0c3f
532-
----
533-
534-
... Remove the `metadata.annotations` and `metadata.generation` fields by running:
535-
+
536-
[source,terminal]
537-
----
538-
annotations:
539-
machine.openshift.io/instance-state: running
540-
...
541-
generation: 2
542-
----
543-
544-
... Remove the `metadata.resourceVersion` and `metadata.uid` fields by running:
545-
+
546-
[source,terminal]
547-
----
548-
resourceVersion: "13291"
549-
uid: a282eb70-40a2-4e89-8009-d05dd420d31a
550-
----
551-
552468
.. Delete the machine of the lost control plane host by running:
553469
+
554470
[source,terminal]
555471
----
556472
$ oc delete machine -n openshift-machine-api clustername-8qw5l-master-0 <1>
557473
----
558474
<1> Specify the name of the control plane machine for the lost control plane host.
559-
560-
.. Verify that the machine was deleted by running:
561-
+
562-
[source,terminal]
563-
----
564-
$ oc get machines -n openshift-machine-api -o wide
565-
----
566-
+
567-
Example output:
568475
+
569-
[source,terminal]
570-
----
571-
NAME PHASE TYPE REGION ZONE AGE NODE PROVIDERID STATE
572-
clustername-8qw5l-master-1 Running m4.xlarge us-east-1 us-east-1b 3h37m ip-10-0-143-125.ec2.internal aws:///us-east-1b/i-096c349b700a19631 running
573-
clustername-8qw5l-master-2 Running m4.xlarge us-east-1 us-east-1c 3h37m ip-10-0-154-194.ec2.internal aws:///us-east-1c/i-02626f1dba9ed5bba running
574-
clustername-8qw5l-worker-us-east-1a-wbtgd Running m4.large us-east-1 us-east-1a 3h28m ip-10-0-129-226.ec2.internal aws:///us-east-1a/i-010ef6279b4662ced running
575-
clustername-8qw5l-worker-us-east-1b-lrdxb Running m4.large us-east-1 us-east-1b 3h28m ip-10-0-144-248.ec2.internal aws:///us-east-1b/i-0cb45ac45a166173b running
576-
clustername-8qw5l-worker-us-east-1c-pkg26 Running m4.large us-east-1 us-east-1c 3h28m ip-10-0-170-181.ec2.internal aws:///us-east-1c/i-06861c00007751b0a running
577-
----
578-
579-
.. Create a machine by using the `new-master-machine.yaml` file by running:
580-
+
581-
[source,terminal]
582-
----
583-
$ oc apply -f new-master-machine.yaml
584-
----
476+
A new machine is automatically provisioned after deleting the machine of the lost control plane host.
585477

586-
.. Verify that the new machine has been created by running:
478+
.. Verify that a new machine has been created by running:
587479
+
588480
[source,terminal]
589481
----

modules/restore-replace-stopped-baremetal-etcd-member.adoc

Lines changed: 4 additions & 117 deletions
@@ -173,11 +173,7 @@ $ oc delete secret etcd-serving-openshift-control-plane-2 -n openshift-etcd
173173
secret "etcd-serving-openshift-control-plane-2" deleted
174174
----
175175

176-
. Delete the control plane machine.
177-
+
178-
If you are running installer-provisioned infrastructure, or you used the Machine API to create your machines, follow these steps. Otherwise, you must create the new control plane node using the same method that was used to originally create it.
179-
180-
.. Obtain the machine for the unhealthy member.
176+
. Obtain the machine for the unhealthy member.
181177
+
182178
In a terminal that has access to the cluster as a `cluster-admin` user, run the following command:
183179
+
@@ -198,110 +194,6 @@ examplecluster-compute-1 Running 165m opens
198194
----
199195
<1> This is the control plane machine for the unhealthy node, `examplecluster-control-plane-2`.
200196

201-
.. Save the machine configuration to a file on your file system:
202-
+
203-
[source,terminal]
204-
----
205-
$ oc get machine examplecluster-control-plane-2 \ <1>
206-
-n openshift-machine-api \
207-
-o yaml \
208-
> new-master-machine.yaml
209-
----
210-
<1> Specify the name of the control plane machine for the unhealthy node.
211-
212-
.. Edit the `new-master-machine.yaml` file that was created in the previous step to assign a new name and remove unnecessary fields.
213-
214-
... Remove the entire `status` section:
215-
+
216-
[source,yaml]
217-
----
218-
status:
219-
addresses:
220-
- address: ""
221-
type: InternalIP
222-
- address: fe80::4adf:37ff:feb0:8aa1%ens1f1.373
223-
type: InternalDNS
224-
- address: fe80::4adf:37ff:feb0:8aa1%ens1f1.371
225-
type: Hostname
226-
lastUpdated: "2020-04-20T17:44:29Z"
227-
nodeRef:
228-
kind: Machine
229-
name: fe80::4adf:37ff:feb0:8aa1%ens1f1.372
230-
uid: acca4411-af0d-4387-b73e-52b2484295ad
231-
phase: Running
232-
providerStatus:
233-
apiVersion: machine.openshift.io/v1beta1
234-
conditions:
235-
- lastProbeTime: "2020-04-20T16:53:50Z"
236-
lastTransitionTime: "2020-04-20T16:53:50Z"
237-
message: machine successfully created
238-
reason: MachineCreationSucceeded
239-
status: "True"
240-
type: MachineCreation
241-
instanceId: i-0fdb85790d76d0c3f
242-
instanceState: stopped
243-
kind: Machine
244-
----
245-
246-
. Change the `metadata.name` field to a new name.
247-
+
248-
It is recommended to keep the same base name as the old machine and change the ending number to the next available number. In this example, `examplecluster-control-plane-2` is changed to `examplecluster-control-plane-3`.
249-
+
250-
For example:
251-
+
252-
[source,yaml]
253-
----
254-
apiVersion: machine.openshift.io/v1beta1
255-
kind: Machine
256-
metadata:
257-
...
258-
name: examplecluster-control-plane-3
259-
...
260-
----
261-
262-
.. Remove the `spec.providerID` field:
263-
+
264-
[source,yaml]
265-
----
266-
providerID: baremetalhost:///openshift-machine-api/openshift-control-plane-2/3354bdac-61d8-410f-be5b-6a395b056135
267-
----
268-
269-
.. Remove the `metadata.annotations` and `metadata.generation` fields:
270-
+
271-
[source,yaml]
272-
----
273-
annotations:
274-
machine.openshift.io/instance-state: externally provisioned
275-
...
276-
generation: 2
277-
----
278-
279-
.. Remove the `spec.conditions`, `spec.lastUpdated`, `spec.nodeRef` and `spec.phase` fields:
280-
+
281-
[source,yaml]
282-
----
283-
lastTransitionTime: "2022-08-03T08:40:36Z"
284-
message: 'Drain operation currently blocked by: [{Name:EtcdQuorumOperator Owner:clusteroperator/etcd}]'
285-
reason: HookPresent
286-
severity: Warning
287-
status: "False"
288-
289-
type: Drainable
290-
lastTransitionTime: "2022-08-03T08:39:55Z"
291-
status: "True"
292-
type: InstanceExists
293-
294-
lastTransitionTime: "2022-08-03T08:36:37Z"
295-
status: "True"
296-
type: Terminable
297-
lastUpdated: "2022-08-03T08:40:36Z"
298-
nodeRef:
299-
kind: Node
300-
name: openshift-control-plane-2
301-
uid: 788df282-6507-4ea2-9a43-24f237ccbc3c
302-
phase: Running
303-
----
304-
305197
. Ensure that the Bare Metal Operator is available by running the following command:
306198
+
307199
[source,terminal]
@@ -345,6 +237,8 @@ If deletion of the machine is delayed for any reason or the command is obstructe
345237
Do not interrupt machine deletion by pressing `Ctrl+c`. You must allow the command to proceed to completion. Open a new terminal window to edit and delete the finalizer fields.
346238
====
347239
+
240+
A new machine is automatically provisioned after deleting the machine of the unhealthy member.
241+
+
348242
.. Edit the machine configuration by running the following command:
349243
+
350244
[source,terminal]
@@ -463,16 +357,9 @@ openshift-control-plane-2 available examplecluster-control-plane-3
463357
openshift-compute-0 provisioned examplecluster-compute-0 true 4h48m
464358
openshift-compute-1 provisioned examplecluster-compute-1 true 4h48m
465359
----
466-
+
467-
.. Create the new control plane machine using the `new-master-machine.yaml` file:
468-
+
469-
[source,terminal]
470-
----
471-
$ oc apply -f new-master-machine.yaml
472-
----
473360

474361

475-
.. Verify that the new machine has been created:
362+
.. Verify that a new machine has been created:
476363
+
477364
[source,terminal]
478365
----

modules/restore-replace-stopped-etcd-member.adoc

Lines changed: 2 additions & 93 deletions
@@ -210,108 +210,17 @@ clustername-8qw5l-worker-us-east-1c-pkg26 Running m4.large us-east-1 us
210210
----
211211
<1> This is the control plane machine for the unhealthy node, `ip-10-0-131-183.ec2.internal`.
212212

213-
.. Save the machine configuration to a file on your file system:
214-
+
215-
[source,terminal]
216-
----
217-
$ oc get machine clustername-8qw5l-master-0 \ <1>
218-
-n openshift-machine-api \
219-
-o yaml \
220-
> new-master-machine.yaml
221-
----
222-
<1> Specify the name of the control plane machine for the unhealthy node.
223-
224-
.. Edit the `new-master-machine.yaml` file that was created in the previous step to assign a new name and remove unnecessary fields.
225-
226-
... Remove the entire `status` section:
227-
+
228-
[source,yaml]
229-
----
230-
status:
231-
addresses:
232-
- address: 10.0.131.183
233-
type: InternalIP
234-
- address: ip-10-0-131-183.ec2.internal
235-
type: InternalDNS
236-
- address: ip-10-0-131-183.ec2.internal
237-
type: Hostname
238-
lastUpdated: "2020-04-20T17:44:29Z"
239-
nodeRef:
240-
kind: Node
241-
name: ip-10-0-131-183.ec2.internal
242-
uid: acca4411-af0d-4387-b73e-52b2484295ad
243-
phase: Running
244-
providerStatus:
245-
apiVersion: awsproviderconfig.openshift.io/v1beta1
246-
conditions:
247-
- lastProbeTime: "2020-04-20T16:53:50Z"
248-
lastTransitionTime: "2020-04-20T16:53:50Z"
249-
message: machine successfully created
250-
reason: MachineCreationSucceeded
251-
status: "True"
252-
type: MachineCreation
253-
instanceId: i-0fdb85790d76d0c3f
254-
instanceState: stopped
255-
kind: AWSMachineProviderStatus
256-
----
257-
258-
... Change the `metadata.name` field to a new name.
259-
+
260-
It is recommended to keep the same base name as the old machine and change the ending number to the next available number. In this example, `clustername-8qw5l-master-0` is changed to `clustername-8qw5l-master-3`.
261-
+
262-
For example:
263-
+
264-
[source,yaml]
265-
----
266-
apiVersion: machine.openshift.io/v1beta1
267-
kind: Machine
268-
metadata:
269-
...
270-
name: clustername-8qw5l-master-3
271-
...
272-
----
273-
274-
... Remove the `spec.providerID` field:
275-
+
276-
[source,yaml]
277-
----
278-
providerID: aws:///us-east-1a/i-0fdb85790d76d0c3f
279-
----
280-
281213
.. Delete the machine of the unhealthy member:
282214
+
283215
[source,terminal]
284216
----
285217
$ oc delete machine -n openshift-machine-api clustername-8qw5l-master-0 <1>
286218
----
287219
<1> Specify the name of the control plane machine for the unhealthy node.
288-
289-
.. Verify that the machine was deleted:
290-
+
291-
[source,terminal]
292-
----
293-
$ oc get machines -n openshift-machine-api -o wide
294-
----
295220
+
296-
.Example output
297-
[source,terminal]
298-
----
299-
NAME PHASE TYPE REGION ZONE AGE NODE PROVIDERID STATE
300-
clustername-8qw5l-master-1 Running m4.xlarge us-east-1 us-east-1b 3h37m ip-10-0-154-204.ec2.internal aws:///us-east-1b/i-096c349b700a19631 running
301-
clustername-8qw5l-master-2 Running m4.xlarge us-east-1 us-east-1c 3h37m ip-10-0-164-97.ec2.internal aws:///us-east-1c/i-02626f1dba9ed5bba running
302-
clustername-8qw5l-worker-us-east-1a-wbtgd Running m4.large us-east-1 us-east-1a 3h28m ip-10-0-129-226.ec2.internal aws:///us-east-1a/i-010ef6279b4662ced running
303-
clustername-8qw5l-worker-us-east-1b-lrdxb Running m4.large us-east-1 us-east-1b 3h28m ip-10-0-144-248.ec2.internal aws:///us-east-1b/i-0cb45ac45a166173b running
304-
clustername-8qw5l-worker-us-east-1c-pkg26 Running m4.large us-east-1 us-east-1c 3h28m ip-10-0-170-181.ec2.internal aws:///us-east-1c/i-06861c00007751b0a running
305-
----
306-
307-
.. Create the new machine using the `new-master-machine.yaml` file:
308-
+
309-
[source,terminal]
310-
----
311-
$ oc apply -f new-master-machine.yaml
312-
----
221+
A new machine is automatically provisioned after deleting the machine of the unhealthy member.
313222

314-
.. Verify that the new machine has been created:
223+
.. Verify that a new machine has been created:
315224
+
316225
[source,terminal]
317226
----
