
Commit 58b754a

Nodes edits
1 parent 0a38a53 commit 58b754a

20 files changed, +88 -96 lines changed

_topic_map.yml

Lines changed: 2 additions & 4 deletions
@@ -392,14 +392,14 @@ Topics:
   File: nodes-scheduler-default
 - Name: Placing pods relative to other pods using pod affinity/anti-affinity rules
   File: nodes-scheduler-pod-affinity
+- Name: Controlling pod placement on nodes using node affinity rules
+  File: nodes-scheduler-node-affinity
 - Name: Placing a pod on a specific node by name
   File: nodes-scheduler-node-names
 - Name: Placing a pod in a specific project
   File: nodes-scheduler-node-projects
 - Name: Placing pods onto overcommited nodes
   File: nodes-scheduler-overcommit
-- Name: Controlling pod placement on nodes using node affinity rules
-  File: nodes-scheduler-node-affinity
 - Name: Controlling pod placement using node taints
   File: nodes-scheduler-taints-tolerations
 - Name: Constraining pod placement using node selectors
@@ -482,8 +482,6 @@ Topics:
   File: efk-logging
 - Name: Deploying cluster logging
   File: efk-logging-deploy
-- Name: Viewing the Kibana interface
-  File: efk-logging-kibana-interface
 - Name: Changing cluster logging management state
   File: efk-logging-management
 - Name: Configuring cluster logging

modules/nodes-cluster-overcommit-configure-nodes.adoc

Lines changed: 26 additions & 5 deletions
@@ -12,17 +12,38 @@ When the node starts, it ensures that the kernel tunable flags for memory
 management are set properly. The kernel should never fail memory allocations
 unless it runs out of physical memory.
 
-To ensure this behavior, the node instructs the kernel to always overcommit
-memory:
+In an overcommitted environment, it is important to properly configure your node to provide best system behavior.
+
+When the node starts, it ensures that the kernel tunable flags for memory
+management are set properly. The kernel should never fail memory allocations
+unless it runs out of physical memory.
+
+To ensure this behavior, {product-title} configures the kernel to always overcommit
+memory by setting the `vm.overcommit_memory` parameter to `1`, overriding the
+default operating system setting.
+
+{product-title} also configures the kernel not to panic when it runs out of memory
+by setting the `vm.panic_on_oom` parameter to `0`. A setting of 0 instructs the
+kernel to call oom_killer in an Out of Memory (OOM) condition, which kills
+processes based on priority
+
+You can view the current setting by running the following commands on your node:
 
 ----
-$ sysctl -w vm.overcommit_memory=1
+$ sysctl -a |grep commit
+
+vm.overcommit_memory = 0
 ----
 
-The node also instructs the kernel not to panic when it runs out of memory.
-Instead, the kernel OOM killer should kill processes based on priority:
+----
+$ sysctl -a |grep panic
+vm.panic_on_oom = 0
+----
+
+You can change these settings using:
 
 ----
+$ sysctl -w vm.overcommit_memory=1
 $ sysctl -w vm.panic_on_oom=0
 ----
 
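Illustrative check, not part of the commit: on a node that has already been configured as the added text describes, the two tunables would report the values shown below. The `sysctl -n` form, which prints only the values, is an assumption about how you might verify both settings at once.

----
# Expected values on a node configured as described above.
$ sysctl -n vm.overcommit_memory vm.panic_on_oom
1
0
----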

modules/nodes-cluster-overcommit-master-disabling-swap.adoc

Lines changed: 0 additions & 22 deletions
This file was deleted.

modules/nodes-cluster-resource-levels-command.adoc

Lines changed: 8 additions & 7 deletions
@@ -5,18 +5,19 @@
 [id="nodes-cluster-resource-levels-command-{context}"]
 = Running the cluster capacity tool on the command line
 
-You can run the {product-title} capacity tool from the command line
+You can run the {product-title} cluster capacity tool from the command line
 to estimate the number of pods that can be scheduled onto your cluster.
 
 .Prerequisites
 
-A sample pod specification file, which the tool
-uses for estimating resource usage. The `podspec` specifies its resource
+* Download and install link:https://github.com/kubernetes-incubator/cluster-capacity[the *cluster-capacity* tool].
+
+* Create a sample pod specification file, which the tool uses for estimating resource usage. The `podspec` specifies its resource
 requirements as `limits` or `requests`. The cluster capacity tool takes the
 pod's resource requirements into account for its estimation analysis.
-
++
 An example of the pod specification input is:
-
++
 [source,yaml]
 ----
 apiVersion: v1
@@ -48,7 +49,7 @@ To run the tool on the command line:
 . Run the following command:
 +
 ----
-$ cluster-capacity --kubeconfig <path-to-kubeconfig> \ <1>
+$ ./cluster-capacity --kubeconfig <path-to-kubeconfig> \ <1>
   --podspec <path-to-pod-spec> <2>
 ----
 <1> Specify the path to your Kubernetes configuration file.
@@ -58,7 +59,7 @@ You can also add the `--verbose` option to output a detailed description of how
 many pods can be scheduled on each node in the cluster:
 +
 ----
-$ cluster-capacity --kubeconfig <path-to-kubeconfig> \
+$ ./cluster-capacity --kubeconfig <path-to-kubeconfig> \
+  --podspec <path-to-pod-spec> --verbose
 ----
 
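As an illustrative sketch of the workflow this hunk documents (not part of the commit): the `--kubeconfig`, `--podspec`, and `--verbose` options come from the diff, while the file name `pod.yaml`, the kubeconfig path, and the pod name are hypothetical.

----
# Write a minimal pod specification that declares resource requests and limits.
$ cat <<EOF > pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: capacity-probe
spec:
  containers:
  - name: probe
    image: gcr.io/google_containers/busybox
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
      limits:
        cpu: 200m
        memory: 200Mi
EOF

# Estimate how many copies of this pod the cluster could still schedule.
$ ./cluster-capacity --kubeconfig ~/.kube/config --podspec pod.yaml --verbose
----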

modules/nodes-cluster-resource-levels-job.adoc

Lines changed: 4 additions & 0 deletions
@@ -9,6 +9,10 @@ Running the cluster capacity tool as a job inside of a pod has the advantage of
 being able to be run multiple times without needing user intervention. Running
 the cluster capacity tool as a job involves using a `ConfigMap`.
 
+.Prerequisites
+
+Download and install link:https://github.com/kubernetes-incubator/cluster-capacity[the *cluster-capacity* tool].
+
 .Procedure
 
 To run the cluster capacity tool:
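A minimal sketch of the `ConfigMap` step mentioned above, not part of the commit; the ConfigMap name `cluster-capacity-configmap` and the file name `pod.yaml` are assumptions for illustration.

----
# Store the sample pod specification in a ConfigMap so the job can mount it.
$ oc create configmap cluster-capacity-configmap --from-file=pod.yaml

# Confirm the ConfigMap exists before creating the job.
$ oc get configmap cluster-capacity-configmap
----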

modules/nodes-containers-downward-api-container-configmaps.adoc

Lines changed: 2 additions & 2 deletions
@@ -36,7 +36,7 @@ apiVersion: v1
 kind: Pod
 metadata:
   name: dapi-env-test-pod
-spec:bash
+spec:
   containers:
     - name: env-test-container
       image: gcr.io/google_containers/busybox
@@ -47,7 +47,7 @@ spec:bash
           configMapKeyRef:
             name: myconfigmap
             key: mykey
-  restartPolicy: Never
+  restartPolicy: Always
 ----
 
 . Create the pod from the `*_pod.yaml_*` file:
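A hedged illustration of the surrounding procedure, not part of the commit: `myconfigmap`, `mykey`, and `dapi-env-test-pod` come from the example above, while the literal value `myvalue` and the file name `pod.yaml` are assumptions.

----
# Create the ConfigMap that the pod's configMapKeyRef expects.
$ oc create configmap myconfigmap --from-literal=mykey=myvalue

# Create the pod; if the busybox container prints its environment,
# the injected value appears in the pod logs.
$ oc create -f pod.yaml
$ oc logs dapi-env-test-pod
----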

modules/nodes-containers-events-viewing.adoc

Lines changed: 3 additions & 3 deletions
@@ -16,11 +16,11 @@ $ oc get events [-n <project>] <1>
 ----
 <1> The name of the project.
 
-* To view events in your project from the web console.
+* To view events in your project from the {product-title} console.
 +
-. Launch the web console.
+. Launch the {product-title} console.
 +
-. Launch the *Browse* -> *Events* page.
+. Click *Home* -> *Events* and select your project.
 +
 Many other objects, such as pods and deployments, have their own
 *Events* tab as well, which shows events related to that object.
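Two illustrative variations on the `oc get events` command shown in this hunk, not part of the commit; they assume the client supports the standard `kubectl get` flags.

----
# Sort events chronologically in a project.
$ oc get events -n <project> --sort-by=.metadata.creationTimestamp

# Show only warning-type events in a project.
$ oc get events -n <project> --field-selector type=Warning
----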

modules/nodes-nodes-garbage-collection-configuring.adoc

Lines changed: 4 additions & 4 deletions
@@ -56,12 +56,12 @@ spec:
     matchLabels:
       custom-kubelet: small-pods <2>
   kubeletConfig:
-    ImageMinimumGCAge: <3>
-    ImageGCHighThresholdPercent: <4>
-    ImageGCLowThresholdPercent: <5>
+    ImageMinimumGCAge: 0 <3>
+    ImageGCHighThresholdPercent: 85 <4>
+    ImageGCLowThresholdPercent: 80 <5>
 ----
 <1> Assign a name to CR.
 <2> Specify the label to apply the configuration change.
-<3> Specify the minimum age for an unused image before it is garbage collected
+<3> Specify the minimum age for an unused image before it is garbage collected. A value of `0` means no limit.
 <4> Specify the percent of disk usage after which image garbage collection is always run.
 <5> Specify the percent of disk usage before which image garbage collection is never run.
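Illustrative follow-up to the custom resource shown above, not part of the commit; the file name `gc-container.yaml` is a placeholder.

----
# Apply the KubeletConfig custom resource and confirm it was created.
$ oc create -f gc-container.yaml
$ oc get kubeletconfig
----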

modules/nodes-nodes-problem-detector-installing.adoc

Lines changed: 2 additions & 23 deletions
@@ -15,41 +15,20 @@ You can use the {product-title} console to install the Node Problem Detector Ope
 $ oc adm new-project openshift-node-problem-detector --node-selector ""
 ----
 
-. Create an Operator Group:
-
-.. Add the followng code to a YAML file:
-+
-----
-apiVersion: operators.coreos.com/v1alpha2
-kind: OperatorGroup
-metadata:
-  name: npd-operators
-  namespace: openshift-node-problem-detector
-spec:
-  targetNamespaces:
-  - openshift-node-problem-detector
-----
-
-.. Create the Operator Group:
-+
-----
-$ oc create -f -<file-name>.yaml
-----
-
 .Procedure
 
 The process to install the Node Problem Detector involves installing the Node Problem Detector Operator and creating a Node Problem Detector instance.
 
 . In the {product-title} console, click *Catalog* -> *OperatorHub*.
 
+. Choose *Node Problem Detector* from the list of available Operators, and click *Install*.
+
 . On the *Create Operator Subscription* page:
 
 .. Select the `openshift-node-problem-detector` project from the *A specific namespace on the cluster* drop-down list.
 
 .. Click *Subscribe*.
 
-.. Click *Subscribe*.
-
 . On the *Catalog* → *Installed Operators* page, verify that the NodeProblemDetector (CSV) eventually shows up and its *Status* ultimately resolves to *InstallSucceeded*.
 +
 If it does not, switch to the *Catalog* → *Operator Management* page and inspect the *Operator Subscriptions* and *Install Plans* tabs for any failure or errors under *Status*. Then, check the logs in any Pods in the openshift-operators project (on the *Workloads* → *Pods* page) that are reporting issues to troubleshoot further.
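A hedged CLI counterpart to the console verification step above, not part of the commit; it assumes a standard OLM setup where `csv` resolves to ClusterServiceVersion.

----
# Check the Operator's ClusterServiceVersion and pods in the target namespace.
$ oc get csv -n openshift-node-problem-detector
$ oc get pods -n openshift-node-problem-detector
----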

modules/nodes-nodes-rebooting-infrastructure.adoc

Lines changed: 5 additions & 3 deletions
@@ -3,12 +3,14 @@
 // * nodes/nodes-nodes-rebooting.adoc
 
 [id="nodes-nodes-rebooting-infrastructure-{context}"]
-= Understanding infrastructire node rebooting in {product-title}
+= Understanding infrastructure node rebooting in {product-title}
 
 Infrastructure nodes are nodes that are labeled to run pieces of the
 {product-title} environment. Currently, the easiest way to manage node reboots
 is to ensure that there are at least three nodes available to run
-infrastructure. The scenario below demonstrates a common mistake that can lead
+infrastructure. The nodes to run the infrastructure are called *master* nodes.
+
+The scenario below demonstrates a common mistake that can lead
 to service interruptions for the applications running on {product-title} when
 only two nodes are available.
 
@@ -19,7 +21,7 @@ node B is now running both registry pods.
 - The service exposing the two pod endpoints on node B, for a brief period of
 time, loses all endpoints until they are redeployed to node A.
 
-The same process using three infrastructure nodes does not result in a service
+The same process using three master nodes for infrastructure does not result in a service
 disruption. However, due to pod scheduling, the last node that is evacuated and
 brought back in to rotation is left running zero registries. The other two nodes
 will run two and one registries respectively. The best solution is to rely on
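A minimal sketch of the one-node-at-a-time evacuation flow implied by this scenario, not part of the commit; the node name and registry namespace are placeholders.

----
# Note where the registry pods run before touching a node.
$ oc get pods -o wide -n <registry-namespace>

# Evacuate one infrastructure node, reboot it, then return it to rotation.
$ oc adm drain <node-name> --ignore-daemonsets
$ oc adm uncordon <node-name>
----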
