Merge pull request #14370 from mburke5678/logging-es-status

mburke5678 · web-flow · commit 4de10a0aa917 · 2019-04-11T15:02:49.000-04:00
[WIP] EO - ClusterLogging Status messages when cluster logging is not functional
diff --git a/logging/config/efk-logging-elasticsearch.adoc b/logging/config/efk-logging-elasticsearch.adoc
@@ -19,7 +19,7 @@ or higher memory. Each Elasticsearch node can operate with a lower memory settin
 
 [NOTE]
 ====
-Procedures in this topic require your cluster to be in an unmanaged state. For more information, see _Changing the cluster logging management state_.
+Procedures in this topic require your cluster to be in an unmanaged state. For more information, see Changing the cluster logging management state.
 ====
 
 // The following include statements pull in the module files that comprise
@@ -41,6 +41,10 @@ include::modules/efk-logging-elasticsearch-exposing.adoc[leveloffset=+1]
 
 include::modules/efk-logging-elasticsearch-rules.adoc[leveloffset=+1]
 
+include::modules/efk-logging-elasticsearch-status.adoc[leveloffset=+1]
+
+include::modules/efk-logging-elasticsearch-status-comp.adoc[leveloffset=+1]
+
 ////
 modules/efk-logging-elasticsearch-persistent-storage-persistent.adoc[leveloffset=+2]
 
@@ -49,7 +53,7 @@ modules/efk-logging-elasticsearch-persistent-storage-persistent-dynamic.adoc[lev
 modules/efk-logging-elasticsearch-persistent-storage-local.adoc[leveloffset=+2]
 ////
 
-== Additional Resources
+// == Additional Resources
 
-//For information on installing Elasticsearch, see xref:../../logging/efk-logging-deploying.adoc[Deploying cluster logging].
+//For information on installing Elasticsearch, see xref:../logging/efk-logging-deploy.adoc[Deploying cluster logging].
 
diff --git a/modules/efk-logging-elasticsearch-status-comp.adoc b/modules/efk-logging-elasticsearch-status-comp.adoc
@@ -0,0 +1,173 @@
+// Module included in the following assemblies:
+//
+// * logging/efk-logging-elasticsearch.adoc
+
+[id="efk-logging-elasticsearch-status-example-{context}"]
+= Viewing Elasticsearch component status
+
+You can view the status of the following Elasticsearch components.
+
+Elasticsearch indices::
+You can view the status of the Elasticsearch indices.
+
+. Get the name of an Elasticsearch pod:
++
+----
+$ oc get pods --selector component=elasticsearch -o name
+
+pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
+pod/elasticsearch-cdm-1godmszn-2-5769cf-9ms2n
+pod/elasticsearch-cdm-1godmszn-3-f66f7d-zqkz7
+----
+
+. Get the status of the indices:
++
+----
+$ oc exec elasticsearch-cdm-1godmszn-1-6f8495-vp4lw -- indices
+
+Defaulting container name to elasticsearch.
+Use 'oc describe pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw -n openshift-logging' to see all of the containers in this pod.
+Wed Apr 10 05:42:12 UTC 2019
+health status index                                            uuid                   pri rep docs.count docs.deleted store.size pri.store.size
+red    open   .kibana.647a750f1787408bf50088234ec0edd5a6a9b2ac N7iCbRjSSc2bGhn8Cpc7Jg   2   1                                                  
+green  open   .operations.2019.04.10                           GTewEJEzQjaus9QjvBBnGg   3   1    2176114            0       3929           1956
+green  open   .operations.2019.04.11                           ausZHoKxTNOoBvv9RlXfrw   3   1    1494624            0       2947           1475
+green  open   .kibana                                          9Fltn1D0QHSnFMXpphZ--Q   1   1          1            0          0              0
+green  open   .searchguard                                     chOwDnQlSsqhfSPcot1Yiw   1   1          5            1          0              0
+----
+
+
+Elasticsearch pods::
+You can view the status of the Elasticsearch pods.
+
+. Get the name of a pod:
++
+----
+$ oc get pods --selector component=elasticsearch -o name
+
+pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
+pod/elasticsearch-cdm-1godmszn-2-5769cf-9ms2n
+pod/elasticsearch-cdm-1godmszn-3-f66f7d-zqkz7
+----
+
+. Get the status of a pod:
++
+----
+$ oc describe pod elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
+----
++
+The output includes the following status information:
++
+----
+....
+Status:             Running
+
+....
+
+Containers:
+  elasticsearch:
+    Container ID:   cri-o://b7d44e0a9ea486e27f47763f5bb4c39dfd2
+    State:          Running
+      Started:      Mon, 08 Apr 2019 10:17:56 -0400
+    Ready:          True
+    Restart Count:  0
+    Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
+
+....
+
+  proxy:
+    Container ID:  cri-o://3f77032abaddbb1652c116278652908dc01860320b8a4e741d06894b2f8f9aa1
+    State:          Running
+      Started:      Mon, 08 Apr 2019 10:18:38 -0400
+    Ready:          True
+    Restart Count:  0
+
+....
+
+Conditions:
+  Type              Status
+  Initialized       True 
+  Ready             True 
+  ContainersReady   True 
+  PodScheduled      True 
+
+....
+
+Events:          <none>
+----
+
+Elasticsearch deployment::
+You can view the status of the Elasticsearch deployments.
+
+. Get the name of a deployment:
++
+----
+$ oc get deployment --selector component=elasticsearch -o name
+
+deployment.extensions/elasticsearch-cdm-1gon-1
+deployment.extensions/elasticsearch-cdm-1gon-2
+deployment.extensions/elasticsearch-cdm-1gon-3
+----
+
+. Get the deployment status:
++
+----
+$ oc describe deployment elasticsearch-cdm-1gon-1
+----
++
+The output includes the following status information:
++
+----
+....
+  Containers:
+   elasticsearch:
+    Image:       quay.io/openshift/origin-logging-elasticsearch5:latest
+    Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
+
+....
+
+Conditions:
+  Type           Status   Reason
+  ----           ------   ------
+  Progressing    Unknown  DeploymentPaused
+  Available      True     MinimumReplicasAvailable
+
+....
+
+Events:          <none>
+----
+
+Elasticsearch ReplicaSet::
+You can view the status of the Elasticsearch ReplicaSet.
+
+. Get the name of a replica set:
++
+----
+$ oc get replicaSet --selector component=elasticsearch -o name
+
+replicaset.extensions/elasticsearch-cdm-1gon-1-6f8495
+replicaset.extensions/elasticsearch-cdm-1gon-2-5769cf
+replicaset.extensions/elasticsearch-cdm-1gon-3-f66f7d
+----
+
+. Get the status of the replica set:
++
+----
+$ oc describe replicaSet elasticsearch-cdm-1gon-1-6f8495
+----
++
+The output includes the following status information:
++
+----
+....
+  Containers:
+   elasticsearch:
+    Image:       quay.io/openshift/origin-logging-elasticsearch5:latest
+    Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
+
+....
+
+Events:          <none>
+----
+
+
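Note on the indices step in the module above: on a cluster with many indices, the table printed by `indices` can be long. The following is a minimal sketch, not part of this PR, that filters the table down to the indices that are not `green`. The function name `non_green_indices` is hypothetical, and the sketch assumes the column layout shown in the example output (health in the first column, index name in the third).

```shell
# Hypothetical helper: reads the table printed by the `indices` command
# on stdin and prints the name of every index whose health is red or yellow.
# Assumes health is column 1 and the index name is column 3, as in the
# example output in the module.
non_green_indices() {
  awk '$1 == "red" || $1 == "yellow" { print $3 }'
}
```

It could be used as `oc exec <pod-name> -- indices | non_green_indices`, where `<pod-name>` is one of the pods returned in the first step.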
diff --git a/modules/efk-logging-elasticsearch-status.adoc b/modules/efk-logging-elasticsearch-status.adoc
@@ -13,53 +13,118 @@ You can view the status of your Elasticsearch cluster.
 
 .Procedure
 
-Run the following command to view the Elasticsearch status:
-
-----
-$ oc get elasticsearch elasticsearch -o yaml
-
-nodes:
-  - deploymentName: elasticsearch-clientdatamaster-0-1
-    podName: elasticsearch-clientdatamaster-0-1-84d764899d-bh7jl
-    replicaSetName: elasticsearch-clientdatamaster-0-1-84d764899d
-    roles:
-    - client
-    - data
-    - master
-    status: Running
-    upgradeStatus:
-      underUpgrade: "False"
-  - deploymentName: elasticsearch-data-1-1
-    podName: elasticsearch-data-1-1-77ffddbf7b-zdd76
-    replicaSetName: elasticsearch-data-1-1-77ffddbf7b
-    roles:
-    - data
-    status: Running
-    upgradeStatus:
-      underUpgrade: "False"
-  - podName: elasticsearch-client-2-1-0
-    roles:
-    - client
-    statefulSetName: elasticsearch-client-2-1
-    status: Running
-    upgradeStatus:
-      underUpgrade: "False"
-  pods:
-    client:
-      failed: []
-      notReady:
-      - elasticsearch-client-1-1-0
-      - elasticsearch-client-2-1-0
-      ready: []
-    data:
-      failed: []
-      notReady:
-      - elasticsearch-data-1-1-77ffddbf7b-zdd76
-      ready: []
-    master:
-      failed: []
-      notReady: []
-      ready: []
-  shardAllocationEnabled: "True"
-----
-  
+. Change to the `openshift-logging` project.
++
+----
+$ oc project openshift-logging
+----
+
+. View the Elasticsearch cluster status:
+
+.. Get the name of the Elasticsearch instance:
++
+----
+$ oc get Elasticsearch
+
+NAME            AGE
+elasticsearch   5h9m
+----
+
+.. Get the Elasticsearch status:
++
+----
+$ oc get Elasticsearch <Elasticsearch-instance> -o yaml
+----
++
+For example:
++
+----
+$ oc get Elasticsearch elasticsearch -n openshift-logging -o yaml
+----
++
+The output includes information similar to the following:
++
+----
+status: <1>
+  clusterHealth: green <2>
+  conditions: <3>
+      .....
+  nodes:  <4>
+      .....
+  pods: <5>
+      .....
+  shardAllocationEnabled: "True" <6>
+----
+<1> In the output, the cluster status fields appear in the `status` stanza.
+<2> The health of the Elasticsearch cluster: `green`, `yellow`, or `red`.
+<3> Any status conditions, if present. If a pod could not be placed, the Elasticsearch cluster status reports the reasons given by the scheduler. Events related to the following conditions are also shown:
+* Container Waiting, for both the Elasticsearch and proxy containers.
+* Container Terminated, for both the Elasticsearch and proxy containers.
+* Pod unschedulable.
+In addition, a condition is shown if the node storage passes the low or high watermark threshold.
+<4> The Elasticsearch nodes in the cluster, with `upgradeStatus`.
+<5> The Elasticsearch client, data, and master pods in the cluster, listed by `failed`, `notReady`, or `ready` state.
+<6> Whether shard allocation is enabled for the cluster.
+
+[id="efk-logging-elasticsearch-status-message-{context}"]
+== Example condition messages
+
+The following are examples of some condition messages from the `Status.Nodes` section of the Elasticsearch instance.
+
+
+// https://github.com/openshift/elasticsearch-operator/pull/92
+
+This status message indicates that a node has exceeded the configured low watermark and that no shards will be allocated to this node.
+
+----
+  nodes:
+  - conditions:
+    - lastTransitionTime: 2019-03-15T15:57:22Z
+      message: Disk storage usage for node is 27.5gb (36.74%). Shards will be not
+        be allocated on this node.
+      reason: Disk Watermark Low
+      status: "True"
+      type: NodeStorage
+    deploymentName: example-elasticsearch-clientdatamaster-0-1
+    upgradeStatus: {}
+----
+
+This status message indicates that a node has exceeded the configured high watermark and that shards will be relocated to other nodes.
+
+----
+  nodes:
+  - conditions:
+    - lastTransitionTime: 2019-03-15T16:04:45Z
+      message: Disk storage usage for node is 27.5gb (36.74%). Shards will be relocated
+        from this node.
+      reason: Disk Watermark High
+      status: "True"
+      type: NodeStorage
+    deploymentName: example-elasticsearch-clientdatamaster-0-1
+    upgradeStatus: {}
+----
+
+This status message indicates that the Elasticsearch node selector in the CR does not match any nodes in the cluster.
+
+----
+    nodes:
+    - conditions:
+      - lastTransitionTime: 2019-04-10T02:26:24Z
+        message: '0/8 nodes are available: 8 node(s) didn''t match node selector.'
+        reason: Unschedulable
+        status: "True"
+        type: Unschedulable
+----
+
+This status message indicates that the Elasticsearch CR uses a non-existent PVC.
+
+----
+   nodes:
+   - conditions:
+     - lastTransitionTime:    2019-04-10T05:55:51Z
+       message:               pod has unbound immediate PersistentVolumeClaims (repeated 5 times)
+       reason:                Unschedulable
+       status:                True
+       type:                  Unschedulable
+----
+
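Note on the status procedure above: the `clusterHealth` field lends itself to a scripted check. The following is a minimal sketch, not part of this PR, that reads the YAML from `oc get Elasticsearch ... -o yaml` on stdin and succeeds only when the reported health is `green`. The function name `cluster_is_green` is hypothetical, and the sketch assumes that `clusterHealth` appears as an unquoted `key: value` pair on a single line, as in the example output.

```shell
# Hypothetical helper: reads the Elasticsearch CR YAML on stdin, prints the
# value of clusterHealth (or "unknown" if the field is absent), and returns
# 0 only when that value is "green".
# Assumes the field is written as `clusterHealth: <value>` on one line.
cluster_is_green() {
  awk '$1 == "clusterHealth:" { health = $2 }
       END {
         print (health == "" ? "unknown" : health)
         exit (health == "green" ? 0 : 1)
       }'
}
```

It could be used as `oc get Elasticsearch elasticsearch -n openshift-logging -o yaml | cluster_is_green`. A line-oriented match keeps the sketch dependency-free; a YAML-aware tool would be more robust if the output format differs.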