
Commit 965c6b6

[OSDOCS-11996]: Restructuring recommended etcd practices docs
1 parent bdfb354 commit 965c6b6

File tree: 3 files changed (+54, -56 lines)


modules/etcd-verify-hardware.adoc

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
// Module included in the following assemblies:
//
// * scalability_and_performance/recommended-performance-scale-practices/recommended-etcd-practices.adoc

:_mod-docs-content-type: PROCEDURE
[id="etcd-verify-hardware_{context}"]
= Validating the hardware for etcd

To validate the hardware for etcd before or after you create the {product-title} cluster, you can use fio.

.Prerequisites

* Container runtimes such as Podman or Docker are installed on the machine that you are testing.
* Data is written to the `/var/lib/etcd` path.

.Procedure
* Run fio and analyze the results:
+
--
** If you use Podman, run this command:
+
[source,terminal]
----
$ sudo podman run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/cloud-bulldozer/etcd-perf
----

** If you use Docker, run this command:
+
[source,terminal]
----
$ sudo docker run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/cloud-bulldozer/etcd-perf
----
--
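Either container image wraps an fio run against `/var/lib/etcd`. If you instead run fio directly with `--output-format=json`, a short script can pull the fsync p99 out of the report. The snippet below is an illustrative sketch only: the JSON here is a trimmed, hypothetical report modeled on fio's JSON layout, not real tool output.

```python
import json

# Trimmed, hypothetical report modeled on `fio --output-format=json`;
# a real report contains many more fields and jobs.
report_text = """
{
  "jobs": [
    {
      "jobname": "etcd-fsync-check",
      "sync": {
        "lat_ns": {
          "percentile": {"99.000000": 6200000}
        }
      }
    }
  ]
}
"""

def fsync_p99_ms(report_json: str) -> float:
    """Return the 99th-percentile fsync latency in milliseconds."""
    report = json.loads(report_json)
    p99_ns = report["jobs"][0]["sync"]["lat_ns"]["percentile"]["99.000000"]
    return p99_ns / 1_000_000  # nanoseconds -> milliseconds

p99 = fsync_p99_ms(report_text)
print(f"p99 fsync = {p99:.1f} ms; fast enough for etcd: {p99 < 10}")
```

The 10 ms threshold in the final check is the same one the etcd-perf output applies.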

The output reports whether the disk is fast enough to host etcd by checking whether the 99th percentile of the fsync metric captured from the run is less than 10 ms. A few of the most important etcd metrics that might be affected by I/O performance are as follows:

* `etcd_disk_wal_fsync_duration_seconds_bucket` reports the etcd WAL fsync duration
* `etcd_disk_backend_commit_duration_seconds_bucket` reports the etcd backend commit latency duration
* `etcd_server_leader_changes_seen_total` reports the leader changes

Because etcd replicates the requests among all the members, its performance strongly depends on network input/output (I/O) latency. High network latencies result in etcd heartbeats taking longer than the election timeout, which results in leader elections that are disruptive to the cluster. A key metric to monitor on a deployed {product-title} cluster is the 99th percentile of etcd network peer latency on each etcd cluster member. Use Prometheus to track the metric.

The `histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[2m]))` metric reports the round trip time for etcd to finish replicating the client requests between the members. Ensure that it is less than 50 ms.
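How `histogram_quantile` derives that p99 from the cumulative `_bucket` series can be sketched as follows. This is a simplified illustration of the interpolation idea (it ignores the `+Inf` bucket and other edge cases that Prometheus handles), and the bucket boundaries and counts below are invented.

```python
def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets, in the
    spirit of Prometheus's histogram_quantile(): find the bucket that
    contains the target rank and interpolate linearly inside it."""
    # buckets: sorted list of (upper_bound_seconds, cumulative_count)
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            # Target rank falls in this bucket: interpolate within it.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# Invented peer round-trip-time buckets: (upper bound in seconds, cumulative count)
rtt_buckets = [(0.01, 800), (0.025, 950), (0.05, 995), (0.1, 999), (0.25, 1000)]
p99_s = histogram_quantile(0.99, rtt_buckets)
print(f"p99 RTT = {p99_s * 1000:.1f} ms; under 50 ms: {p99_s * 1000 < 50}")
```

With these made-up buckets the estimated p99 lands at about 47 ms, just inside the 50 ms target.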

modules/recommended-etcd-practices.adoc

Lines changed: 8 additions & 54 deletions
@@ -2,16 +2,15 @@
 //
 // * scalability_and_performance/recommended-performance-scale-practices/recommended-etcd-practices.adoc

-:_mod-docs-content-type: PROCEDURE
+:_mod-docs-content-type: CONCEPT
 [id="recommended-etcd-practices_{context}"]
-= Recommended etcd practices
+= Storage practices for etcd

-Because etcd writes data to disk and persists proposals on disk, its performance depends on disk performance.
-Although etcd is not particularly I/O intensive, it requires a low latency block device for optimal performance and stability. Because etcd's consensus protocol depends on persistently storing metadata to a log (WAL), etcd is sensitive to disk-write latency. Slow disks and disk activity from other processes can cause long fsync latencies.
+Because etcd writes data to disk and persists proposals on disk, its performance depends on disk performance. Although etcd is not particularly I/O intensive, it requires a low latency block device for optimal performance and stability. Because the consensus protocol for etcd depends on persistently storing metadata to a log (WAL), etcd is sensitive to disk-write latency. Slow disks and disk activity from other processes can cause long fsync latencies.

 Those latencies can cause etcd to miss heartbeats, not commit new proposals to the disk on time, and ultimately experience request timeouts and temporary leader loss. High write latencies also lead to an OpenShift API slowness, which affects cluster performance. Because of these reasons, avoid colocating other workloads on the control-plane nodes that are I/O sensitive or intensive and share the same underlying I/O infrastructure.

-In terms of latency, run etcd on top of a block device that can write at least 50 IOPS of 8000 bytes long sequentially. That is, with a latency of 10ms, keep in mind that uses fdatasync to synchronize each write in the WAL. For heavy loaded clusters, sequential 500 IOPS of 8000 bytes (2 ms) are recommended. To measure those numbers, you can use a benchmarking tool, such as fio.
+Run etcd on a block device that can write at least 50 IOPS of 8 KB sequentially, including fdatasync, in under 10 ms. For heavily loaded clusters, sequential 500 IOPS of 8000 bytes (2 ms) are recommended. To measure those numbers, you can use a benchmarking tool, such as fio.

 To achieve such performance, run etcd on machines that are backed by SSD or NVMe disks with low latency and high throughput. Consider single-level cell (SLC) solid-state drives (SSDs), which provide 1 bit per memory cell, are durable and reliable, and are ideal for write-intensive workloads.

@@ -28,58 +27,13 @@ The following hard drive practices provide optimal etcd performance:
 * Prefer high-bandwidth reads for faster recovery from failures.
 * Use solid state drives as a minimum selection. Prefer NVMe drives for production environments.
 * Use server-grade hardware for increased reliability.
-
-[NOTE]
-====
-Avoid NAS or SAN setups and spinning drives. Ceph Rados Block Device (RBD) and other types of network-attached storage can result in unpredictable network latency. To provide fast storage to etcd nodes at scale, use PCI passthrough to pass NVM devices directly to the nodes.
-====
-
-Always benchmark by using utilities such as fio. You can use such utilities to continuously monitor the cluster performance as it increases.
-
-[NOTE]
-====
-Avoid using the Network File System (NFS) protocol or other network based file systems.
-====
+* Avoid NAS or SAN setups and spinning drives. Ceph Rados Block Device (RBD) and other types of network-attached storage can result in unpredictable network latency. To provide fast storage to etcd nodes at scale, use PCI passthrough to pass NVM devices directly to the nodes.
+* Always benchmark by using utilities such as fio. You can use such utilities to continuously monitor the cluster performance as it increases.
+* Avoid using the Network File System (NFS) protocol or other network based file systems.

 Some key metrics to monitor on a deployed {product-title} cluster are p99 of etcd disk write ahead log duration and the number of etcd leader changes. Use Prometheus to track these metrics.

 [NOTE]
 ====
 The etcd member database sizes can vary in a cluster during normal operations. This difference does not affect cluster upgrades, even if the leader size is different from the other members.
-====
-
-To validate the hardware for etcd before or after you create the {product-title} cluster, you can use fio.
-
-.Prerequisites
-
-* Container runtimes such as Podman or Docker are installed on the machine that you are testing.
-* Data is written to the `/var/lib/etcd` path.
-
-.Procedure
-* Run fio and analyze the results:
-+
---
-** If you use Podman, run this command:
-[source,terminal]
-+
-----
-$ sudo podman run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/cloud-bulldozer/etcd-perf
-----
-
-** If you use Docker, run this command:
-[source,terminal]
-+
-----
-$ sudo docker run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/cloud-bulldozer/etcd-perf
-----
---
-
-The output reports whether the disk is fast enough to host etcd by comparing the 99th percentile of the fsync metric captured from the run to see if it is less than 10 ms. A few of the most important etcd metrics that might affected by I/O performance are as follow:
-
-* `etcd_disk_wal_fsync_duration_seconds_bucket` metric reports the etcd's WAL fsync duration
-* `etcd_disk_backend_commit_duration_seconds_bucket` metric reports the etcd backend commit latency duration
-* `etcd_server_leader_changes_seen_total` metric reports the leader changes
-
-Because etcd replicates the requests among all the members, its performance strongly depends on network input/output (I/O) latency. High network latencies result in etcd heartbeats taking longer than the election timeout, which results in leader elections that are disruptive to the cluster. A key metric to monitor on a deployed {product-title} cluster is the 99th percentile of etcd network peer latency on each etcd cluster member. Use Prometheus to track the metric.
-
-The `histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[2m]))` metric reports the round trip time for etcd to finish replicating the client requests between the members. Ensure that it is less than 50 ms.
+====
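The IOPS and latency figures in the storage guidance above (at least 50 IOPS with sub-10 ms writes; 500 IOPS at 2 ms for heavy load) are two views of one constraint: a writer with a single outstanding fdatasync'd write is bounded by the reciprocal of the per-write latency. A back-of-the-envelope sketch, not a benchmark:

```python
def max_serial_iops(per_write_latency_s: float) -> float:
    # With one outstanding write at a time, each operation must finish
    # (including fdatasync) before the next starts, so sustained IOPS
    # cannot exceed 1 / latency.
    return 1.0 / per_write_latency_s

# 10 ms per 8 KB write+fdatasync caps a serial writer at ~100 IOPS,
# comfortably above the 50 IOPS floor; 2 ms per write corresponds to
# the recommended 500 sequential IOPS for heavily loaded clusters.
print(round(max_serial_iops(0.010)))
print(round(max_serial_iops(0.002)))
```

Real devices service concurrent writes, so this bound applies only to the serial write-then-sync pattern that the WAL follows.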

scalability_and_performance/recommended-performance-scale-practices/recommended-etcd-practices.adoc

Lines changed: 3 additions & 2 deletions
@@ -6,9 +6,10 @@ include::_attributes/common-attributes.adoc[]

 toc::[]

-This topic provides recommended performance and scalability practices for etcd in {product-title}.
+To ensure optimal performance and scalability for etcd in {product-title}, you can complete the following practices.

 include::modules/recommended-etcd-practices.adoc[leveloffset=+1]
+include::modules/etcd-verify-hardware.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
@@ -27,6 +28,6 @@ include::modules/etcd-tuning-parameters.adoc[leveloffset=+1]

 [role="_additional-resources"]
 .Additional resources
-xref:../../nodes/clusters/nodes-cluster-enabling-features.adoc#nodes-cluster-enabling-features-about_nodes-cluster-enabling[Understanding feature gates]
+* xref:../../nodes/clusters/nodes-cluster-enabling-features.adoc#nodes-cluster-enabling-features-about_nodes-cluster-enabling[Understanding feature gates]

 include::modules/etcd-increase-db.adoc[leveloffset=+1]
