etcd-defrag

Overview

etcd-defrag is an easier to use and smarter etcd defragmentation tool. It is based on the implementation of the etcdctl defrag command, but with significant refactoring and the following enhancements:

  • check the status of all members, and stop the operation if any member is unhealthy. Note that it ignores the NOSPACE alarm
  • run defragmentation on the leader last
  • support rule based defragmentation

etcd-defrag reuses all the existing flags accepted by etcdctl defrag, so it doesn't break any existing user experience. Users can simply replace etcdctl defrag [flags] with etcd-defrag [flags] and gain the additional benefits.
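For example, an existing invocation can be migrated by changing only the binary name (the flags shown are illustrative):

$ etcdctl defrag --endpoints=https://127.0.0.1:2379 --cacert ./ca.crt --key ./etcd-defrag.key --cert ./etcd-defrag.crt
$ ./etcd-defrag --endpoints=https://127.0.0.1:2379 --cacert ./ca.crt --key ./etcd-defrag.key --cert ./etcd-defrag.crt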

It adds the following extra flags,

| Flag | Description |
| --- | --- |
| --compaction | whether to execute compaction before the defragmentation, defaults to true |
| --continue-on-error | whether to continue to defragment the next endpoint if the current one fails, defaults to true |
| --etcd-storage-quota-bytes | etcd storage quota in bytes (the value passed to the etcd instance via the flag --quota-backend-bytes), defaults to 2*1024*1024*1024 |
| --defrag-rule | defragmentation rule (etcd-defrag will run defragmentation if the rule is empty or it evaluates to true), defaults to empty. See more details below. |
| --dry-run | evaluate whether or not endpoints require defragmentation, but don't actually perform it, defaults to false |
| --exclude-localhost | whether to exclude localhost endpoints, defaults to false |
| --move-leader | whether to move the leadership before performing defragmentation on the leader, defaults to false |
| --wait-between-defrags | wait time between consecutive defragmentation runs or after a leader movement (if --move-leader is enabled), defaults to 0s (no wait) |
| --skip-healthcheck-cluster-endpoints | skip cluster endpoint discovery during the health check and only check the endpoints provided via --endpoints, defaults to false |
| --auto-disalarm | automatically disalarm NOSPACE alarms after a successful defragmentation, defaults to false |
| --disalarm-threshold | threshold ratio for auto-disalarm (db size / quota); alarms are only cleared when all members are below this threshold, defaults to 0.9 |
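For example, to preview which endpoints would be defragmented without actually performing the operation (an illustrative invocation):

$ ./etcd-defrag --endpoints=https://127.0.0.1:2379 --cluster --dry-run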

See the complete list of flags below,

$ ./etcd-defrag -h
A simple command line tool for etcd defragmentation

Usage:
  etcd-defrag [flags]

Flags:
      --cacert string                        verify certificates of TLS-enabled secure servers using this CA bundle
      --cert string                          identify secure client using this TLS certificate file
      --cluster                              use all endpoints from the cluster member list
      --command-timeout duration             command timeout (excluding dial timeout) (default 30s)
      --compaction                           whether execute compaction before the defragmentation (defaults to true) (default true)
      --continue-on-error                    whether continue to defragment next endpoint if current one fails (default true)
      --defrag-rule string                   defragmentation rule (etcd-defrag will run defragmentation if the rule is empty or it is evaluated to true)
      --dial-timeout duration                dial timeout for client connections (default 2s)
  -d, --discovery-srv string                 domain name to query for SRV records describing cluster endpoints
      --discovery-srv-name string            service name to query when using DNS discovery
      --dry-run                              evaluate whether or not endpoints require defragmentation, but don't actually perform it
      --endpoints strings                    comma separated etcd endpoints (default [127.0.0.1:2379])
      --etcd-storage-quota-bytes int         etcd storage quota in bytes (the value passed to etcd instance by flag --quota-backend-bytes) (default 2147483648)
      --exclude-localhost                    whether to exclude localhost endpoints
  -h, --help                                 help for etcd-defrag
      --insecure-discovery                   accept insecure SRV records describing cluster endpoints (default true)
      --insecure-skip-tls-verify             skip server certificate verification (CAUTION: this option should be enabled only for testing purposes)
      --insecure-transport                   disable transport security for client connections (default true)
      --keepalive-time duration              keepalive time for client connections (default 2s)
      --keepalive-timeout duration           keepalive timeout for client connections (default 6s)
      --key string                           identify secure client using this TLS key file
      --move-leader                          whether to move the leadership before performing defragmentation on the leader
      --password string                      password for authentication (if this option is used, --user option shouldn't include password)
      --skip-healthcheck-cluster-endpoints   skip cluster endpoint discovery during health check and only check the endpoints provided via --endpoints
      --wait-between-defrags                 wait time between consecutive defragmentation runs or after a leader movement (if --move-leader is enabled). Defaults to 0s (no wait)
      --user string                          username[:password] for authentication (prompt if password is not supplied)
      --auto-disalarm                        whether automatically disalarm NOSPACE alarms after successful defragmentation(default false)
      --disalarm-threshold float             threshold ratio for automatic alarm clearing (db size / quota). Valid range: 0 < x < 1 (default: 0.9)
      --version                              print the version and exit

Environment variables can be used to set the flags: set the flag name in uppercase, prefix it with ETCD_DEFRAG_, and replace all hyphens with underscores. For example, the flag --move-leader can be set with the environment variable ETCD_DEFRAG_MOVE_LEADER.

Flag values are evaluated in the following order: (from highest to lowest priority)

  1. Flags passed as command line arguments
  2. Environment variables
  3. Default values
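For example, both invocations below enable leader movement; in the second one, the command-line flag overrides the environment variable (illustrative invocations):

$ ETCD_DEFRAG_MOVE_LEADER=true ./etcd-defrag --endpoints=https://127.0.0.1:2379
$ ETCD_DEFRAG_MOVE_LEADER=false ./etcd-defrag --endpoints=https://127.0.0.1:2379 --move-leader=true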

Integration with Kubernetes using a CronJob

It is possible to use the example cronjob in ./doc/etcd-defrag-cronjob.yaml in Kubernetes environments where the etcd servers are colocated with the control plane nodes.

This example CronJob runs every weekday in the morning. It works by mounting the /etc/kubernetes/pki/etcd folder inside the pod, thereby permitting defragmentation of the etcd cluster that backs the Kubernetes cluster itself. For more complex use cases you might need to adapt --endpoints and/or the certificates.

The example CronJob is configured by default with node-role.kubernetes.io/control-plane affinity and with the hostNetwork: true spec, so that the etcd server co-located with the API server can be reached directly at 127.0.0.1:2379.
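For reference, below is a minimal sketch of such a CronJob. It is illustrative only: the schedule, image tag, and kubeadm-style certificate file names (ca.crt, server.crt, server.key) are assumptions, and a nodeSelector is used instead of a full affinity block for brevity; prefer the bundled ./doc/etcd-defrag-cronjob.yaml as your starting point.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-defrag
  namespace: kube-system
spec:
  schedule: "30 5 * * 1-5"  # every weekday morning (assumed; adjust as needed)
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true  # reach the co-located etcd directly on 127.0.0.1:2379
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists
              effect: NoSchedule
          restartPolicy: Never
          containers:
            - name: etcd-defrag
              image: ghcr.io/ahrtr/etcd-defrag:latest
              args:
                - --endpoints=https://127.0.0.1:2379
                - --cluster
                - --cacert=/etc/kubernetes/pki/etcd/ca.crt
                - --cert=/etc/kubernetes/pki/etcd/server.crt
                - --key=/etc/kubernetes/pki/etcd/server.key
              volumeMounts:
                - name: etcd-certs
                  mountPath: /etc/kubernetes/pki/etcd
                  readOnly: true
          volumes:
            - name: etcd-certs
              hostPath:
                path: /etc/kubernetes/pki/etcd
                type: Directory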

Examples

Example 1: run defragmentation on one endpoint

Command:

$ ./etcd-defrag --endpoints=https://127.0.0.1:22379 --cacert ./ca.crt --key ./etcd-defrag.key --cert ./etcd-defrag.crt

Example 2: run defragmentation on multiple endpoints

Command:

$ ./etcd-defrag --endpoints=https://127.0.0.1:22379,https://127.0.0.1:32379 --cacert ./ca.crt --key ./etcd-defrag.key --cert ./etcd-defrag.crt

Example 3: run defragmentation on all members in the cluster

Command:

$ ./etcd-defrag --endpoints https://127.0.0.1:22379 --cluster --cacert ./ca.crt --key ./etcd-defrag.key --cert ./etcd-defrag.crt

Output:

2025/08/23 13:00:04 Validating configuration.
2025/08/23 13:00:04 No defragmentation rule provided
2025/08/23 13:00:04 Performing health check.
2025/08/23 13:00:04 endpoint: https://127.0.0.1:22379, health: true, took: 1.902417ms, error:
2025/08/23 13:00:04 endpoint: https://127.0.0.1:2379, health: true, took: 1.893833ms, error:
2025/08/23 13:00:04 endpoint: https://127.0.0.1:32379, health: true, took: 2.167917ms, error:
2025/08/23 13:00:04 Getting members status
2025/08/23 13:00:04 endpoint: https://127.0.0.1:22379, dbSize: 98304, dbSizeInUse: 98304, memberId: 91bc3c398fb3c146, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 22
2025/08/23 13:00:04 endpoint: https://127.0.0.1:2379, dbSize: 98304, dbSizeInUse: 98304, memberId: 8211f1d0f64f3269, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 22
2025/08/23 13:00:04 endpoint: https://127.0.0.1:32379, dbSize: 98304, dbSizeInUse: 98304, memberId: fd422379fda50e48, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 22
2025/08/23 13:00:04 Running compaction until revision: 12 ...
2025/08/23 13:00:05 successful
2025/08/23 13:00:05 3 endpoint(s) need to be defragmented: [https://127.0.0.1:2379 https://127.0.0.1:32379 https://127.0.0.1:22379]
2025/08/23 13:00:05 [Before defragmentation]
2025/08/23 13:00:05 endpoint: https://127.0.0.1:2379, dbSize: 98304, dbSizeInUse: 98304, memberId: 8211f1d0f64f3269, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 23
2025/08/23 13:00:05 Defragmenting endpoint "https://127.0.0.1:2379"
2025/08/23 13:00:05 Finished defragmenting etcd endpoint "https://127.0.0.1:2379". took 28.41525ms
2025/08/23 13:00:05 [Post defragmentation]
2025/08/23 13:00:05 endpoint: https://127.0.0.1:2379, dbSize: 98304, dbSizeInUse: 65536, memberId: 8211f1d0f64f3269, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 23
2025/08/23 13:00:05 [Before defragmentation]
2025/08/23 13:00:05 endpoint: https://127.0.0.1:32379, dbSize: 98304, dbSizeInUse: 98304, memberId: fd422379fda50e48, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 23
2025/08/23 13:00:05 Defragmenting endpoint "https://127.0.0.1:32379"
2025/08/23 13:00:05 Finished defragmenting etcd endpoint "https://127.0.0.1:32379". took 27.834208ms
2025/08/23 13:00:05 [Post defragmentation]
2025/08/23 13:00:05 endpoint: https://127.0.0.1:32379, dbSize: 98304, dbSizeInUse: 65536, memberId: fd422379fda50e48, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 23
2025/08/23 13:00:05 [Before defragmentation]
2025/08/23 13:00:05 endpoint: https://127.0.0.1:22379, dbSize: 98304, dbSizeInUse: 98304, memberId: 91bc3c398fb3c146, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 23
2025/08/23 13:00:05 Defragmenting endpoint "https://127.0.0.1:22379"
2025/08/23 13:00:05 Finished defragmenting etcd endpoint "https://127.0.0.1:22379". took 43.494ms
2025/08/23 13:00:05 [Post defragmentation]
2025/08/23 13:00:05 endpoint: https://127.0.0.1:22379, dbSize: 98304, dbSizeInUse: 65536, memberId: 91bc3c398fb3c146, leader: 91bc3c398fb3c146, revision: 12, term: 2, index: 23
2025/08/23 13:00:05 The defragmentation is successful.

Only one endpoint is provided, but it still runs defragmentation on all members in the cluster thanks to the flag --cluster. Note that the leader in this run, member 91bc3c398fb3c146 at https://127.0.0.1:22379, is placed at the end of the list,

3 endpoint(s) need to be defragmented: [https://127.0.0.1:2379 https://127.0.0.1:32379 https://127.0.0.1:22379]

Which member is the leader can be confirmed with etcdctl endpoint status, for example (output from a separate session, in which https://127.0.0.1:2379 happened to be the leader):
$ etcdctl endpoint status -w table --cluster
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|  https://127.0.0.1:2379 | 8211f1d0f64f3269 |   3.5.8 |   25 kB |      true |      false |        10 |        164 |                164 |        |
| https://127.0.0.1:22379 | 91bc3c398fb3c146 |   3.5.8 |   25 kB |     false |      false |        10 |        164 |                164 |        |
| https://127.0.0.1:32379 | fd422379fda50e48 |   3.5.8 |   25 kB |     false |      false |        10 |        164 |                164 |        |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Defragmentation rule

Defragmentation is an expensive operation, so it should be executed as infrequently as possible. On the other hand, it's also necessary to make sure no etcd member runs out of its storage quota. This is exactly why the defragmentation rule was introduced: it skips unnecessary, expensive defragmentation while keeping every member within its quota.

Users can configure a defragmentation rule using the flag --defrag-rule. The rule must be a boolean expression, meaning it must evaluate to a boolean value. It supports the arithmetic (e.g. + - * / %) and logical (e.g. == != < > <= >= && || !) operators supported by Go. Parentheses () can be used to control precedence.

Currently, etcd-defrag supports the following variables,

| Variable name | Description |
| --- | --- |
| dbSize | total size of the etcd database |
| dbSizeInUse | total size in use of the etcd database |
| dbSizeFree | total size not in use of the etcd database, defined as dbSize - dbSizeInUse |
| dbQuota | etcd storage quota in bytes (the value passed to the etcd instance via the flag --quota-backend-bytes) |
| dbQuotaUsage | usage ratio of the etcd storage quota, defined as dbSize/dbQuota |

For example, if you want to run defragmentation when the total db size is greater than 80% of the quota OR there is at least 200 MiB of free space, the defragmentation rule is dbSize > dbQuota*80/100 || dbSize - dbSizeInUse > 200*1024*1024. The complete command is below,

$ ./etcd-defrag --endpoints http://127.0.0.1:22379 --cluster --defrag-rule="dbSize > dbQuota*80/100 || dbSize - dbSizeInUse > 200*1024*1024"

Or,

$ ./etcd-defrag --endpoints http://127.0.0.1:22379 --cluster --defrag-rule="dbQuotaUsage > 0.8 || dbSizeFree > 200*1024*1024"

Output:

2025/08/23 12:55:09 Validating configuration.
2025/08/23 12:55:09 Validating the defragmentation rule: dbQuotaUsage > 0.8 || dbSizeFree > 200*1024*1024 ...
2025/08/23 12:55:09 valid
2025/08/23 12:55:09 Performing health check.
2025/08/23 12:55:09 endpoint: http://127.0.0.1:2379, health: true, took: 2.73825ms, error:
2025/08/23 12:55:09 endpoint: http://127.0.0.1:22379, health: true, took: 2.839ms, error:
2025/08/23 12:55:09 endpoint: http://127.0.0.1:32379, health: true, took: 2.96325ms, error:
2025/08/23 12:55:09 Getting members status
2025/08/23 12:55:09 endpoint: http://127.0.0.1:22379, dbSize: 98304, dbSizeInUse: 98304, memberId: 91bc3c398fb3c146, leader: 8211f1d0f64f3269, revision: 9, term: 4, index: 44
2025/08/23 12:55:09 endpoint: http://127.0.0.1:2379, dbSize: 98304, dbSizeInUse: 98304, memberId: 8211f1d0f64f3269, leader: 8211f1d0f64f3269, revision: 9, term: 4, index: 44
2025/08/23 12:55:09 endpoint: http://127.0.0.1:32379, dbSize: 98304, dbSizeInUse: 98304, memberId: fd422379fda50e48, leader: 8211f1d0f64f3269, revision: 9, term: 4, index: 44
2025/08/23 12:55:09 Running compaction until revision: 9 ...
2025/08/23 12:55:09 successful
2025/08/23 12:55:09 3 endpoint(s) need to be defragmented: [http://127.0.0.1:22379 http://127.0.0.1:32379 http://127.0.0.1:2379]
2025/08/23 12:55:09 [Before defragmentation]
2025/08/23 12:55:09 endpoint: http://127.0.0.1:22379, dbSize: 98304, dbSizeInUse: 98304, memberId: 91bc3c398fb3c146, leader: 8211f1d0f64f3269, revision: 9, term: 4, index: 45
2025/08/23 12:55:09 Evaluation result is false, so skipping endpoint: http://127.0.0.1:22379
2025/08/23 12:55:09 [Before defragmentation]
2025/08/23 12:55:09 endpoint: http://127.0.0.1:32379, dbSize: 98304, dbSizeInUse: 98304, memberId: fd422379fda50e48, leader: 8211f1d0f64f3269, revision: 9, term: 4, index: 45
2025/08/23 12:55:09 Evaluation result is false, so skipping endpoint: http://127.0.0.1:32379
2025/08/23 12:55:09 [Before defragmentation]
2025/08/23 12:55:09 endpoint: http://127.0.0.1:2379, dbSize: 98304, dbSizeInUse: 98304, memberId: 8211f1d0f64f3269, leader: 8211f1d0f64f3269, revision: 9, term: 4, index: 45
2025/08/23 12:55:09 Evaluation result is false, so skipping endpoint: http://127.0.0.1:2379
2025/08/23 12:55:09 The defragmentation is successful.

If you want to run defragmentation only when both conditions are true, namely the total db size is greater than 80% of the quota AND there is at least 200 MiB of free space, then run the command below,

$ ./etcd-defrag --endpoints http://127.0.0.1:22379 --cluster --defrag-rule="dbSize > dbQuota*80/100 && dbSize - dbSizeInUse > 200*1024*1024"
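Parentheses can be used to compose more elaborate rules. For example, the following illustrative rule (not from the examples above) only defragments a member when its db size exceeds 100 MiB and at least one of the two earlier conditions also holds:

$ ./etcd-defrag --endpoints http://127.0.0.1:22379 --cluster --defrag-rule="dbSize > 100*1024*1024 && (dbQuotaUsage > 0.8 || dbSizeFree > 200*1024*1024)"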

Auto-disalarm Feature

The auto-disalarm feature automatically removes any NOSPACE alarms after a successful defragmentation, provided certain conditions are met. This helps maintain cluster health by clearing alarms that are no longer relevant once defragmentation has freed up space.

How it works

When --auto-disalarm is enabled, etcd-defrag will:

  1. Check if there are any NOSPACE alarms in the cluster after defragmentation
  2. Verify that every cluster member's database size is below the specified threshold
  3. Automatically disalarm NOSPACE alarms if both conditions are met
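This automates what an operator would otherwise do by hand after defragmenting, roughly the following standard etcdctl commands:

# list active alarms, then clear them once enough space has been reclaimed
$ etcdctl alarm list
$ etcdctl alarm disarm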

Configuration

  • --auto-disalarm: whether to automatically disalarm NOSPACE alarms after a successful defragmentation, defaults to false
  • --disalarm-threshold: threshold ratio for automatic alarm clearing (db size / quota). Valid range: 0 < x < 1, defaults to 0.9

Example Usage

# Enable auto-disalarm with default threshold (0.9)
$ ./etcd-defrag --endpoints=https://127.0.0.1:2379 --cluster --auto-disalarm

# Enable auto-disalarm with custom threshold (0.7)
$ ./etcd-defrag --endpoints=https://127.0.0.1:2379 --cluster --auto-disalarm --disalarm-threshold=0.7

Safety Considerations

  • Auto-disalarm only triggers when all cluster members' DB sizes are below the threshold.

  • The threshold is computed from the --etcd-storage-quota-bytes flag, which defaults to 2147483648 (2 GiB) in etcd-defrag. The formula is as follows:

    threshold (bytes) = etcd-storage-quota-bytes * disalarm-threshold

    With the defaults, this is 2147483648 * 0.9 ≈ 1932735283 bytes (1.8 GiB). Please ensure that --etcd-storage-quota-bytes matches the --quota-backend-bytes value your etcd instances actually use; otherwise, unexpected behavior may occur, such as the disalarm operation not being triggered.

  • This feature only affects NOSPACE alarms; other alarm types are not affected.

  • The value of --disalarm-threshold must be between 0 and 1.0 (0 < x < 1).

Container image

Container images are released automatically using GitHub Actions and ko-build/ko. They can be used as follows:

$ docker pull ghcr.io/ahrtr/etcd-defrag:latest
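Images built with ko use the etcd-defrag binary as their entrypoint, so flags can be passed directly as container arguments. For example (an illustrative invocation; the mounted certificate paths are assumptions):

$ docker run --rm --network host \
    -v /etc/kubernetes/pki/etcd:/certs:ro \
    ghcr.io/ahrtr/etcd-defrag:latest \
    --endpoints=https://127.0.0.1:2379 --cluster \
    --cacert /certs/ca.crt --cert /certs/server.crt --key /certs/server.key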

Alternatively, you can build your own container images with:

$ DOCKER_BUILDKIT=1 docker build -t "etcd-defrag:${VERSION}" -f Dockerfile .

If you need an image for a GOARCH other than amd64 or arm64 (e.g. ppc64le or s390x), use a command like the one below,

$ DOCKER_BUILDKIT=1 docker build --build-arg ARCH=${ARCH} -t "etcd-defrag:${VERSION}" -f Dockerfile .

Contributing

Any contribution is welcome!
