A tool for checking the health of Kubernetes Pods. It supports TCP port probing and ICMP probing, with a configurable retry mechanism.
- Support TCP port health check
- Support ICMP health check
- Configurable retry mechanism
- Configurable single probe timeout
- Parallel health checking
- Leader Election support
- Automatic Pod Readiness Gate updates
| Environment Variable | Default Value | Description |
|---|---|---|
| `HEALTH_CHECK_INTERVAL` | `1s` | Health check interval |
| `HEALTH_CHECK_TIMEOUT` | `1s` | Single probe timeout |
| `HEALTH_CHECK_CONCURRENCY` | `10` | Number of concurrent worker threads |
| `HEALTH_CHECK_RETRY_COUNT` | `10` | Health check retry count |
| `POD_NAME` | hostname | Pod name |
| `POD_NAMESPACE` | `kube-system` | Pod namespace |
| `LEASE_NAME` | `endpoint-health-checker-leader` | Leader election lease name |
| `LEASE_DURATION` | `4s` | Leader election lease duration |
| `RENEW_DEADLINE` | `2s` | Leader election renew deadline |
| `RETRY_PERIOD` | `500ms` | Leader election retry period |
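
To illustrate how the health check variables and their defaults fit together, here is a minimal Python sketch of loading them. It is not the tool's actual implementation: the `Settings` class, `load_settings`, and `parse_duration` are hypothetical names, and the parser only handles simple values such as `1s` or `500ms`.

```python
import os
import re
from dataclasses import dataclass

# Milliseconds per supported unit (only simple single-unit durations are handled).
_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000}


def parse_duration(text: str) -> float:
    """Convert a duration such as '500ms' or '4s' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_MS[unit] / 1000


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the health check settings (times in seconds)."""
    interval: float
    timeout: float
    concurrency: int
    retry_count: int


def load_settings(env=None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        interval=parse_duration(env.get("HEALTH_CHECK_INTERVAL", "1s")),
        timeout=parse_duration(env.get("HEALTH_CHECK_TIMEOUT", "1s")),
        concurrency=int(env.get("HEALTH_CHECK_CONCURRENCY", "10")),
        retry_count=int(env.get("HEALTH_CHECK_RETRY_COUNT", "10")),
    )
```

Passing a plain dictionary instead of `os.environ`, for example `load_settings({"HEALTH_CHECK_TIMEOUT": "500ms"})`, makes it easy to try out overrides without touching the process environment.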
Health checks support a configurable retry mechanism:
- Retry Count: Configured via the `HEALTH_CHECK_RETRY_COUNT` environment variable, default is 10 times
- Single Timeout: Configured via the `HEALTH_CHECK_TIMEOUT` environment variable, default is 1 second
- Retry Interval: 100ms delay between retries
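
The retry behavior described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's code: `tcp_probe` and `probe_with_retry` are hypothetical names, and the sketch assumes that a retry count of N means one initial attempt followed by up to N retries, which the README does not spell out.

```python
import socket
import time


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Single TCP probe: True if a connection opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_with_retry(probe, retry_count=10, retry_interval=0.1, sleep=time.sleep):
    """Call probe() until it succeeds or the retries are exhausted.

    Assumption: one initial attempt plus up to `retry_count` retries,
    waiting `retry_interval` seconds (100ms by default) between attempts.
    """
    attempts = 1 + retry_count
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:  # no delay after the final attempt
            sleep(retry_interval)
    return False
```

A call such as `probe_with_retry(lambda: tcp_probe("10.0.0.5", 8080))` would combine the two; the address and port here are placeholders.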
The application uses Kubernetes Leader Election to ensure only one instance performs health checks:
- Lease Duration: How long the lease is valid (default: 4s)
- Renew Deadline: Maximum time the current leader keeps retrying to renew the lease before giving up leadership (default: 2s)
- Retry Period: How often to retry acquiring the lease (default: 500ms)
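
The three timings are not independent. Assuming the tool relies on the `leaderelection` package from Kubernetes client-go (the README does not confirm this), that package rejects configurations where the lease duration is not greater than the renew deadline, or where the renew deadline is not greater than the retry period multiplied by a jitter factor of 1.2. The Python sketch below mirrors those two checks; `validate_leader_election` is a hypothetical helper, not part of the tool.

```python
# Jitter factor used by client-go's leaderelection package (assumed to apply here).
JITTER_FACTOR = 1.2


def validate_leader_election(lease_duration, renew_deadline, retry_period):
    """Return a list of problems with the given timings (all in seconds)."""
    errors = []
    if lease_duration <= renew_deadline:
        errors.append("leaseDuration must be greater than renewDeadline")
    if renew_deadline <= retry_period * JITTER_FACTOR:
        errors.append("renewDeadline must be greater than retryPeriod * 1.2")
    return errors
```

With the defaults (4s, 2s, 500ms) the function returns an empty list, so the default values satisfy both constraints.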
Each Pod is checked as follows:
- If the Pod defines ports, perform TCP port probing
- If the Pod defines no ports, perform ICMP probing
- After a probe fails, retry up to the configured number of times
- After all retries fail, set the Pod's Readiness Gate to `False`
- If any probe succeeds, set the Pod's Readiness Gate to `True`
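
The probing rules above, combined with parallel checking, can be sketched in Python as follows. Everything named here is an assumption made for illustration: the Pod dictionaries, the `probe_ok` callback (expected to run one probe type including its retries), and the condition type `example.com/endpoint-healthy`, which stands in for whatever Readiness Gate condition the tool actually manages.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical condition type; the real name is not given in this README.
CONDITION_TYPE = "example.com/endpoint-healthy"


def choose_probe(ports):
    """Pods that declare ports get a TCP probe, all others an ICMP probe."""
    return "tcp" if ports else "icmp"


def check_pod(pod, probe_ok):
    """Probe one Pod and build the readiness condition to report for it."""
    kind = choose_probe(pod["ports"])
    healthy = probe_ok(pod, kind)  # probe_ok is assumed to include the retries
    status = "True" if healthy else "False"
    return pod["name"], {"type": CONDITION_TYPE, "status": status}


def check_all(pods, probe_ok, concurrency=10):
    """Check Pods in parallel with at most `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return dict(pool.map(lambda pod: check_pod(pod, probe_ok), pods))
```

The sketch only computes the condition for each Pod. Writing it back to the Pod status through the Kubernetes API is left out.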

Install the chart with Helm:

```bash
helm install endpoint-health-checker ./charts/endpoint-health-checker
```

You can customize leader election parameters in the Helm values:

```yaml
leaderElection:
  leaseName: "endpoint-health-checker-leader"
  leaseDuration: 4s
  renewDeadline: 2s
  retryPeriod: 500ms
```