Feature: add dynamic interval support. #652

Open
wants to merge 17 commits into
base: master
from

remmen-io

Dynamic Interval

Add a dynamic interval feature that automatically adjusts the time between pod terminations based on the number of candidate pods in your cluster. This helps ensure appropriate chaos levels in both small and large environments.

How it works

With dynamic interval enabled, chaoskube will calculate the interval between pod terminations using the following formula:

interval = totalWorkingMinutes / (podCount * factor)

Where:

  • totalWorkingMinutes = 10 days * 8 hours * 60 minutes = 4800 minutes (we assume that all pods should be killed within two work weeks)
  • podCount is the number of candidate pods in the cluster
  • factor is the configurable dynamic interval factor

The dynamic interval factor lets you control the aggressiveness of the terminations:

  • With factor = 1.0: Standard interval calculation
  • With factor > 1.0: More aggressive terminations (shorter intervals)
  • With factor < 1.0: Less aggressive terminations (longer intervals)

Example scenarios

  • Small cluster (100 pods, factor 1.0): interval = 48 minutes
  • Small cluster (100 pods, factor 1.5): interval = 32 minutes
  • Small cluster (100 pods, factor 2.0): interval = 24 minutes
  • Large cluster (1500 pods, factor 1.0): interval ≈ 3 minutes (4800 / 1500 = 3.2)


pods = filterByAnnotations(pods, c.Annotations)

podCount := len(pods)
Owner

The logic of finding all possible target pods (after filtering) should pretty much match the logic of Candidates() (https://github.com/linki/chaoskube/blob/master/chaoskube/chaoskube.go#L214). Let's try to re-use it.

Author

@remmen-io Jul 22, 2025

Candidates() filters out too much, which gives a misleading pod count for calculating the dynamic interval.
For instance, it applies filterByMinimumAge and filterByOwnerReference, which do not make sense for this calculation.

Therefore I rebuilt the list, applying only the filters that are relevant for calculating the interval.

pods = filterByOwnerReference(pods)
c.Logger.WithFields(log.Fields{
	"count": len(pods),
}).Debug("Final pod count after owner reference filtering")
Owner

We could think about moving the logging to the end of each filterBy* function instead of having it here.
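
A rough sketch of what that suggestion could look like, using simplified stand-ins: the `Pod` type, the `logFunc` hook, and the `filterByNamespaceSketch` function are hypothetical and are not chaoskube's real types or filters. The point is only that the filter reports its own resulting count, so the caller does not repeat a logging block after every filter call.

```go
package main

import "fmt"

// Pod is a simplified stand-in for the real pod type (assumption).
type Pod struct {
	Name      string
	Namespace string
}

// logFunc is a hypothetical logging hook standing in for the PR's c.Logger.
type logFunc func(filter string, count int)

// filterByNamespaceSketch is an illustrative filter, not a chaoskube function.
// It keeps pods in the given namespace and logs the resulting count at the
// end of the function itself.
func filterByNamespaceSketch(pods []Pod, namespace string, logCount logFunc) []Pod {
	kept := make([]Pod, 0, len(pods))
	for _, p := range pods {
		if p.Namespace == namespace {
			kept = append(kept, p)
		}
	}
	logCount("namespace", len(kept))
	return kept
}

func main() {
	pods := []Pod{
		{Name: "a", Namespace: "default"},
		{Name: "b", Namespace: "kube-system"},
		{Name: "c", Namespace: "default"},
	}
	kept := filterByNamespaceSketch(pods, "default", func(filter string, count int) {
		fmt.Printf("pod count after %s filtering: %d\n", filter, count)
	})
	fmt.Println("kept:", len(kept))
}
```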

2 participants