Description
Per title, this issue/RFE is simply for our team to consider the potential for fully removing the embedded helm-controller.
Summary
The primary objective of this effort would be to decouple the helm-controller from the PromFed binary. This directly reduces the complexity of delivering a PromFed that is compatible with all target distros (rke2/k3s, cloud, etc) and varying k8s versions.
The PromFed project and chart would still depend on helm-controller, but instead of building it into the binary they would consume the original image. This will de-duplicate redundant efforts across projects and helm-controller systems within a cluster.
Finally, once the version built into the PromFed binary is ready for removal, it will simplify the PromFed project further by allowing us to remove the complicated helm-controller CRD management we've needed to fix other bugs on k3s/rke2 systems.
Solution
Remove helm-controller as a godep for PromFed, make it a sibling container instead
This option entails keeping the existing functionality we surface to PromFed users today, with a different implementation at the k8s level. Instead of a true "embedded" controller in the PromFed binary, it would become a new helm-controller pod or deployment. The goal would be a compatible end-user experience with new opt-in helm fields to use the new mechanism.
The end result is still "embedded" - in the experience of using the chart - but not statically at compile time.
While this option has some considerable cons (see original issue text), the pros greatly outweigh those risks. We can mitigate most of them with good documentation: formal docs, release notes and helm-chart docs (in values.yaml).
Background
3 Simple Supported Config Setups:
- A) PromFed is the only possible `helm-controller` on Cluster:
  - The `helm-controller` w/ PromFed must be used (will now be a pod).
- B) PromFed is on Cluster with existing Global `helm-controller`:
  - Cannot have PromFed's `helm-controller` enabled.
  - Based on talking with @brandond, mixing global and namespace scoped `helm-controller` instances is not fully supported.
    - It may work but also may have weirdness - if it needs to be supported Team ORBS can work in k3s upstream for `helm-controller` to add support properly.
- C) PromFed is on a Cluster with Namespace scoped controller instances:
  - Users can either: a) deploy one for PromFed like their existing `helm-controller` instances, or b) use the PromFed one, setting the `values.yaml` to match the `helm-controller` version.
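The three setups above can be read as a small decision table. Below is a minimal sketch of that table in Python; the names `Setup`, `Recommendation` and `recommend` are hypothetical and exist only for illustration, they are not part of PromFed or `helm-controller`. It assumes the operator already knows whether a global and/or namespace scoped `helm-controller` is running on the cluster.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Setup(Enum):
    A = "A"  # PromFed's helm-controller is the only one on the cluster
    B = "B"  # a global helm-controller already exists
    C = "C"  # namespace scoped helm-controller instances already exist


@dataclass(frozen=True)
class Recommendation:
    setup: Setup
    # True/False = must be enabled/disabled, None = operator's choice
    enable_promfed_controller: Optional[bool]
    note: str


def recommend(has_global: bool, has_namespaced: bool) -> Recommendation:
    """Map what already runs on the cluster to setup A, B or C (illustrative only)."""
    if has_global and has_namespaced:
        # Mixing both modes is described as not fully supported, so refuse to guess.
        raise ValueError("cluster mixes global and namespace scoped helm-controllers")
    if has_global:
        return Recommendation(Setup.B, False, "rely on the existing global controller")
    if has_namespaced:
        return Recommendation(
            Setup.C, None,
            "deploy a separate instance for PromFed, or enable PromFed's and match versions",
        )
    return Recommendation(Setup.A, True, "PromFed's controller pod must be used")
```

For example, `recommend(True, False)` yields setup B with PromFed's controller disabled, which is the documented route for k3s/RKE2 clusters that ship a global controller.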
Important Context:
- `helm-controller` supports either: a single global controller watching every NS, or many NS scoped instances.
- `helm-controller` doesn't support a mixture of both of these modes in the same cluster.
- Today's PromFed integration for `helm-controller` leads to conflicts easily even on a single Rancher Minor version, because:
  - The existing PromFed embedded `helm-controller` is always locked at a single version (at build time).
  - k3s/RKE2 will update `helm-controller` versions on patch releases.
  - This can lead to CRD version conflicts and other weird issues.
- The existing PromFed embedded `helm-controller` both:
  - Installs under a new name, and
  - Installs under a namespace, so becomes namespace scoped.
- Technically, the `ManagedBy` annotation that `helm-controller` supports does allow multiple controller instances to have overlapping namespace scopes w/o conflict via alternative names.
  - In other words, mixing global and namespaced ones works with specific steps taken - as that's partially how PromFed can co-exist with global ones with proper configs.
- The lease mechanism will allow both: global and/or many ns-scoped instances to hold a lease.
  - Using both together isn't officially supported, as mentioned above, but it is technically possible as long as each NS specific instance has a custom `ControllerName` set to differentiate `ManagedBy` and ensure the global instance won't touch the ones for NS-scoped controllers.
  - The current `helm-controller` lease mechanism will not allow locks for: multiple global instances, or multiple ns-scoped instances within the same ns. (In this case, a second Global instance lock will overlap with the first, and similarly any additional ns-scoped instance in the same NS overlaps.)
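To make the lease and naming rules above easier to reason about, here is a rough model of them in Python. It is only a sketch of the behaviour as described in this issue, not of the actual `helm-controller` implementation: `Instance`, `lease_conflicts` and `name_collisions` are hypothetical names, and a namespace of `None` is an assumed way to represent a global instance.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instance:
    controller_name: str      # stand-in for the ControllerName setting
    namespace: Optional[str]  # None models a global instance (assumption)


def lease_conflicts(instances: List[Instance]) -> List[str]:
    """One message per scope where more than one instance contends for the lock."""
    scopes = Counter(i.namespace for i in instances)
    problems = []
    for scope, count in scopes.items():
        if count > 1:
            label = "global" if scope is None else f"namespace {scope!r}"
            problems.append(f"{count} instances contend for the {label} lease")
    return problems


def name_collisions(instances: List[Instance]) -> List[str]:
    """Controller names used by more than one instance, so ManagedBy cannot tell them apart."""
    names = Counter(i.controller_name for i in instances)
    return sorted(name for name, count in names.items() if count > 1)
```

Under this model, one global instance plus one ns-scoped instance with distinct names reports no problems, while two global instances, or two ns-scoped instances in the same namespace, each report a lease conflict.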
The 1 Secret Unsupported Config:
This option is specific to k3s/RKE2 clusters only. We will call it B.2 - because it's option B, but we keep the new "external but embedded" helm-controller enabled.
It is not officially supported, or considered supported, by the upstream k3s helm-controller project. As such it may not need to be tested, but because it is possible I wanted to document it here. The configuration is rather easy but does take an additional step and extra attention to detail - this is another reason why the documented option B is preferred to this route.
How?
- Identify the version of k3s/RKE2 in use,
- Pull up the release notes for that k8s minor version of k3s/RKE2,
- Find the specific version in the table and make note of the `helm-controller` column's value,
- During PromFed install set the `.Values.helmController.deployment.image.tag` value to match,
- Also follow other "normal testing steps" for this new mode,
- Observe the new PromFed deployment/pod for `helm-controller` that should use the version you set.
Example: Assume we are using RKE2 v1.33.5+rke2r1. We would look at the Release Notes and find that this version uses helm-controller v0.16.13. That is the version we would set for `.Values.helmController.deployment.image.tag`. In the future, if upgrading the k8s version of the cluster changes the distro's built-in controller version, then the PromFed helm release should be updated to match.
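The lookup-and-set step could be scripted roughly as follows. This is a sketch under assumptions: the table `HELM_CONTROLLER_BY_DISTRO_VERSION` and the function `helm_set_args` are hypothetical, and the table holds only the single pairing from the example above. Real entries would have to be copied from the k3s/RKE2 release notes.

```python
# Hypothetical lookup table. The only entry is the example from this issue;
# real values must be taken from the k3s/RKE2 release notes.
HELM_CONTROLLER_BY_DISTRO_VERSION = {
    "v1.33.5+rke2r1": "v0.16.13",
}

# Values key named in this issue, without the leading ".Values." prefix,
# which is how keys are written for helm's --set flag.
VALUES_KEY = "helmController.deployment.image.tag"


def helm_set_args(distro_version: str) -> list:
    """Build the --set arguments that pin PromFed's helm-controller image tag."""
    try:
        tag = HELM_CONTROLLER_BY_DISTRO_VERSION[distro_version]
    except KeyError:
        raise LookupError(
            f"no helm-controller version recorded for {distro_version}; "
            "check the release notes"
        ) from None
    return ["--set", f"{VALUES_KEY}={tag}"]
```

The returned list is meant to be appended to whatever helm install or upgrade command is used for the PromFed release. The release and chart names are deliberately left out because they depend on the installation.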