Contributor

@k0da k0da commented Aug 28, 2024

To bootstrap a cluster with Fleet (to deploy the CPI), we need gitjob to tolerate the uninitialized node taint.

Refers to #2783

Tolerating the uninitialized node taint helps with bootstrapping the CPI with Fleet. Without it, the jobs that Fleet produces never complete.
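As an illustration of what such a toleration looks like on a job's pod spec, here is a minimal Python sketch. It is not the code from this PR (the actual change lives in Fleet's Go sources, which are not shown in this conversation). The helper name `with_uninitialized_toleration`, the sample container, and the choice of `Equal` / `"true"` / `NoSchedule` are assumptions made for the example.

```python
import json

# Taint key quoted later in this conversation.
UNINITIALIZED_TAINT_KEY = "node.cloudprovider.kubernetes.io/uninitialized"


def with_uninitialized_toleration(pod_spec: dict) -> dict:
    """Return a copy of pod_spec that tolerates the uninitialized-node taint.

    Hypothetical helper for illustration only. The operator, value and
    effect below are assumptions about how the taint is set on the node.
    """
    toleration = {
        "key": UNINITIALIZED_TAINT_KEY,
        "operator": "Equal",
        "value": "true",
        "effect": "NoSchedule",
    }
    spec = dict(pod_spec)
    tolerations = list(spec.get("tolerations", []))
    # Avoid adding the same toleration twice.
    if toleration not in tolerations:
        tolerations.append(toleration)
    spec["tolerations"] = tolerations
    return spec


if __name__ == "__main__":
    # Placeholder pod spec; the container name and image are made up.
    job_pod_spec = {
        "restartPolicy": "Never",
        "containers": [{"name": "example", "image": "example/image"}],
    }
    print(json.dumps(with_uninitialized_toleration(job_pod_spec), indent=2))
```

The helper copies the spec rather than mutating it, so the input can be reused to compare the pod spec before and after the toleration is added.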

@k0da k0da requested a review from a team as a code owner August 28, 2024 10:16
@0xavi0
Contributor

0xavi0 commented Sep 2, 2024

Hi, thanks for your contribution!

We have a few questions about the proposed change.
For example: why do you need to allow scheduling on uninitialised nodes? The taint is intentional, so that nodes that are not fully initialised are not eligible, and we don't really understand why you need this.

Could you please explain a bit more?

@k0da
Contributor Author

k0da commented Sep 2, 2024

@0xavi0 Our use case:

We want to deploy the vSphere Cloud Provider with Fleet, along with other applications (cert-manager, ingress-nginx). Those fleet.yml files are tested and well known from a configuration point of view.

We're deploying RKE2 clusters with Cluster API and don't want to configure cloud provider credentials there (this will be handled by the Fleet stack). Cluster API deploys one control-plane node and one worker node for us, then waits for those nodes to be initialized from the cloud provider's point of view. At that point we just need to deploy Fleet and add GitRepos to it. We add tolerations for Fleet itself via Helm chart values, but rendering the manifests fails, or rather stays stuck in Pending, because the nodes are still uninitialized. Hence no CPI can be installed to proceed with provisioning.

node.cloudprovider.kubernetes.io/uninitialized: When the kubelet is started with an "external" cloud provider, this taint is set on a node to mark it as unusable. After a controller from the cloud-controller-manager initializes this node, the kubelet removes this taint.
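To make the scheduling consequence concrete, the following Python sketch models how a toleration is matched against a taint. It is a simplified model of the rules described in the Kubernetes taints and tolerations documentation, not the scheduler's actual code. The `Taint` and `Toleration` classes and the `tolerates` and `schedulable` functions are hypothetical stand-ins, and the taint value `"true"` with effect `NoSchedule` is an assumption about how the kubelet sets it.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key matches any taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches any effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    # Treat an empty operator like "Equal". This sketch does not
    # validate whether the toleration itself is well-formed.
    return tol.operator in ("", "Equal") and tol.value == taint.value


def schedulable(taints: Iterable[Taint], tolerations: Iterable[Toleration]) -> bool:
    """A pod can land on a node only if every hard taint is tolerated."""
    tolerations = list(tolerations)
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints
        if taint.effect in ("NoSchedule", "NoExecute")
    )


if __name__ == "__main__":
    key = "node.cloudprovider.kubernetes.io/uninitialized"
    node_taints = [Taint(key=key, effect="NoSchedule", value="true")]
    job_toleration = Toleration(key=key, value="true", effect="NoSchedule")
    print(schedulable(node_taints, []))
    print(schedulable(node_taints, [job_toleration]))
```

Under this model, a job pod without the toleration can never be placed on a node that still carries the taint, which matches the stuck-in-Pending behaviour described above.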

In most cases the CPI just fetches the node ID from the cloud and updates the v1.Node ProviderID field, so the node is already usable for running pods (fleet-controller and fleet-agent are already running).

Not sure that clarifies the ask.

@k0da
Contributor Author

k0da commented Sep 10, 2024

In order to bootstrap cluster with fleet (to deploy CPI), we need gitjob to tolerate uninitialized node taint

Signed-off-by: Dinar Valeev <k0da@opensuse.org>
@k0da k0da force-pushed the bootstrap_tolerations branch from 88f183c to 19bb04a Compare September 10, 2024 16:52
@manno manno merged commit bba9847 into rancher:main Sep 11, 2024
12 checks passed
@kkaempf kkaempf added this to the v2.10.0 milestone Oct 11, 2024
