This repository was archived by the owner on Jan 29, 2025. It is now read-only.
Add Cluster API deployment method for TAS #108
Open
criscola wants to merge 23 commits into intel:master from criscola:feature/cluster-api
Commits:
- b5846ec Add namespace to TAS Service Account (criscola)
- c890226 Add Cluster API deployment method (criscola)
- 4d7d6df Adding code_of_conduct and contributing readme file (madalazar)
- d6f904b Merge branch 'master' into feature/cluster-api (criscola)
- 050de7f Merge branch 'master' into feature/cluster-api (criscola)
- 19026c4 Add Docker CAPI deployment specific guide (criscola)
- f44598a Add ClusterResourceSets for CAPD deployment (criscola)
- 6be7648 Move CRS to 'shared' folder. (criscola)
- fd030d4 Update link to Health Metric Example. (criscola)
- 3badb05 Rename your-manifests.yaml to capi-quickstart.yaml (criscola)
- 57ff014 Fix numbering in markdown. (criscola)
- 1bb8999 Add yaml newlines. (criscola)
- e41d190 Add testing/development notice in all markdowns. (criscola)
- fb752e1 Move generic/docker provider links to top. (criscola)
- eaf3e7c Add Docker and Kind versions. (criscola)
- 2fe40df Add small comment after clusterctl generate. (criscola)
- 1365e59 Add necessary feature flags. (criscola)
- d3cd12c Update paths of commands referencing the Helm chart. (criscola)
- 0891f21 Add yq commands to wrangle with the various resources with the comman… (criscola)
- 2d08d1e Reformat docs. (criscola)
- 299571e Add a few more links to files/folders. (criscola)
- ed3d300 Add note on how to initialize Kind cluster in Docker provider. (criscola)
- 8f98dd6 More adjustments. (criscola)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# Cluster API deployment | ||
|
||
**This guide is meant for local testing/development only; it is not meant for production usage.** | ||
|
||
## Introduction | ||
|
||
Cluster API is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters. [Learn more](https://cluster-api.sigs.k8s.io/introduction.html). | ||
|
||
This folder contains an automated and declarative way of deploying the Telemetry Aware Scheduler using Cluster API. We will make use of the [ClusterResourceSet feature](https://cluster-api.sigs.k8s.io/tasks/experimental-features/cluster-resource-set.html) to automatically apply a set of resources. Note you must enable its feature gate before running `clusterctl init` (with `export EXP_CLUSTER_RESOURCE_SET=true`). | ||
|
||
## Guides | ||
|
||
- [Cluster API deployment - Docker provider (for local testing/development only)](docker/capi-docker.md) | ||
- [Cluster API deployment - Generic provider](generic/capi.md) | ||
|
||
## Testing | ||
|
||
You can test if the scheduler actually works by following this guide: | ||
[Health Metric Example](https://github.com/intel/platform-aware-scheduling/blob/master/telemetry-aware-scheduling/docs/health-metric-example.md) |
224 changes: 224 additions & 0 deletions
telemetry-aware-scheduling/deploy/cluster-api/docker/capi-docker.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,224 @@ | ||
# Cluster API deployment - Docker provider (for local testing/development only) | ||
|
||
**This guide is meant for local testing/development only; it is not meant for production usage.** | ||
|
||
For the deployment using a generic provider, please refer to [Cluster API deployment - Generic provider](../generic/capi.md). | ||
|
||
## Requirements | ||
|
||
- Kubernetes v1.22 or greater (tested on Kubernetes v1.25) | ||
- Docker (tested on Docker version 20.10.22) | ||
- Kind (tested on Kind version 0.17.0) | ||
|
||
## Provision clusters with TAS installed using Cluster API | ||
|
||
We will provision a KinD cluster with the TAS installed using Cluster API. | ||
|
||
1. Run the following to write the Kind cluster configuration for CAPD, which mounts the host's Docker socket into the control-plane node: | ||
|
||
```bash | ||
cat > kind-cluster-with-extramounts.yaml <<EOF | ||
kind: Cluster | ||
apiVersion: kind.x-k8s.io/v1alpha4 | ||
nodes: | ||
- role: control-plane | ||
  extraMounts: | ||
    - hostPath: /var/run/docker.sock | ||
      containerPath: /var/run/docker.sock | ||
EOF | ||
``` | ||
|
||
2. Enable the `CLUSTER_TOPOLOGY` and `EXP_CLUSTER_RESOURCE_SET` feature gates: | ||
|
||
```bash | ||
export CLUSTER_TOPOLOGY=true | ||
export EXP_CLUSTER_RESOURCE_SET=true | ||
``` | ||
|
||
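Since the feature gates must be exported before `clusterctl init` runs, it can help to confirm that both variables are set in the current shell first. The following is a minimal sketch; the function name `check_feature_gates` is hypothetical and not part of this PR.

```bash
# Hypothetical helper (not part of this PR): fail early if a feature-gate
# variable is missing or not set to "true" in the current shell.
check_feature_gates() {
  status=0
  for name in CLUSTER_TOPOLOGY EXP_CLUSTER_RESOURCE_SET; do
    # Indirect variable lookup that works in POSIX sh as well as bash.
    eval "value=\${$name:-}"
    if [ "$value" != "true" ]; then
      echo "$name is not set to true" >&2
      status=1
    fi
  done
  return "$status"
}
```

One possible use is to chain it in front of the init command from the next step: `check_feature_gates && clusterctl init --infrastructure docker`.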
3. Initialize the management cluster: | ||
|
||
To start the Kind cluster, run the following command (see also the [Cluster API Quickstart](https://cluster-api.sigs.k8s.io/user/quick-start.html)): | ||
|
||
```bash | ||
kind create cluster --config kind-cluster-with-extramounts.yaml | ||
``` | ||
|
||
Then, initialize the Docker provider: | ||
|
||
```bash | ||
clusterctl init --infrastructure docker | ||
``` | ||
|
||
Run the following to generate the default cluster manifests: | ||
|
||
|
||
```bash | ||
clusterctl generate cluster capi-quickstart --flavor development \ | ||
--kubernetes-version v1.25.0 \ | ||
--control-plane-machine-count=1 \ | ||
--worker-machine-count=3 \ | ||
> capi-quickstart.yaml | ||
|
||
``` | ||
|
||
|
||
If Kind is running and the Docker provider was initialized by the previous command, `clusterctl generate` prints nothing to the terminal on success; the manifests are written to `capi-quickstart.yaml`. | ||
|
||
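Because a successful `clusterctl generate` is silent, one way to confirm that the redirect produced usable output is to check that the file declares the resource kinds patched in the next step. This is a minimal sketch that assumes only `grep`; the helper name `has_kinds` is hypothetical.

```bash
# Hypothetical helper: succeed only if FILE declares every KIND given.
# Usage: has_kinds FILE KIND [KIND...]
has_kinds() {
  file=$1
  shift
  for kind in "$@"; do
    # Match a top-level "kind: <Kind>" line exactly, so that "Cluster"
    # does not match "ClusterClass".
    if ! grep -q "^kind: ${kind}\$" "$file"; then
      echo "missing kind: $kind" >&2
      return 1
    fi
  done
}
```

For this guide, the call would be `has_kinds capi-quickstart.yaml Cluster ClusterClass KubeadmControlPlaneTemplate`.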
4. Merge the contents of the resources provided in [../shared/cluster-patch.yaml](../shared/cluster-patch.yaml), [kubeadmcontrolplanetemplate-patch.yaml](kubeadmcontrolplanetemplate-patch.yaml) and [clusterclass-patch.yaml](clusterclass-patch.yaml) with | ||
the resources contained in your newly generated `capi-quickstart.yaml`. | ||
|
||
The new config will: | ||
- Configure TLS certificates for the extender | ||
- Change the `dnsPolicy` of the scheduler to `ClusterFirstWithHostNet` | ||
- Place `KubeSchedulerConfiguration` into control plane nodes and pass the corresponding CLI flag to the scheduler. | ||
- Change the behavior of the pre-existing patch application of `/spec/template/spec/kubeadmConfigSpec/files` in `ClusterClass` | ||
such that our new patch is not ignored/overwritten. For some more clarification on this, see [this issue](https://github.com/kubernetes-sigs/cluster-api/pull/7630). | ||
- Add the necessary labels for ClusterResourceSet to take effect in the workload cluster. | ||
|
||
Therefore, we will: | ||
- Merge the contents of file [kubeadmcontrolplanetemplate-patch.yaml](kubeadmcontrolplanetemplate-patch.yaml) into the KubeadmControlPlaneTemplate resource of capi-quickstart.yaml. | ||
- Replace entirely the KubeadmControlPlaneTemplate patch item with `path` `/spec/template/spec/kubeadmConfigSpec/files` with the item present in file `clusterclass-patch.yaml`. | ||
- Add the necessary labels to the Cluster resource of `capi-quickstart.yaml`. | ||
|
||
To do this, we provide some quick `yq` commands to automate the process, but you can also merge the files manually. | ||
|
||
Patch the KubeadmControlPlaneTemplate resource by merging the contents of [kubeadmcontrolplanetemplate-patch.yaml](kubeadmcontrolplanetemplate-patch.yaml) with the one contained in `capi-quickstart.yaml`: | ||
```bash | ||
# Extract KubeadmControlPlaneTemplate | ||
yq e '. | select(.kind == "KubeadmControlPlaneTemplate")' capi-quickstart.yaml > kubeadmcontrolplanetemplate.yaml | ||
# Merge patch | ||
yq eval-all '. as $item ireduce ({}; . *+ $item)' kubeadmcontrolplanetemplate.yaml kubeadmcontrolplanetemplate-patch.yaml > final-kubeadmcontrolplanetemplate.yaml | ||
# Replace the original KubeadmControlPlaneTemplate with the patched one | ||
export KCPT_FINAL=$(<final-kubeadmcontrolplanetemplate.yaml) | ||
yq -i '. | select(.kind == "KubeadmControlPlaneTemplate") = env(KCPT_FINAL)' capi-quickstart.yaml | ||
``` | ||
|
||
Modify the ClusterClass patches to allow our patch to be applied: | ||
|
||
```bash | ||
# Extract ClusterClass | ||
yq e '. | select(.kind == "ClusterClass")' capi-quickstart.yaml > clusterclass.yaml | ||
export CC_PATCH=$(<clusterclass-patch.yaml) | ||
# Replace the original ClusterClass patch with the new one | ||
yq '(.spec.patches[].definitions[].jsonPatches[] | select(.path == "/spec/template/spec/kubeadmConfigSpec/files")) = env(CC_PATCH)' clusterclass.yaml > final-clusterclass.yaml | ||
# Replace the ClusterClass in capi-quickstart.yaml with the new one | ||
export CC_FINAL=$(<final-clusterclass.yaml) | ||
yq -i '. | select(.kind == "ClusterClass") = env(CC_FINAL)' capi-quickstart.yaml | ||
``` | ||
|
||
Add the necessary labels to the Cluster resource: | ||
|
||
```bash | ||
# Extract Cluster | ||
yq e '. | select(.kind == "Cluster")' capi-quickstart.yaml > cluster.yaml | ||
yq eval-all '. as $item ireduce ({}; . *+ $item)' cluster.yaml ../shared/cluster-patch.yaml > final-cluster.yaml | ||
export C_FINAL=$(<final-cluster.yaml) | ||
yq -i '. | select(.kind == "Cluster") = env(C_FINAL)' capi-quickstart.yaml | ||
``` | ||
|
||
You should end up with something like [this](sample-capi-manifests.yaml). | ||
|
||
5. You will need to prepare the Helm Charts of the various components and join the TAS manifests together for convenience: | ||
|
||
First, under [telemetry-aware-scheduling/deploy/charts](../../../deploy/charts) tweak the charts if you need (e.g. | ||
additional metric scraping configurations), then render the charts: | ||
|
||
```bash | ||
helm template ../../charts/prometheus_node_exporter_helm_chart/ > prometheus-node-exporter.yaml | ||
helm template ../../charts/prometheus_helm_chart/ > prometheus.yaml | ||
helm template ../../charts/prometheus_custom_metrics_helm_chart > prometheus-custom-metrics.yaml | ||
``` | ||
|
||
You need to add Namespace resources, otherwise applying the resources will fail. Prepend the following to `prometheus.yaml`: | ||
|
||
```yaml | ||
kind: Namespace | ||
apiVersion: v1 | ||
metadata: | ||
  name: monitoring | ||
  labels: | ||
    name: monitoring | ||
``` | ||
|
||
Prepend the following to `prometheus-custom-metrics.yaml`: | ||
```yaml | ||
kind: Namespace | ||
apiVersion: v1 | ||
metadata: | ||
  name: custom-metrics | ||
  labels: | ||
    name: custom-metrics | ||
``` | ||
|
||
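If you prefer not to edit the rendered files by hand, the prepend step can be scripted. The following is a minimal sketch for a POSIX shell; the helper name `prepend_namespace` is hypothetical. It writes the Namespace document, then a `---` separator, then the original file contents.

```bash
# Hypothetical helper: prepend a Namespace manifest named NAME to FILE.
# Usage: prepend_namespace NAME FILE
prepend_namespace() {
  name=$1
  file=$2
  tmp=$(mktemp) || return 1
  {
    printf 'kind: Namespace\napiVersion: v1\nmetadata:\n'
    printf '  name: %s\n  labels:\n    name: %s\n' "$name" "$name"
    # Separate the Namespace from the YAML documents that follow.
    printf '%s\n' '---'
    cat "$file"
  } > "$tmp" && mv "$tmp" "$file"
}
```

For the two files above, that would be `prepend_namespace monitoring prometheus.yaml` and `prepend_namespace custom-metrics prometheus-custom-metrics.yaml`.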
The custom metrics adapter and the TAS deployment require TLS to be configured with a certificate and key. | ||
Information on how to generate correctly signed certs in Kubernetes can be found [here](https://github.com/kubernetes-sigs/apiserver-builder-alpha/blob/master/docs/concepts/auth.md). | ||
Files `serving-ca.crt` and `serving-ca.key` should be in the current working directory. | ||
|
||
Run the following: | ||
|
||
```bash | ||
kubectl -n custom-metrics create secret tls cm-adapter-serving-certs --cert=serving-ca.crt --key=serving-ca.key -oyaml --dry-run=client > custom-metrics-tls-secret.yaml | ||
kubectl -n default create secret tls extender-secret --cert=serving-ca.crt --key=serving-ca.key -oyaml --dry-run=client > tas-tls-secret.yaml | ||
``` | ||
|
||
**Attention: Don't commit the TLS certificate and private key to any Git repo, as it is considered bad security practice! Make sure to wipe them off your workstation after applying the corresponding Secrets to your cluster.** | ||
|
||
You also need the TAS manifests (Deployment, Policy CRD and RBAC accounts) and the extender's "configmapgetter" | ||
ClusterRole. We will join the TAS manifests together, so we can have a single ConfigMap for convenience: | ||
|
||
```bash | ||
yq '.' ../../tas-*.yaml > tas.yaml | ||
``` | ||
|
||
6. Create and apply the ConfigMaps | ||
|
||
```bash | ||
kubectl create configmap custom-metrics-tls-secret-configmap --from-file=./custom-metrics-tls-secret.yaml -o yaml --dry-run=client > custom-metrics-tls-secret-configmap.yaml | ||
kubectl create configmap custom-metrics-configmap --from-file=./prometheus-custom-metrics.yaml -o yaml --dry-run=client > custom-metrics-configmap.yaml | ||
kubectl create configmap prometheus-configmap --from-file=./prometheus.yaml -o yaml --dry-run=client > prometheus-configmap.yaml | ||
kubectl create configmap prometheus-node-exporter-configmap --from-file=./prometheus-node-exporter.yaml -o yaml --dry-run=client > prometheus-node-exporter-configmap.yaml | ||
kubectl create configmap tas-configmap --from-file=./tas.yaml -o yaml --dry-run=client > tas-configmap.yaml | ||
kubectl create configmap tas-tls-secret-configmap --from-file=./tas-tls-secret.yaml -o yaml --dry-run=client > tas-tls-secret-configmap.yaml | ||
kubectl create configmap extender-configmap --from-file=../../extender-configuration/configmap-getter.yaml -o yaml --dry-run=client > extender-configmap.yaml | ||
kubectl create configmap calico-configmap --from-file=../shared/calico-configmap.yaml -o yaml --dry-run=client > calico-configmap.yaml | ||
``` | ||
|
||
Apply to the management cluster: | ||
|
||
```bash | ||
kubectl apply -f '*-configmap.yaml' | ||
``` | ||
|
||
7. Apply the ClusterResourceSets | ||
|
||
The ClusterResourceSet resources are provided in [../shared/clusterresourcesets.yaml](../shared/clusterresourcesets.yaml). | ||
Apply them to the management cluster with `kubectl apply -f ../shared/clusterresourcesets.yaml`. | ||
|
||
8. Apply the cluster manifests | ||
|
||
Finally, you can apply your manifests: | ||
|
||
```bash | ||
kubectl apply -f capi-quickstart.yaml | ||
``` | ||
|
||
Wait until the cluster is fully initialized. You can use the following command to check its status (it should take a few minutes). | ||
Note that both `INITIALIZED` and `API SERVER AVAILABLE` should be set to true: | ||
|
||
```bash | ||
watch -n 1 kubectl get kubeadmcontrolplane | ||
``` | ||
|
||
The Telemetry Aware Scheduler will be running on your new cluster. | ||
|
||
You can connect to the workload cluster by exporting its kubeconfig: | ||
|
||
```bash | ||
clusterctl get kubeconfig capi-quickstart > capi-quickstart.kubeconfig | ||
``` | ||
|
||
Then, specifically for the CAPD provider, point the kubeconfig to the correct address of the HAProxy container: | ||
|
||
```bash | ||
sed -i -e "s/server:.*/server: https:\/\/$(docker port capi-quickstart-lb 6443/tcp | sed "s/0.0.0.0/127.0.0.1/")/g" ./capi-quickstart.kubeconfig | ||
``` | ||
|
||
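The `sed` one-liner above assumes `docker port` prints a single `0.0.0.0:<port>` line. On hosts where it may print more than one mapping (for example an additional IPv6 line), the substitution can produce a malformed `server:` entry. The following is a minimal sketch of a more defensive variant; the helper name `lb_server_url` is hypothetical.

```bash
# Hypothetical helper: read `docker port` output on stdin and print the
# loopback URL for the first IPv4 (0.0.0.0) mapping only.
lb_server_url() {
  sed -n 's|^0\.0\.0\.0:\([0-9][0-9]*\)$|https://127.0.0.1:\1|p' | head -n 1
}

# Possible usage with the names from this guide (left commented out here,
# since it needs a running capi-quickstart-lb container):
# server=$(docker port capi-quickstart-lb 6443/tcp | lb_server_url)
# sed -i -e "s|server:.*|server: ${server}|" ./capi-quickstart.kubeconfig
```

Keeping the parsing in a small function makes it possible to check the result before the kubeconfig is modified in place.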
You can test if the scheduler actually works by following this guide: | ||
[Health Metric Example](https://github.com/intel/platform-aware-scheduling/blob/master/telemetry-aware-scheduling/docs/health-metric-example.md) |
6 changes: 6 additions & 0 deletions
telemetry-aware-scheduling/deploy/cluster-api/docker/cluster-patch.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
apiVersion: cluster.x-k8s.io/v1beta1 | ||
kind: Cluster | ||
metadata: | ||
  labels: | ||
    scheduler: tas | ||
    cni: calico |
24 changes: 24 additions & 0 deletions
telemetry-aware-scheduling/deploy/cluster-api/docker/clusterclass-patch.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
op: add | ||
path: /spec/template/spec/kubeadmConfigSpec/files/- | ||
valueFrom: | ||
  template: | | ||
    content: | | ||
      apiVersion: apiserver.config.k8s.io/v1 | ||
      kind: AdmissionConfiguration | ||
      plugins: | ||
      - name: PodSecurity | ||
        configuration: | ||
          apiVersion: pod-security.admission.config.k8s.io/v1beta1 | ||
          kind: PodSecurityConfiguration | ||
          defaults: | ||
            enforce: "{{ .podSecurityStandard.enforce }}" | ||
            enforce-version: "latest" | ||
            audit: "{{ .podSecurityStandard.audit }}" | ||
            audit-version: "latest" | ||
            warn: "{{ .podSecurityStandard.warn }}" | ||
            warn-version: "latest" | ||
          exemptions: | ||
            usernames: [] | ||
            runtimeClasses: [] | ||
            namespaces: [kube-system] | ||
    path: /etc/kubernetes/kube-apiserver-admission-pss.yaml |
95 changes: 95 additions & 0 deletions
telemetry-aware-scheduling/deploy/cluster-api/docker/clusterresourcesets.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
apiVersion: addons.cluster.x-k8s.io/v1alpha3 | ||
kind: ClusterResourceSet | ||
metadata: | ||
  name: prometheus | ||
spec: | ||
  clusterSelector: | ||
    matchLabels: | ||
      scheduler: tas | ||
  resources: | ||
    - kind: ConfigMap | ||
      name: prometheus-configmap | ||
--- | ||
apiVersion: addons.cluster.x-k8s.io/v1alpha3 | ||
kind: ClusterResourceSet | ||
metadata: | ||
  name: prometheus-node-exporter | ||
spec: | ||
  clusterSelector: | ||
    matchLabels: | ||
      scheduler: tas | ||
  resources: | ||
    - kind: ConfigMap | ||
      name: prometheus-node-exporter-configmap | ||
--- | ||
apiVersion: addons.cluster.x-k8s.io/v1alpha3 | ||
kind: ClusterResourceSet | ||
metadata: | ||
  name: custom-metrics | ||
spec: | ||
  clusterSelector: | ||
    matchLabels: | ||
      scheduler: tas | ||
  resources: | ||
    - kind: ConfigMap | ||
      name: custom-metrics-configmap | ||
--- | ||
apiVersion: addons.cluster.x-k8s.io/v1alpha3 | ||
kind: ClusterResourceSet | ||
metadata: | ||
  name: custom-metrics-tls-secret | ||
spec: | ||
  clusterSelector: | ||
    matchLabels: | ||
      scheduler: tas | ||
  resources: | ||
    - kind: ConfigMap | ||
      name: custom-metrics-tls-secret-configmap | ||
--- | ||
apiVersion: addons.cluster.x-k8s.io/v1alpha3 | ||
kind: ClusterResourceSet | ||
metadata: | ||
  name: tas | ||
spec: | ||
  clusterSelector: | ||
    matchLabels: | ||
      scheduler: tas | ||
  resources: | ||
    - kind: ConfigMap | ||
      name: tas-configmap | ||
--- | ||
apiVersion: addons.cluster.x-k8s.io/v1alpha3 | ||
kind: ClusterResourceSet | ||
metadata: | ||
  name: tas-tls-secret | ||
spec: | ||
  clusterSelector: | ||
    matchLabels: | ||
      scheduler: tas | ||
  resources: | ||
    - kind: ConfigMap | ||
      name: tas-tls-secret-configmap | ||
--- | ||
apiVersion: addons.cluster.x-k8s.io/v1alpha3 | ||
kind: ClusterResourceSet | ||
metadata: | ||
  name: extender | ||
spec: | ||
  clusterSelector: | ||
    matchLabels: | ||
      scheduler: tas | ||
  resources: | ||
    - kind: ConfigMap | ||
      name: extender-configmap | ||
--- | ||
apiVersion: addons.cluster.x-k8s.io/v1alpha3 | ||
kind: ClusterResourceSet | ||
metadata: | ||
  name: calico | ||
spec: | ||
  clusterSelector: | ||
    matchLabels: | ||
      cni: calico | ||
  resources: | ||
    - kind: ConfigMap | ||
      name: calico-configmap |