-
Notifications
You must be signed in to change notification settings - Fork 478
Helm Chart Templates For Katib #2553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: kunal-511 <yoyokvunal@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
cc: @juliusvonkohout |
Signed-off-by: kunal-511 <yoyokvunal@gmail.com>
@kunal-511: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
Signed-off-by: kunal-511 <yoyokvunal@gmail.com>
/retest |
@juliusvonkohout should this PR be linked to kubeflow/community#832 ? cc @kunal-511 |
@kunal-511 let us know once this PR is ready for review. It seems there's a hold + two checks failing. Additionally, can you add some description and comments on the PR, we have 58 files to review and it'll be great to have more comments on it to help us with the approval. Another thing I think it will be great if we can involve the Katib working group on this review. |
{{- end }} | ||
ports: | ||
- name: postgres | ||
containerPort: 5432 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my opinion is that ports should be configurable unless it's hard-coded somewhere in katib
- "/bin/bash" | ||
- "-c" | ||
- "mysqladmin ping -u root -p${MYSQL_ROOT_PASSWORD}" | ||
periodSeconds: 15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this configurations should be customizable it could depend on the cluster that it's running and resources available.
initialDelaySeconds - periodSeconds - failureThreshold
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Electronic-Waste to add thoughts
args: | ||
- --katib-config=/katib-config.yaml | ||
ports: | ||
- containerPort: 8443 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in my opinion ports should be added in the values files unless it's hardcoded somewhere in katib.
Signed-off-by: kunal-511 <yoyokvunal@gmail.com>
Signed-off-by: kunal-511 <yoyokvunal@gmail.com>
/ok-to-test |
Signed-off-by: kunal-511 <yoyokvunal@gmail.com>
Signed-off-by: kunal-511 <yoyokvunal@gmail.com>
The tree structure of the Kubeflow Katib Helm chart directory with explanations for each file and folder.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kunal-511 nice job! I've reviewed the standalone install including namespace and controller folder.
I'll continue with the rest during this week. If you can take a look at this comments it'll be great. thank you
cc @juliusvonkohout
labels: | ||
{{- include "katib.labels" . | nindent 4 }} | ||
katib.kubeflow.org/metrics-collector-injection: enabled | ||
{{- if .Values.podSecurityStandards.enforced }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this code, pod security Standards on katib namespace.yaml file. is this coming from the implementation from manifests?
Ensure we are validating any changes to the current katib manifests with the katib team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is Optional security hardening in helm charts which can be easily disable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kunal-511 great security practice. I'd like the katib team to be aware of this updates
cc @Electronic-Waste to approve.
- https://github.com/kubeflow/katib | ||
|
||
annotations: | ||
category: Machine Learning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate this category with the katib team
@@ -0,0 +1,123 @@ | |||
{{- if .Values.controller.enabled }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd keep the same file name as it's on the katib manifests, controller.yaml
{{- end }} | ||
annotations: | ||
prometheus.io/scrape: "true" | ||
prometheus.io/port: "{{ .Values.controller.ports.metrics }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find this value on the values-standale.yaml file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Values.yaml is the main file
{{- end }} | ||
containers: | ||
- name: katib-controller | ||
image: {{ include "katib.image" (dict "registry" .Values.global.imageRegistry "repository" .Values.controller.image.repository "tag" (.Values.controller.image.tag | default .Values.global.imageTag) "global" .Values.global) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we define a default value on the values file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already present in the values.yaml file
- containerPort: {{ .Values.controller.ports.healthz }} | ||
name: {{ .Values.controller.service.ports.healthz.name }} | ||
protocol: TCP | ||
readinessProbe: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd keep the probes here. I do not know of a reason of why this would change unless the implementation changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Electronic-Waste to provide thoughts
{{- toYaml . | nindent 12 }} | ||
{{- end }} | ||
{{- with .Values.controller.securityContext }} | ||
securityContext: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line duplicated and content is also in helpers. I recommend to keep security best practices on the deployment file.
args: | ||
{{- toYaml .Values.controller.command.args | nindent 12 }} | ||
ports: | ||
- containerPort: {{ .Values.controller.ports.webhook }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find the ports on the values standalone file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Values.yaml is the main file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I couldn't find it before. I see now. thanks.
spec: | ||
{{- include "katib.serviceType" (dict "type" .Values.controller.service.type "Values" .Values) | nindent 2 }} | ||
ports: | ||
- port: {{ .Values.controller.service.ports.webhook.port }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is in the values.yaml
@@ -0,0 +1,37 @@ | |||
{{- if .Values.controller.leaderElection.enabled }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems this is a new file. can you validate with the katib team? or let me know if I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Helm chart integrated this as a configurable option rather than requiring a separate installation profile. In the original manifests, you'd need to choose between:
katib-standalone (no leader election)
katib-leader-election (with leader election)
The Helm chart makes this toggleable through controller.leaderElection.enabled, providing more flexibility in a single chart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the clarification. I'd refer to the katib team on this.
@Electronic-Waste
spec: | ||
{{- include "katib.serviceType" (dict "type" .Values.controller.service.type "Values" .Values) | nindent 2 }} | ||
ports: | ||
- port: {{ .Values.controller.service.ports.webhook.port }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is in the values.yaml
can you help here ? |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like a new implementation. thanks!
@Electronic-Waste to review.
{{- end }} | ||
resources: | ||
requests: | ||
storage: {{ .Values.database.postgres.persistence.size | quote }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
values file shows 10Gi, original file on manifests is 3Gi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the ci values override all the main file values. So
In charts/katib/ci/values-postgres.yaml
(lines 35-37):
persistence:
enabled: true
size: 3Gi # ← This matches the manifests!
/ok-to-test |
great job @kunal-511 thank so much for working on this. There's a lot of work and thought process on understanding the current katib kustomize files, installation and helm . I appreciate the amazing work you have done. We need @Electronic-Waste to do a final review. @Electronic-Waste I left a few comments for you, please review ci in detail and there are good practices implemented from security constraints, PDB, network policies and service monitoring ensure you are ok with this implementation. |
What this PR does / why we need it:
Helm Templates For Katib
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Checklist: