Development notes ‐ operator‐sdk version

Scott Trent edited this page Jun 6, 2024 · 26 revisions

Random notes on developing for the SusQL Operator with a recent operator-sdk version

(ALWAYS UPDATING!!!!!!)

Changes to the main branch automatically rebuild and push the container image via a GitHub Action. Non-main branches and forked repos must be built by hand and pushed to a developer-specific location to avoid overwriting the official images.

Sample steps to build and push bundle and container images

Optionally set CONTAINER_TOOL to docker or podman before building. (The default is docker.)

export BUNDLE_IMG="REGISTRYURL/REPOSITORYNAME/susql-controller:v$(cat VERSION)"
export IMG="REGISTRYURL/REPOSITORYNAME/susql-controller"
export IMAGE_TAG_BASE=${IMG}
export CONTAINER_TOOL=podman
podman login
make all
make bundle-build bundle-push
make operator-build operator-push
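
The image variables above can also be derived from two values, which makes it harder to point a branch build at the wrong location by accident. The following is a minimal sketch, not part of the SusQL repository: image_base, bundle_image, and require_set are hypothetical helper names, and quay.io/alice is a made-up registry and namespace.

```shell
# Sketch only: helper names and the quay.io/alice values are illustrative.

# Print the controller image base, e.g. quay.io/alice/susql-controller
image_base() {
    local registry="$1" namespace="$2"
    printf '%s/%s/susql-controller\n' "$registry" "$namespace"
}

# Print the bundle image reference, tagged "v<version>" as in the notes
# (the notes read the version from the VERSION file in the repository root).
bundle_image() {
    local registry="$1" namespace="$2" version="$3"
    printf '%s:v%s\n' "$(image_base "$registry" "$namespace")" "$version"
}

# Fail with a message if a value the build relies on is empty.
# $1 = variable name (used only in the message), $2 = its current value
require_set() {
    if [ -z "$2" ]; then
        echo "missing required variable: $1" >&2
        return 1
    fi
}

# Possible usage before running the make targets:
#   export IMG="$(image_base quay.io alice)"
#   export IMAGE_TAG_BASE="$IMG"
#   export BUNDLE_IMG="$(bundle_image quay.io alice "$(cat VERSION)")"
#   require_set IMG "${IMG:-}" && require_set BUNDLE_IMG "${BUNDLE_IMG:-}" && make all
```

Checking the variables first means a missing value is reported before any image is built or pushed.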

Trivial early sanity testing

make test
make run

(Use Ctrl-C to terminate make run.)

Deploy to cluster

  • Log in to the cluster on the command line using the command from "Copy login command" in the upper-right corner of the OpenShift web console.
  • Remove any previously installed SusQL operator:
    operator-sdk cleanup susql-operator
  • Deploy the new bundle:
    operator-sdk run bundle ${BUNDLE_IMG}
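
The cleanup and deploy steps above can be wrapped in one function so that a missing BUNDLE_IMG is caught before anything is removed from the cluster. This is a sketch under assumptions: redeploy_susql is a hypothetical name, and OPERATOR_SDK is a hypothetical override variable (nothing in the SusQL Makefile reads it) that allows a dry run with OPERATOR_SDK=echo.

```shell
# Sketch only: redeploy_susql and OPERATOR_SDK are illustrative names.
redeploy_susql() {
    local sdk="${OPERATOR_SDK:-operator-sdk}"

    if [ -z "${BUNDLE_IMG:-}" ]; then
        echo "BUNDLE_IMG is not set; export it before deploying" >&2
        return 1
    fi

    # Assumption: cleanup may exit non-zero when no operator is installed yet;
    # that case is treated as harmless here.
    "$sdk" cleanup susql-operator || echo "cleanup reported an error, continuing" >&2

    "$sdk" run bundle "$BUNDLE_IMG"
}

# Dry run (prints the two commands' arguments instead of executing them):
#   OPERATOR_SDK=echo redeploy_susql
```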

Simple functional verification

cd susql-operator/test
oc create -f labelgroups.yaml
oc create -f training-job-1.yaml
oc create -f training-job-2.yaml

bash labelgroups.sh
sleep 10
bash labelgroups.sh

# remove test artifacts on completion
oc delete -f training-job-2.yaml
oc delete -f training-job-1.yaml
oc delete -f labelgroups.yaml
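
The steps above run labelgroups.sh twice with a 10-second pause, presumably so the two outputs can be compared by eye. A small helper can make that comparison explicit. This is a sketch only: changed_after is a hypothetical function, not part of the repository, and it makes no assumption about what labelgroups.sh prints beyond treating it as text.

```shell
# Sketch only: run a command twice, INTERVAL seconds apart, and report
# whether its output changed between the two runs.
changed_after() {
    local interval="$1"
    shift
    local first second
    first="$("$@")"
    sleep "$interval"
    second="$("$@")"

    # Show both outputs so the values can still be inspected manually.
    printf '%s\n--- after %ss ---\n%s\n' "$first" "$interval" "$second"

    if [ "$first" = "$second" ]; then
        echo "RESULT: unchanged"
        return 1
    fi
    echo "RESULT: changed"
}

# Possible usage from the test directory:
#   changed_after 10 bash labelgroups.sh
```

An unchanged result is only a hint that something is wrong; whether the values should differ after 10 seconds depends on the workload and on how often the metrics are sampled.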

Troubleshooting

  • Verify configuration displayed at install and run time
  • Double-check that Kepler is functioning (e.g., the expected output appears under OpenShift->Observe->Dashboards)
  • Try looking at OpenShift->Observe->Metrics searches such as:
    • kepler_container_joules_total
    • kepler_container_joules_total{container_namespace="default"}
  • Standard Kepler troubleshooting: https://sustainable-computing.io/usage/trouble_shooting/
  • Look at the SusQL controller pod log output. Depending on how the operator is installed, the pod may be in one of the following namespaces: susql-operator-system, openshift-operators, or default. For example:
oc project default
oc logs $( oc get pod | grep susql-operator | cut -f 1 -d" " )
  • Verify accessibility and contents of appropriate Prometheus databases.
  • The log level can be changed by editing zapcore.Level(-2) in cmd/main.go and rebuilding the container image. (Eventually, the log level will be configurable.)
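
Because the controller pod may be in any of the three namespaces listed above, the log lookup can be written to try each in turn rather than switching projects by hand. The following is a sketch, not a tested tool: susql_logs is a hypothetical name, OC is a hypothetical override so that kubectl or a stub can stand in for oc, and the pod is matched on the string susql-operator, as in the command above.

```shell
# Sketch only: susql_logs and OC are illustrative names.
susql_logs() {
    local oc="${OC:-oc}" ns pod
    for ns in susql-operator-system openshift-operators default; do
        # "-o name" prints one "pod/<name>" per line; keep the first match.
        pod="$("$oc" get pods -n "$ns" -o name 2>/dev/null | grep susql-operator | head -n 1)"
        if [ -n "$pod" ]; then
            echo "== $pod in namespace $ns ==" >&2
            "$oc" logs -n "$ns" "$pod"
            return
        fi
    done
    echo "no susql-operator pod found in the expected namespaces" >&2
    return 1
}

# Possible usage:
#   susql_logs | tail -n 50
```

The namespace banner goes to stderr so that the log text on stdout can be piped or saved unchanged.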