feat: Airflow Listener integration #604

Merged on May 7, 2025 (51 commits)

Conversation

@adwk67 adwk67 commented Apr 4, 2025

Description

fixes: #580

Jenkins tests 🟢
OpenShift tests 🟢 (the logging test passed when re-run in isolation)
Tests after https://github.com/stackabletech/decisions/issues/51 changes: 🟢 Jenkins

Definition of Done Checklist

  • Not all of these items are applicable to all PRs; the author should update this template to leave in only the boxes that are relevant
  • Please make sure all these things are done and tick the boxes
# Author
- [x] Changes are OpenShift compatible
- [x] CRD changes approved
- [x] CRD documentation for all fields, following the [style guide](https://docs.stackable.tech/home/nightly/contributor/docs/style-guide).
- [x] Helm chart can be installed and deployed operator works
- [x] Integration tests passed (for non trivial changes)
- [x] Changes need to be "offline" compatible
# Reviewer
- [x] Code contains useful comments
- [x] Code contains useful logging statements
- [x] (Integration-)Test cases added
- [x] Documentation added or updated. Follows the [style guide](https://docs.stackable.tech/home/nightly/contributor/docs/style-guide).
- [x] Changelog updated
- [x] Cargo.toml only contains references to git tags (not specific commits or branches)
# Acceptance
- [ ] Feature Tracker has been updated
- [ ] Proper release label has been added
- [ ] [Roadmap](https://github.com/orgs/stackabletech/projects/25/views/1) has been updated

@adwk67 adwk67 marked this pull request as ready for review April 10, 2025 13:12
@adwk67 adwk67 moved this to Development: Waiting for Review in Stackable Engineering Apr 10, 2025
@adwk67 adwk67 self-assigned this Apr 10, 2025
@maltesander maltesander self-requested a review April 16, 2025 07:53
@maltesander maltesander moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Apr 16, 2025

@maltesander maltesander left a comment

First batch, tests work fine, will do some more manual testing.

adwk67 commented Apr 28, 2025

Re-ran tests locally: 🟢

@adwk67 adwk67 requested review from maltesander and nightkr April 28, 2025 12:44
adwk67 and others added 3 commits April 29, 2025 13:03
Co-authored-by: Natalie Klestrup Röijezon <nat@nullable.se>
Co-authored-by: Natalie Klestrup Röijezon <nat@nullable.se>
Co-authored-by: Natalie Klestrup Röijezon <nat@nullable.se>
@sbernauer sbernauer commented

Thanks @adwk67 for addressing the changes from the ADR!

I didn't look at the code, as @maltesander and @nightkr already did, but I installed the operator and tried it out.
This time everything looks good to me. I would be happy to merge once the other reviewers are happy as well.

Small notes on what I checked:

Services now look good! They match the decision https://github.com/stackabletech/decisions/issues/51:

➜  airflow-operator git:(feat/integrate-listener-operator) ✗ k get svc | grep airflow
airflow-scheduler-default-metrics   ClusterIP   None            <none>        9102/TCP         3m1s
airflow-webserver-default           NodePort    10.96.33.78     <none>        8080:30139/TCP   3m1s
airflow-webserver-default-metrics   ClusterIP   None            <none>        9102/TCP         3m1s
airflow-worker-default-metrics      ClusterIP   None            <none>        9102/TCP         3m1s
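
The pattern visible in this listing could also be checked mechanically. The following is a minimal sketch, not part of the operator or its test suite: it parses the plain-text output shown above and flags any `-metrics` Service that is not a headless ClusterIP Service on port 9102. The names `Service`, `parse_services`, and `check_metrics_services` are hypothetical, and the expected shape is an assumption read off the listing itself rather than taken from the decision text.

```python
from typing import NamedTuple


class Service(NamedTuple):
    name: str
    type: str
    cluster_ip: str
    ports: str


def parse_services(listing: str) -> list[Service]:
    """Parse rows of `kubectl get svc` output (a header row is optional)."""
    services = []
    for line in listing.strip().splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] == "NAME":
            continue  # skip the header and anything that is not a full row
        # Column order: NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
        services.append(Service(fields[0], fields[1], fields[2], fields[4]))
    return services


def check_metrics_services(services: list[Service]) -> list[str]:
    """Return the names of metrics Services that deviate from the assumed shape."""
    # Assumption taken from the listing: headless ClusterIP, port 9102/TCP.
    expected = ("ClusterIP", "None", "9102/TCP")
    return [
        svc.name
        for svc in services
        if svc.name.endswith("-metrics")
        and (svc.type, svc.cluster_ip, svc.ports) != expected
    ]


LISTING = """
airflow-scheduler-default-metrics   ClusterIP   None            <none>        9102/TCP         3m1s
airflow-webserver-default           NodePort    10.96.33.78     <none>        8080:30139/TCP   3m1s
airflow-webserver-default-metrics   ClusterIP   None            <none>        9102/TCP         3m1s
airflow-worker-default-metrics      ClusterIP   None            <none>        9102/TCP         3m1s
"""

print(check_metrics_services(parse_services(LISTING)))  # prints []
```

The check deliberately ignores the non-metrics Services, since their type depends on how the cluster is configured.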

The stacklet list looks nice; the metrics are no longer shown (which is fine, I'd say):

➜  airflow-operator git:(feat/integrate-listener-operator) ✗ stackablectl stacklet list

┌─────────┬─────────┬───────────┬─────────────────────────────────────────────────┬─────────────────────────────────┐
│ PRODUCT ┆ NAME    ┆ NAMESPACE ┆ ENDPOINTS                                       ┆ CONDITIONS                      │
╞═════════╪═════════╪═══════════╪═════════════════════════════════════════════════╪═════════════════════════════════╡
│ airflow ┆ airflow ┆ default   ┆ webserver-default-http  http://172.18.0.2:30139 ┆ Available, Reconciling, Running │
└─────────┴─────────┴───────────┴─────────────────────────────────────────────────┴─────────────────────────────────┘

I can also sprinkle in some more rolegroups, looks good:

➜  airflow-operator git:(feat/integrate-listener-operator) ✗ k get svc | grep airflow
airflow-scheduler-default-metrics   ClusterIP   None            <none>        9102/TCP         7m55s
airflow-webserver-big               NodePort    10.96.43.47     <none>        8080:31748/TCP   15s
airflow-webserver-big-metrics       ClusterIP   None            <none>        9102/TCP         15s
airflow-webserver-default           NodePort    10.96.33.78     <none>        8080:30139/TCP   7m55s
airflow-webserver-default-metrics   ClusterIP   None            <none>        9102/TCP         7m55s
airflow-worker-default-metrics      ClusterIP   None            <none>        9102/TCP         7m55s
airflow-worker-with-gpu-metrics     ClusterIP   None            <none>        9102/TCP         15s
┌─────────┬─────────┬───────────┬─────────────────────────────────────────────────┬────────────────────────────────────────────┐
│ PRODUCT ┆ NAME    ┆ NAMESPACE ┆ ENDPOINTS                                       ┆ CONDITIONS                                 │
╞═════════╪═════════╪═══════════╪═════════════════════════════════════════════════╪════════════════════════════════════════════╡
│ airflow ┆ airflow ┆ default   ┆ webserver-big-http      http://172.18.0.2:31748 ┆ Unavailable: See [1], Reconciling, Running │
│         ┆         ┆           ┆ webserver-default-http  http://172.18.0.2:30139 ┆                                            │
└─────────┴─────────┴───────────┴─────────────────────────────────────────────────┴────────────────────────────────────────────┘

And the metrics services have the correct scrape label:

➜  airflow-operator git:(feat/integrate-listener-operator) ✗ k get svc -l prometheus.io/scrape=true
NAME                                TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
airflow-scheduler-default-metrics   ClusterIP   None         <none>        9102/TCP   17m
airflow-webserver-big-metrics       ClusterIP   None         <none>        9102/TCP   9m56s
airflow-webserver-default-metrics   ClusterIP   None         <none>        9102/TCP   17m
airflow-worker-default-metrics      ClusterIP   None         <none>        9102/TCP   17m
airflow-worker-with-gpu-metrics     ClusterIP   None         <none>        9102/TCP   9m56s
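
The same comparison can be expressed as a set difference: every `-metrics` Service in the full listing should also appear in the label-selected listing. This is a minimal sketch over the two outputs quoted in this comment; the helper names `names_from_listing` and `unlabelled_metrics_services` are hypothetical, and it only inspects the NAME column of the pasted text rather than querying the cluster.

```python
def names_from_listing(listing: str) -> set[str]:
    """Collect the NAME column, skipping a header row if present."""
    names = set()
    for line in listing.strip().splitlines():
        fields = line.split()
        if fields and fields[0] != "NAME":
            names.add(fields[0])
    return names


def unlabelled_metrics_services(all_listing: str, labelled_listing: str) -> set[str]:
    """Metrics Services present in the full listing but missing from the labelled one."""
    metrics = {n for n in names_from_listing(all_listing) if n.endswith("-metrics")}
    return metrics - names_from_listing(labelled_listing)


ALL_SERVICES = """
airflow-scheduler-default-metrics   ClusterIP   None            <none>        9102/TCP         7m55s
airflow-webserver-big               NodePort    10.96.43.47     <none>        8080:31748/TCP   15s
airflow-webserver-big-metrics       ClusterIP   None            <none>        9102/TCP         15s
airflow-webserver-default           NodePort    10.96.33.78     <none>        8080:30139/TCP   7m55s
airflow-webserver-default-metrics   ClusterIP   None            <none>        9102/TCP         7m55s
airflow-worker-default-metrics      ClusterIP   None            <none>        9102/TCP         7m55s
airflow-worker-with-gpu-metrics     ClusterIP   None            <none>        9102/TCP         15s
"""

LABELLED_SERVICES = """
NAME                                TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
airflow-scheduler-default-metrics   ClusterIP   None         <none>        9102/TCP   17m
airflow-webserver-big-metrics       ClusterIP   None         <none>        9102/TCP   9m56s
airflow-webserver-default-metrics   ClusterIP   None         <none>        9102/TCP   17m
airflow-worker-default-metrics      ClusterIP   None         <none>        9102/TCP   17m
airflow-worker-with-gpu-metrics     ClusterIP   None         <none>        9102/TCP   9m56s
"""

print(unlabelled_metrics_services(ALL_SERVICES, LABELLED_SERVICES))  # prints set()
```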

Afterwards I ran stackablectl stack in monitoring --skip-release, and all metrics show up in Prometheus :)

For example, the query

count by (pod) (
  {job="airflow"}
)

shows that all 8 Pods produce metrics.
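
For readers less familiar with PromQL, the query above selects every series whose `job` label is `airflow`, groups them by their `pod` label, and counts the series in each group, so the number of result rows equals the number of Pods exposing metrics. The sketch below imitates that grouping in plain Python. The series in `SERIES` (metric names, pod names) are invented purely for illustration and are not taken from the cluster in this PR; `select` and `count_by` are hypothetical helpers, not a Prometheus client API.

```python
from collections import Counter


def select(series: list[dict[str, str]], **matchers: str) -> list[dict[str, str]]:
    """Keep series whose labels equal all given matchers (a missing label counts as "")."""
    return [s for s in series if all(s.get(k, "") == v for k, v in matchers.items())]


def count_by(series: list[dict[str, str]], label: str) -> Counter:
    """Group series by one label and count the members of each group."""
    return Counter(s.get(label, "") for s in series)


# Hypothetical sample data, labelled as such: not real Airflow metrics.
SERIES = [
    {"__name__": "example_metric_a", "job": "airflow", "pod": "example-webserver-0"},
    {"__name__": "example_metric_b", "job": "airflow", "pod": "example-webserver-0"},
    {"__name__": "example_metric_a", "job": "airflow", "pod": "example-scheduler-0"},
    {"__name__": "example_metric_a", "job": "other", "pod": "unrelated-0"},
]

counts = count_by(select(SERIES, job="airflow"), "pod")
print(len(counts))  # number of distinct Pods with at least one series: 2
```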

@nightkr nightkr left a comment

LGTM if tests pass

adwk67 commented May 7, 2025

> LGTM if tests pass

Thanks - Jenkins run, and a new issue to revisit the various things mentioned above: #622
Hmm... run 29 failed, although the previous run, 28 (done before the final merge of main), passed. This seems to be due to problems pulling images.

Ran it again on a manually-provisioned replicated k3s cluster:

--- PASS: kuttl (2246.78s)
    --- PASS: kuttl/harness (0.00s)
        --- PASS: kuttl/harness/smoke_airflow-2.10.4_openshift-false_executor-celery (233.67s)
        --- PASS: kuttl/harness/mount-dags-configmap_airflow-latest-2.10.4_openshift-false_executor-kubernetes (128.45s)
        --- PASS: kuttl/harness/mount-dags-gitsync_airflow-latest-2.10.4_openshift-false_executor-celery (508.08s)
        --- PASS: kuttl/harness/logging_airflow-2.10.4_openshift-false_executor-kubernetes (322.02s)
        --- PASS: kuttl/harness/logging_airflow-2.10.4_openshift-false_executor-celery (291.14s)
        --- PASS: kuttl/harness/opa_airflow-2.10.4_opa-latest-1.0.1_openshift-false (129.83s)
        --- PASS: kuttl/harness/external-access_airflow-2.10.4_openshift-false (209.05s)
        --- PASS: kuttl/harness/ldap_airflow-latest-2.10.4_ldap-authentication-server-verification-tls_openshift-false_executor-celery (200.36s)
        --- PASS: kuttl/harness/orphaned-resources_airflow-latest-2.10.4_openshift-false (162.89s)
        --- PASS: kuttl/harness/overrides_airflow-latest-2.10.4_openshift-false (159.94s)
        --- PASS: kuttl/harness/ldap_airflow-latest-2.10.4_ldap-authentication-server-verification-tls_openshift-false_executor-kubernetes (155.33s)
        --- PASS: kuttl/harness/cluster-operation_airflow-latest-2.10.4_openshift-false (257.11s)
        --- PASS: kuttl/harness/ldap_airflow-latest-2.10.4_ldap-authentication-insecure-tls_openshift-false_executor-kubernetes (148.40s)
        --- PASS: kuttl/harness/ldap_airflow-latest-2.10.4_ldap-authentication-no-tls_openshift-false_executor-kubernetes (150.75s)
        --- PASS: kuttl/harness/ldap_airflow-latest-2.10.4_ldap-authentication-no-tls_openshift-false_executor-celery (199.21s)
        --- PASS: kuttl/harness/ldap_airflow-latest-2.10.4_ldap-authentication-insecure-tls_openshift-false_executor-celery (195.43s)
        --- PASS: kuttl/harness/smoke_airflow-2.10.4_openshift-false_executor-kubernetes (197.03s)
        --- PASS: kuttl/harness/oidc_airflow-2.10.4_openshift-false (153.17s)
        --- PASS: kuttl/harness/mount-dags-configmap_airflow-latest-2.10.4_openshift-false_executor-celery (177.11s)
        --- PASS: kuttl/harness/resources_airflow-latest-2.10.4_openshift-false (139.18s)
        --- PASS: kuttl/harness/mount-dags-gitsync_airflow-latest-2.10.4_openshift-false_executor-kubernetes (193.90s)
PASS
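
A run like this is easier to compare against a failed one when it is reduced to a short summary. The sketch below is one possible way to do that, assuming the Go-test-style `--- PASS:` / `--- FAIL:` lines shown above. The function `summarise` and the choice to count only entries under `kuttl/harness/` are assumptions made for illustration, and the sample log is a shortened excerpt of the output above.

```python
import re

# Matches lines such as "--- PASS: kuttl/harness/<test> (233.67s)".
RESULT_RE = re.compile(r"^\s*--- (PASS|FAIL): (\S+) \((\d+(?:\.\d+)?)s\)$")


def summarise(log: str) -> dict:
    """Count passed and failed test cases and name the slowest one."""
    tests = []
    for line in log.splitlines():
        match = RESULT_RE.match(line)
        if not match:
            continue
        status, name, seconds = match.groups()
        # Only individual test cases, not the "kuttl" and "kuttl/harness" roll-ups.
        if name.startswith("kuttl/harness/"):
            tests.append((status, name, float(seconds)))
    return {
        "passed": sum(1 for status, _, _ in tests if status == "PASS"),
        "failed": [name for status, name, _ in tests if status == "FAIL"],
        "slowest": max(tests, key=lambda t: t[2])[1] if tests else None,
    }


SAMPLE_LOG = """\
--- PASS: kuttl (2246.78s)
    --- PASS: kuttl/harness (0.00s)
        --- PASS: kuttl/harness/smoke_airflow-2.10.4_openshift-false_executor-celery (233.67s)
        --- PASS: kuttl/harness/mount-dags-gitsync_airflow-latest-2.10.4_openshift-false_executor-celery (508.08s)
        --- PASS: kuttl/harness/oidc_airflow-2.10.4_openshift-false (153.17s)
PASS
"""

summary = summarise(SAMPLE_LOG)
print(summary["passed"], summary["failed"], summary["slowest"])
```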

@adwk67 adwk67 added this pull request to the merge queue May 7, 2025
Merged via the queue into main with commit c44e974 May 7, 2025
17 checks passed
@adwk67 adwk67 deleted the feat/integrate-listener-operator branch May 7, 2025 15:41
Development

Successfully merging this pull request may close these issues.

Integrate Airflow Operator with Listener Operator
4 participants