Refactor to pass parsl_config.py and workflow_config.py as a ConfigMap #45

Open · wants to merge 4 commits into base: enhancement-1-gke

2 changes: 0 additions & 2 deletions gke_workflow/Dockerfile
@@ -8,5 +8,3 @@ COPY requirements.txt .
RUN pip install -r requirements.txt

COPY parsl_workflow.py .
-COPY parsl_config.py .
-COPY workflow_config.py .

95 changes: 54 additions & 41 deletions gke_workflow/README.md

Large diffs are not rendered by default.

33 changes: 0 additions & 33 deletions gke_workflow/manifests/leader_deployment.yaml

This file was deleted.

37 changes: 37 additions & 0 deletions gke_workflow/manifests/leader_job.yaml
@@ -0,0 +1,37 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: viz-workflow-leader-job
  namespace: viz-workflow
spec:
  # Supported in k8s 1.23 - ADC has 1.22; GKE has at least 1.30
  ttlSecondsAfterFinished: 60
  template:
    metadata:
      name: viz-workflow-leader-pod
      annotations:
        gke-gcsfuse/volumes: "true"
    spec:
      restartPolicy: Never
      serviceAccountName: viz-workflow-sa
      containers:
        - name: viz-workflow-leader-container
          image: ghcr.io/permafrostdiscoverygateway/viz-workflow:0.3.5
          volumeMounts:
            - mountPath: /usr/local/share/app/parameters
              name: viz-workflow-parameters
            - mountPath: /data
              name: viz-workflow-volume
          workingDir: /usr/local/share/app
          command:
            - python
            - parsl_workflow.py
      volumes:
        # This is how the parameter files for the Python script get to the cluster
        - name: viz-workflow-parameters
          configMap:
            name: viz-workflow-cm
        # Data storage
        - name: viz-workflow-volume
          persistentVolumeClaim:
            claimName: viz-workflow-pvc
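
The ConfigMap volume and mount above are what deliver parsl_config.py and workflow_config.py to the leader container at /usr/local/share/app/parameters. As an illustrative sanity check (the pod name below is a placeholder; the label selector relies on the standard job-name label that the Job controller adds to its pods), the mounted files can be listed once the leader pod is running:

# Find the pod created by the Job, then list the ConfigMap-mounted parameter files
kubectl get pods -n viz-workflow -l job-name=viz-workflow-leader-job
kubectl exec -n viz-workflow <leader-pod-name> -- ls /usr/local/share/app/parameters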

parsl_config.py
@@ -10,7 +10,7 @@ def config_parsl_cluster(
max_workers = 4,
cores_per_worker = 1,
# TODO: automate this following string to pull most recent release on github?
-image='ghcr.io/permafrostdiscoverygateway/viz-workflow:0.3.2',
+image='ghcr.io/permafrostdiscoverygateway/viz-workflow:0.3.5',
namespace='viz-workflow'):

htex_kube = Config(
@@ -23,11 +23,11 @@
# Address for the pod worker to connect back
address = address_by_route(),
worker_debug = True,
-# increased seconds allocated towards a task before it is
+# increased seconds allocated towards a task before it is
# considered failed,for loading in large files:
heartbeat_threshold = 3600,
# default seconds for how often a heartbeat is sent:
-heartbeat_period = 30,
+heartbeat_period = 30,
provider = KubernetesProvider(

# Namespace in K8S to use for the run
11 changes: 5 additions & 6 deletions gke_workflow/parsl_workflow.py
@@ -13,7 +13,6 @@

import pdgstaging
import pdgraster
-import workflow_config

import json
import logging
@@ -27,7 +26,9 @@
from parsl.providers import KubernetesProvider
from parsl.addresses import address_by_route
# from kubernetes import client, config
-from parsl_config import config_parsl_cluster
+
+from parameters.parsl_config import config_parsl_cluster
+from parameters.workflow_config import workflow_config


# call parsl config and initiate k8s cluster
@@ -36,8 +37,6 @@
htex_kube = config_parsl_cluster()
parsl.load(htex_kube)

-workflow_config = workflow_config.workflow_config
-

def run_pdg_workflow(
workflow_config,
@@ -59,8 +58,8 @@ def run_pdg_workflow(
How many input files should be included in a single creation
task? (each task is run in parallel) Default: 1
batch_size_tiles: int
-How many staged tiles, geotiff tiles, or web tiles should be
-included in a single creation task?
+How many staged tiles, geotiff tiles, or web tiles should be
+included in a single creation task?
(each task is run in parallel) Default: 300
"""

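
With the ConfigMap mounted at /usr/local/share/app/parameters and the container's workingDir set to /usr/local/share/app, the new parameters.parsl_config and parameters.workflow_config imports resolve as a normal package path relative to the working directory. A minimal, illustrative check from inside the running leader pod (pod name is a placeholder) would be:

# Confirm the parameter modules import cleanly from the ConfigMap mount
kubectl exec -n viz-workflow <leader-pod-name> -- \
  python -c "from parameters.parsl_config import config_parsl_cluster; print('import ok')"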
12 changes: 12 additions & 0 deletions gke_workflow/run-remote-job.sh
@@ -0,0 +1,12 @@
#!/usr/bin/env bash
set -euo pipefail

# Send our parameter files to the cluster. This will update the ConfigMap if
# there are any changes. Note that "age" represents the time since the
# ConfigMap was created, not since it was last updated.
kubectl create configmap viz-workflow-cm --from-file parameters/ \
-o yaml -n viz-workflow --dry-run=client \
| kubectl apply -f -

# Submit a "Job" to the cluster which runs our script
kubectl apply -f manifests/leader_job.yaml
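
Once run-remote-job.sh has applied the ConfigMap and submitted the Job, progress can be followed with standard kubectl commands; the names below are taken from leader_job.yaml above:

# Watch the Job and stream the leader's logs
kubectl get jobs -n viz-workflow
kubectl logs -n viz-workflow job/viz-workflow-leader-job --follow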