io.kestra.plugin.kubernetes.PodCreate keeps pods in pending state when some inputFiles are missing #211

@myvart

Describe the issue

When an io.kestra.plugin.kubernetes.PodCreate task has inputFiles specified and some of those files are missing (e.g. the upstream task that should have created them failed for some reason), the task still creates the pod, but the pod remains in a pending state indefinitely because the init-files container cannot finish.

When the waitUntilRunning duration is reached, the task moves to a Failed status and a retry can be launched (if a retry mechanism has been specified in the config), but the pod from the failed attempt is still up in a pending state.

This can leave several pods alive in a pending state, consuming resources and quotas, even though their parent task/attempt has already failed.
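To make the leaked pods easier to spot, here is a minimal sketch (not part of the plugin) that filters a pod list for pods matching the symptoms described above: phase Pending, older than some threshold, and with an init container that has not terminated. It assumes the input has the shape of `kubectl get pods -o json`. The function names, the pod names, and the 10-minute threshold are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_k8s_time(value):
    """Parse a Kubernetes timestamp such as '2024-05-01T10:00:00Z' as UTC."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def stuck_pending_pods(pod_list, max_age, now):
    """Return names of pods that are Pending, older than max_age,
    and have at least one init container that has not terminated."""
    stuck = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        created = parse_k8s_time(pod["metadata"]["creationTimestamp"])
        if now - created < max_age:
            continue
        init_statuses = status.get("initContainerStatuses", [])
        # An init container that is still 'running' or 'waiting' blocks the pod.
        if any("terminated" not in s.get("state", {}) for s in init_statuses):
            stuck.append(pod["metadata"]["name"])
    return stuck


# Hypothetical pod list, trimmed to the fields the function reads.
example = {
    "items": [
        {   # old pod whose init-files container never finished
            "metadata": {"name": "log-results-attempt-1",
                         "creationTimestamp": "2024-05-01T10:00:00Z"},
            "status": {"phase": "Pending",
                       "initContainerStatuses": [
                           {"name": "init-files", "state": {"running": {}}}]},
        },
        {   # healthy pod
            "metadata": {"name": "producer-task-attempt-1",
                         "creationTimestamp": "2024-05-01T10:00:00Z"},
            "status": {"phase": "Running"},
        },
        {   # pending, but too recent to be considered stuck
            "metadata": {"name": "log-results-attempt-2",
                         "creationTimestamp": "2024-05-01T10:29:00Z"},
            "status": {"phase": "Pending",
                       "initContainerStatuses": [
                           {"name": "init-files", "state": {"running": {}}}]},
        },
    ]
}

now = datetime(2024, 5, 1, 10, 30, tzinfo=timezone.utc)
print(stuck_pending_pods(example, timedelta(minutes=10), now))
```

In practice the input could come from `kubectl get pods -n mynamespace -o json` parsed with `json.load`. This only detects the leftover pods and does not replace a fix in the plugin itself.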

Example:

id: debug
namespace: mynamespace

concurrency:
  limit: 1

tasks:
  - id: producer-task
    type: "io.kestra.plugin.kubernetes.PodCreate"
    namespace: mynamespace
    retry:
      type: constant
      interval: PT2M
      maxAttempt: 3
    delete: true
    resume: false
    waitForLogInterval: PT30S
    spec:
      restartPolicy: Never
      containers:
        - name: producer-task-container
          image: python:3.12-slim
          imagePullPolicy: Always
          
          command:
            - /bin/sh
            - -c
          args:
            - |
              python -c "
              import os
              import logging
              import sys
              import time
              import json

              logging.basicConfig(
                  level=logging.INFO,
              )

              logger = logging.getLogger(__name__)
              logger.setLevel(logging.INFO)

              file_path = os.path.join('results.json')

              with open(file_path, 'w') as json_file:
                json.dump({'a': 1}, json_file)

              logger.info('File results.json created')

              logger.info('sleeping')

              time.sleep(120)

              logger.info('Finish sleeping')
              "
          resources:
            requests:
              memory: "500Mi"
              cpu: "0.5"
            limits:
              memory: "3Gi"
              cpu: "1"

  - id: sleep
    type: io.kestra.plugin.core.flow.Sleep
    duration: "PT35S"

  - id: log-results
    type: "io.kestra.plugin.kubernetes.PodCreate"
    namespace: mynamespace
    inputFiles:
      results.json: "{{ outputs['crash-task']['outputFiles']['results.json'] }}"
    retry:
      type: constant
      interval: PT2M
      maxAttempt: 3
    delete: true
    resume: false
    waitForLogInterval: PT30S
    spec:
      restartPolicy: Never
      containers:
        - name: log-container
          image: python:3.12-slim
          imagePullPolicy: Always
          
          command:
            - /bin/sh
            - -c
          args:
            - |
              python -c "
              import os
              import logging
              import json

              logging.basicConfig(
                  level=logging.INFO,
              )
              
              logger = logging.getLogger(__name__)
              logger.setLevel(logging.INFO)

              file_path = os.path.join('{{workingDir}}', 'results.json')

              with open(file_path, 'r') as json_file:
                json_dict = json.load(json_file)

              logger.info(f'Results: {json_dict}')
              "
          resources:
            requests:
              memory: "500Mi"
              cpu: "0.5"
            limits:
              memory: "3Gi"
              cpu: "1"

Environment

  • Kestra Version: develop

Metadata

  • Labels: area/plugin, bug
  • Status: Backlog