Race condition on safeguards #8

@je-al

Hey, it seems the safeguard control gets stuck in after_experiment_control when exit_gracefully is called before the wait on the now_all_done Barrier is reached. The following experiment reliably hangs forever unless I configure a pause for the probe (the commented-out pauses block below). I'm guessing there is a more elegant solution:

---
title: safeguard test
description: safeguard test

controls:
- name: safeguard
  provider:
    type: python
    module: chaosaddons.controls.safeguards
    arguments:
      probes:
        - name: safeguard
          type: probe
          provider:
            type: process
            path: date
          tolerance: 1
          # pauses:
          #   after: 1

[2022-06-22 17:54:48 DEBUG] [cli:113] Running command 'run'
[2022-06-22 17:54:48 DEBUG] [cli:117] Using settings file '/Users/jeal/.chaostoolkit/settings.yaml'
[2022-06-22 17:54:49 DEBUG] [__init__:399] No controls to apply on 'loader'
[2022-06-22 17:54:49 DEBUG] [__init__:399] No controls to apply on 'loader'
[2022-06-22 17:54:49 DEBUG] [caching:24] Building activity cache...
[2022-06-22 17:54:49 DEBUG] [caching:35] Cached 3 activities
[2022-06-22 17:54:49 INFO] [experiment:58] Validating the experiment's syntax
[2022-06-22 17:54:49 DEBUG] [configuration:63] Loading configuration...
[2022-06-22 17:54:49 DEBUG] [secret:78] Loading secrets...
[2022-06-22 17:54:49 DEBUG] [secret:104] Done loading secrets
[2022-06-22 17:54:49 DEBUG] [python:196] Control 'validate_control' loaded from '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaosaddons/controls/safeguards.py'
[2022-06-22 17:54:49 DEBUG] [python:192] Control module '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaostoolkit_rappi-0.1.0-py3.9.egg/chaos_rappi/controls/state_sharing.py' does not declare 'validate_control'
[2022-06-22 17:54:49 INFO] [experiment:109] Experiment looks valid
[2022-06-22 17:54:49 DEBUG] [caching:42] Clearing activities cache
[2022-06-22 17:54:49 DEBUG] [caching:24] Building activity cache...
[2022-06-22 17:54:49 DEBUG] [caching:35] Cached 3 activities
[2022-06-22 17:54:49 DEBUG] [configuration:63] Loading configuration...
[2022-06-22 17:54:49 DEBUG] [secret:78] Loading secrets...
[2022-06-22 17:54:49 DEBUG] [secret:104] Done loading secrets
[2022-06-22 17:54:49 DEBUG] [configuration:155] Loading dynamic configuration...
[2022-06-22 17:54:49 INFO] [run:320] Running experiment: journal output test
[2022-06-22 17:54:49 DEBUG] [__init__:52] Initializing controls
[2022-06-22 17:54:49 DEBUG] [__init__:61] Initializing control 'safeguard'
[2022-06-22 17:54:49 DEBUG] [python:196] Control 'configure_control' loaded from '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaosaddons/controls/safeguards.py'
[2022-06-22 17:54:49 DEBUG] [__init__:61] Initializing control 'get metadata'
[2022-06-22 17:54:49 DEBUG] [python:192] Control module '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaostoolkit_rappi-0.1.0-py3.9.egg/chaos_rappi/controls/state_sharing.py' does not declare 'configure_control'
[2022-06-22 17:54:49 INFO] [run:344] Steady-state strategy: default
[2022-06-22 17:54:49 INFO] [run:348] Rollbacks strategy: default
[2022-06-22 17:54:49 INFO] [run:353] No steady state hypothesis defined. That's ok, just exploring.
[2022-06-22 17:54:49 DEBUG] [__init__:409] Applying before-control 'safeguard' on 'experiment'
[2022-06-22 17:54:49 DEBUG] [python:196] Control 'before_experiment_control' loaded from '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaosaddons/controls/safeguards.py'
[2022-06-22 17:54:49 DEBUG] [__init__:409] Applying before-control 'safeguard' on 'activity'
[2022-06-22 17:54:49 DEBUG] [python:192] Control module '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaosaddons/controls/safeguards.py' does not declare 'before_activity_control'
[2022-06-22 17:54:49 DEBUG] [process:52] Running: ['/bin/date']
[2022-06-22 17:54:49 DEBUG] [__init__:409] Applying after-control 'safeguard' on 'activity'
[2022-06-22 17:54:49 DEBUG] [python:192] Control module '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaosaddons/controls/safeguards.py' does not declare 'after_activity_control'
[2022-06-22 17:54:49 CRITICAL] [safeguards:290] Safeguard 'safeguard' triggered the end of the experiment
[2022-06-22 17:54:49 INFO] [run:607] Playing your experiment's method now...
[2022-06-22 17:54:49 DEBUG] [safeguards:198] Safeguard 'safeguard' finished normally
[2022-06-22 17:54:49 WARNING] [run:420] Received the exit signal: 20
[2022-06-22 17:54:49 INFO] [run:458] Experiment ended with status: interrupted
[2022-06-22 17:54:49 DEBUG] [__init__:409] Applying after-control 'safeguard' on 'experiment'
[2022-06-22 17:54:49 DEBUG] [python:196] Control 'after_experiment_control' loaded from '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaosaddons/controls/safeguards.py'
^C[2022-06-22 17:54:53 DEBUG] [__init__:91] Cleaning up controls
[2022-06-22 17:54:53 DEBUG] [__init__:100] Cleaning up control 'safeguard'
[2022-06-22 17:54:53 DEBUG] [python:192] Control module '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaosaddons/controls/safeguards.py' does not declare 'cleanup_control'
[2022-06-22 17:54:53 DEBUG] [__init__:100] Cleaning up control 'get metadata'
[2022-06-22 17:54:53 DEBUG] [python:192] Control module '/Users/jeal/.pyenv/versions/3.9.9/envs/chaostoolkit/lib/python3.9/site-packages/chaostoolkit_rappi-0.1.0-py3.9.egg/chaos_rappi/controls/state_sharing.py' does not declare 'cleanup_control'
[2022-06-22 17:54:53 DEBUG] [caching:42] Clearing activities cache

Aborted!

P.S. I'm running Python 3.9.9 (from Homebrew) on macOS 12.4, though I don't think that's related.
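
To illustrate the kind of hang described above, here is a minimal standalone sketch. It is not the chaosaddons code, and the helper name wait_for_guardian is hypothetical. It assumes the stuck call is an unbounded wait on a two-party threading.Barrier, and shows one possible mitigation: passing a timeout to the wait so the caller gives up if the other party never arrives.

```python
import threading


def wait_for_guardian(barrier: threading.Barrier, timeout: float) -> bool:
    """Hypothetical helper: wait on the barrier, but never forever.

    Returns True if every party reached the barrier, False if the wait
    timed out (Barrier.wait raises BrokenBarrierError in that case).
    """
    try:
        barrier.wait(timeout=timeout)
        return True
    except threading.BrokenBarrierError:
        return False


# Case 1: the other party has already exited and never reaches the barrier.
# Without a timeout this wait would block forever.
lonely = threading.Barrier(parties=2)
missed = wait_for_guardian(lonely, timeout=0.2)

# Case 2: the other party does reach the barrier, so the wait completes.
shared = threading.Barrier(parties=2)
worker = threading.Thread(target=shared.wait)
worker.start()
met = wait_for_guardian(shared, timeout=5.0)
worker.join()

print(missed, met)
```

A timeout only bounds the wait. It does not remove the race itself, so a proper fix would probably also need to make sure the guardian thread either reaches the barrier or signals that it will not.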
