chris-sanders
Owner

Summary

  • Fixed Python runtime error during container shutdown by removing sys.exit(0) from signal handler
  • Added global shutdown flag coordination to allow graceful exit without forcing immediate termination
  • Created comprehensive test suite to verify fix and prevent regression

Problem

Container shutdown was failing with:

Fatal Python error: _enter_buffered_busy: could not acquire lock for <_io.BufferedReader name='<stdin>'> 
at interpreter shutdown, possibly due to daemon threads

Solution

Calling sys.exit(0) inside the signal handler forced interpreter shutdown while logging operations were still in progress, which caused the race condition. The fix:

  1. Removed sys.exit(0) from the signal handler
  2. Added a global _shutdown_requested flag that the main loop can check
  3. Allowed Python to exit naturally after cleanup completes

Test Plan

  • Created reproduction test that verifies the race condition is fixed
  • Added comprehensive signal handling integration tests
  • Added e2e container shutdown reliability tests
  • All tests pass locally
  • Code quality checks pass (black, ruff, mypy)
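
A shutdown check along these lines could look like the sketch below. It is not taken from the PR's test suite: CHILD_SCRIPT and terminate_and_collect are hypothetical, the child is a stand-in for the real server process, and the exit-code expectation assumes a POSIX platform (the commit list notes that exit codes are platform-specific).

```python
import signal
import subprocess
import sys
import textwrap
from typing import Tuple

# Hypothetical stand-in for the real process: sets a flag on SIGTERM and
# lets the interpreter exit normally once the loop ends.
CHILD_SCRIPT = textwrap.dedent(
    """
    import signal, time
    stop = False
    def handler(signum, frame):
        global stop
        stop = True
    signal.signal(signal.SIGTERM, handler)
    print("ready", flush=True)
    while not stop:
        time.sleep(0.05)
    """
)


def terminate_and_collect(timeout: float = 10.0) -> Tuple[int, str]:
    """Start the child, send SIGTERM, and return (exit code, stderr)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SCRIPT],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    assert proc.stdout is not None
    proc.stdout.readline()  # wait until the handler is installed
    proc.send_signal(signal.SIGTERM)
    try:
        _, stderr = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()
        raise
    return proc.returncode, stderr


if __name__ == "__main__":
    code, err = terminate_and_collect()
    assert "Fatal Python error" not in err
    print("exit code:", code)
```

Waiting for the "ready" line before sending the signal avoids killing the child before its handler is registered, which would otherwise make the check flaky.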

@chris-sanders chris-sanders merged commit e8d0a53 into main Jul 25, 2025
6 checks passed
@chris-sanders chris-sanders deleted the task/fix-container-shutdown branch July 25, 2025 04:24
chris-sanders added a commit that referenced this pull request Jul 29, 2025
- fix-container-name-conflicts.md (PR #46 merged)
- fix-asyncio-transport-python313.md (PR #42 merged)
- fix-container-shutdown-race-condition.md (PR #41 merged)
- fix-transport-cleanup-and-curl-dependency.md (PR #40 merged)
- implement-comprehensive-testing-improvements.md (PR #39 merged)
- switch-to-melange-apko-build.md (completed previously)
chris-sanders added a commit that referenced this pull request Aug 11, 2025
* Fix signal handler race condition by removing sys.exit()
* Add comprehensive integration and e2e tests for signal handling
* Fix tests and update race condition test to verify fix
* Fix e2e test stdin handling
* Fix code formatting
* Fix platform-specific exit codes in signal handling tests
* Fix formatting
* Make signal handling tests more lenient for CI environments
chris-sanders added a commit that referenced this pull request Aug 11, 2025
- fix-container-name-conflicts.md (PR #46 merged)
- fix-asyncio-transport-python313.md (PR #42 merged)
- fix-container-shutdown-race-condition.md (PR #41 merged)
- fix-transport-cleanup-and-curl-dependency.md (PR #40 merged)
- implement-comprehensive-testing-improvements.md (PR #39 merged)
- switch-to-melange-apko-build.md (completed previously)