debug windows stdio test flakiness #1158

Closed
wants to merge 8 commits into from

Conversation

felixweinberger
Contributor

  • Add Windows debugging scripts for flaky stdio test
  • Fix pytest-xdist issue in Windows debug scripts
  • Fix xdist issue with multiple approaches
  • Simplify Windows debug scripts based on working solution

Motivation and Context

How Has This Been Tested?

Breaking Changes

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Add PowerShell scripts to help debug test_stdio_context_manager_exiting
flakiness on Windows with Python 3.11/3.12:

- check-tee-command.ps1: Verify tee command availability
- test-stdio-flakiness-200-runs.ps1: Run test 200 times for statistics
- test-stdio-flakiness-until-failure.ps1: Run until failure occurs
- test-stdio-verbose-debug.ps1: Single run with maximum debug output
- README.md: Documentation for using the scripts

Add the -n 0 flag to disable xdist parallel execution, which was causing
test collection failures. Also fix the successful run count calculation.
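
As an illustration of the repeated-run idea, here is a minimal Python sketch (not the PowerShell scripts from this PR) that runs one test serially and tallies the outcomes. The test file path in `TEST_ID` is an assumption; only the test name appears in this PR. The `-n 0` option is only recognized when pytest-xdist is installed.

```python
import subprocess
import sys
from typing import Callable, Sequence

# Hypothetical test location; only the test name comes from the PR text.
TEST_ID = "tests/client/test_stdio.py::test_stdio_context_manager_exiting"


def pytest_command(test_id: str = TEST_ID) -> list[str]:
    """Build a pytest invocation; "-n 0" tells pytest-xdist to use no workers."""
    return [sys.executable, "-m", "pytest", test_id, "-n", "0"]


def run_once(cmd: Sequence[str]) -> bool:
    """Return True when the command exits with status 0."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


def summarize(runs: int, run: Callable[[], bool]) -> dict[str, float]:
    """Call `run` exactly `runs` times and tally passes and failures."""
    passed = sum(1 for _ in range(runs) if run())
    failed = runs - passed
    return {
        "runs": runs,
        "passed": passed,
        "failed": failed,
        "failure_rate": failed / runs if runs else 0.0,
    }
```

Calling `summarize(200, lambda: run_once(pytest_command()))` would approximate a 200-run statistics pass. Keeping the runner as a callable makes the counting logic checkable without launching pytest.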

- Use the PYTEST_DISABLE_PLUGIN_AUTOLOAD environment variable (recommended in CLAUDE.md)
- Use the -p no:xdist flag to explicitly disable the plugin
- Add --no-cov to avoid coverage plugin issues
- Add test-stdio-simple.ps1 to try different methods of disabling xdist

- Add setup-environment.ps1 to configure Git for Windows tee in PATH
- Simplify all scripts to use working -o addopts="" approach
- Add tee availability checks to all scripts
- Update README with clear setup instructions and troubleshooting
- Remove complex xdist workarounds in favor of simple override
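
The approaches tried across these commits can be compared side by side. The following Python sketch is an illustration, not the scripts' actual code; the function name `build_invocation` and the approach labels are made up for this example. It builds the argument list and environment for each option mentioned above. The `-o addopts=` override clears any `addopts` set in the project's pytest configuration, which matters if that configuration passes xdist-only options such as `-n`.

```python
import os
import sys


def build_invocation(test_id: str, approach: str) -> tuple[list[str], dict[str, str]]:
    """Return (argv, env) for one way of keeping xdist out of a pytest run.

    The approach names are labels invented for this sketch.
    """
    argv = [sys.executable, "-m", "pytest", test_id]
    env = dict(os.environ)
    if approach == "autoload-off":
        # Stops pytest from loading plugins via setuptools entry points.
        env["PYTEST_DISABLE_PLUGIN_AUTOLOAD"] = "1"
    elif approach == "no-xdist":
        # Block the xdist plugin by name and turn coverage reporting off.
        argv += ["-p", "no:xdist", "--no-cov"]
    elif approach == "override-addopts":
        # Equivalent to -o addopts="" on a shell command line.
        argv += ["-o", "addopts="]
    else:
        raise ValueError(f"unknown approach: {approach!r}")
    return argv, env
```

The result can be passed to `subprocess.run(argv, env=env)`. Note that `--no-cov` is only accepted when pytest-cov is loaded, so it is not combined with the autoload-off variant here.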

Analysis suggests the flakiness is likely due to:
1. Job Object handle race conditions when tests run in parallel
2. Windows handle inheritance between test processes
3. Timing sensitivity in the minimal test, which opens and closes the connection immediately
4. Python 3.11/3.12-specific subprocess handling changes

These scripts help reproduce the CI environment conditions:
- test-stdio-parallel-flakiness.ps1: Runs with xdist parallel workers
- test-stdio-stress-race.ps1: Rapidly creates processes to expose races
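
To make the stress idea concrete, here is a hedged Python sketch of rapid, concurrent process creation. It is a stand-in for the approach, not a port of test-stdio-stress-race.ps1: the child is a trivial interpreter that reads stdin until EOF (loosely mimicking a stdio server), and the function names are hypothetical.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Stand-in child: blocks on stdin until the parent closes the pipe.
CHILD_CODE = "import sys; sys.stdin.read()"


def spawn_and_close(_: int) -> int:
    """Start a child with piped stdio, close it right away, return its exit code."""
    with subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    ) as proc:
        # communicate() closes stdin (EOF for the child), drains stdout, and waits.
        proc.communicate(timeout=30)
        return proc.returncode


def stress(iterations: int, workers: int) -> list[int]:
    """Run spawn_and_close from several threads at once and collect exit codes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(spawn_and_close, range(iterations)))
```

For example, `stress(500, 8)` starts 500 short-lived children from 8 threads. Any non-zero exit code or `subprocess.TimeoutExpired` would be a signal worth investigating, though this sketch does not exercise Job Objects directly.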

This script collects detailed information about:
- Windows version and build
- Python version and subprocess configuration
- Job Objects support
- Security software
- Process creation timing
- Handle inheritance settings

This will help identify differences between local dev environments
(where the test passes) and CI environments (where it flakes).
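
A portable subset of this information can be gathered from Python itself. The sketch below is an assumption-laden illustration rather than the PR's script: it covers OS and Python versions, process creation timing, and stdin inheritability, and it does not attempt to detect Job Objects support or security software. The function names are hypothetical.

```python
import json
import os
import platform
import subprocess
import sys
import time


def time_spawn(samples: int = 5) -> float:
    """Average wall-clock seconds to start and reap a trivial child interpreter."""
    start = time.perf_counter()
    for _ in range(samples):
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / samples


def collect_environment() -> dict:
    """Gather a JSON-serializable snapshot for comparing local and CI machines."""
    info = {
        "os": platform.system(),
        "os_release": platform.release(),
        "os_version": platform.version(),
        "python": sys.version,
        "executable": sys.executable,
        "spawn_seconds": time_spawn(),
    }
    try:
        # Whether file descriptor 0 would be inherited by child processes.
        info["stdin_inheritable"] = os.get_inheritable(0)
    except OSError:
        info["stdin_inheritable"] = None
    if sys.platform == "win32":
        # Windows-only creation flag exposed by the subprocess module.
        info["has_breakaway_flag"] = hasattr(subprocess, "CREATE_BREAKAWAY_FROM_JOB")
    return info
```

Printing `json.dumps(collect_environment(), indent=2)` on both a developer machine and a CI runner gives two snapshots that can be diffed.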
@felixweinberger felixweinberger force-pushed the debug-windows-stdio-test-flakiness branch 4 times, most recently from aa7c2cc to 397303e Compare July 16, 2025 16:32
@felixweinberger felixweinberger force-pushed the debug-windows-stdio-test-flakiness branch from 397303e to 6be585d Compare July 16, 2025 16:36
@felixweinberger felixweinberger deleted the debug-windows-stdio-test-flakiness branch July 17, 2025 10:37