leosvelperez
Member

Summary

Improved database initialization with better error handling, automatic recovery from stale files, and cache preservation during error recovery.

Changes Made

1. Iterative Retry Logic with Explicit State

Converted initialization from recursive calls to an iterative loop with a state flag (cleaned_up_stale_files) that limits retries to 2 attempts. This makes retry conditions explicit and self-documenting while preventing potential stack overflow.
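
A minimal sketch of the loop shape described above, not the code from this PR. Only the `cleaned_up_stale_files` flag is taken from the description; `initialize`, `try_open`, and `remove_stale_files` are hypothetical stand-ins for the real connection and cleanup logic.

```rust
/// Runs `try_open` at most twice: once as-is and, if that fails, once more
/// after `remove_stale_files` has run. Hypothetical helper for illustration.
fn initialize<T, E>(
    mut try_open: impl FnMut() -> Result<T, E>,
    mut remove_stale_files: impl FnMut(),
) -> Result<T, E> {
    // Explicit state replaces recursion: the flag records whether the single
    // cleanup-and-retry has already been spent.
    let mut cleaned_up_stale_files = false;
    loop {
        match try_open() {
            Ok(conn) => return Ok(conn),
            // First failure: clean up once, then go around the loop again.
            Err(_) if !cleaned_up_stale_files => {
                remove_stale_files();
                cleaned_up_stale_files = true;
            }
            // Second failure: give up and surface the error.
            Err(e) => return Err(e),
        }
    }
}
```

Because the flag can only flip once, the loop body runs at most twice, and the stack depth stays constant regardless of how initialization fails.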

2. Complete Auxiliary File Cleanup

Now removes all SQLite database files during cleanup (.db, .db-wal, .db-shm) instead of just the main .db file. This prevents stale Write-Ahead Logging auxiliary files from causing initialization failures after version upgrades.
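
For illustration, the cleanup could look roughly like the following. The names `remove_wal_files` and `remove_all_database_files` come from the helper list in this description, but the bodies and signatures here are assumptions, and `auxiliary_paths` and `remove_if_exists` are hypothetical. The sketch relies on SQLite naming its auxiliary files by appending `-wal` and `-shm` to the database file name.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Paths of the WAL and shared-memory files SQLite keeps next to `db_path`.
fn auxiliary_paths(db_path: &Path) -> [PathBuf; 2] {
    ["-wal", "-shm"].map(|suffix| {
        let mut name = db_path.as_os_str().to_os_string();
        name.push(suffix);
        PathBuf::from(name)
    })
}

/// Deletes `path`, treating "already gone" as success.
fn remove_if_exists(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Err(e) if e.kind() != io::ErrorKind::NotFound => Err(e),
        _ => Ok(()),
    }
}

/// Removes only the auxiliary files and leaves the main database untouched.
fn remove_wal_files(db_path: &Path) -> io::Result<()> {
    for aux in auxiliary_paths(db_path) {
        remove_if_exists(&aux)?;
    }
    Ok(())
}

/// Removes the auxiliary files and then the main database file.
fn remove_all_database_files(db_path: &Path) -> io::Result<()> {
    remove_wal_files(db_path)?;
    remove_if_exists(db_path)
}
```

Treating a missing file as success keeps both functions idempotent, so calling them on an already clean directory is harmless.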

3. Smart Compatible Database Handling

Compatible databases (matching version) are now reconfigured based on their current journal mode:

  • DELETE mode: Attempts an opportunistic upgrade to WAL for better performance and cleans up any orphaned WAL files from previous runs
  • WAL mode: Verifies that WAL still works; if verification fails, removes only the auxiliary files (preserving the cache) and retries, falling back to DELETE mode if the retry also fails
  • Unknown mode or failed query: Performs a full reconfiguration, with complete cleanup on failure

This preserves cached build outputs when possible while recovering from stale file issues automatically.
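
The three branches above can be sketched as a small decision function plus the WAL recovery path. This is an illustration under assumptions, not the PR's implementation: `Plan`, `JournalMode`, `plan_for`, and `verify_wal_or_fall_back` are hypothetical names, the closures stand in for real SQLite calls, and treating every mode other than `delete` and `wal` as "unknown" is a guess.

```rust
/// Journal mode the connection ends up in.
#[derive(Debug, PartialEq, Clone, Copy)]
enum JournalMode {
    Delete,
    Wal,
}

/// What to do with a version-compatible database (hypothetical type).
#[derive(Debug, PartialEq)]
enum Plan {
    /// DELETE mode: try upgrading to WAL and remove orphaned WAL files.
    TryUpgradeToWal,
    /// WAL mode: check that WAL still works before using the database.
    VerifyWal,
    /// Mode unknown or query failed: reconfigure from scratch.
    FullReconfigure,
}

/// Maps the result of querying the journal mode (`None` = query failed)
/// to a plan. SQLite reports mode names such as "delete" and "wal".
fn plan_for(journal_mode: Option<&str>) -> Plan {
    match journal_mode.map(str::to_ascii_lowercase).as_deref() {
        Some("delete") => Plan::TryUpgradeToWal,
        Some("wal") => Plan::VerifyWal,
        _ => Plan::FullReconfigure,
    }
}

/// WAL branch: verify, remove only the auxiliary files on failure, retry once,
/// and fall back to DELETE if the retry fails too. The main database file is
/// never touched here, so cached data survives.
fn verify_wal_or_fall_back(
    mut enable_wal: impl FnMut() -> bool,
    mut remove_wal_files: impl FnMut(),
) -> JournalMode {
    if enable_wal() {
        return JournalMode::Wal;
    }
    remove_wal_files();
    if enable_wal() {
        JournalMode::Wal
    } else {
        JournalMode::Delete
    }
}
```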

4. WSL1 Proactive Detection

Detects WSL1 environments by reading /proc/version and automatically uses DELETE journal mode instead of attempting WAL (which WSL1 doesn't support), preventing initialization failures.
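
A rough sketch of such a check, assuming a simple string heuristic: WSL kernels mention "Microsoft" in /proc/version, and stock WSL2 kernels additionally carry a "WSL2" marker. The exact test used in this PR is not shown here, so `looks_like_wsl1`, `is_wsl1`, and `preferred_journal_mode` are hypothetical, and custom WSL2 kernels without the marker would be misclassified by this version.

```rust
use std::fs;

/// Heuristic (an assumption for illustration): the version string mentions
/// Microsoft but carries no WSL2 marker.
fn looks_like_wsl1(proc_version: &str) -> bool {
    let version = proc_version.to_ascii_lowercase();
    version.contains("microsoft") && !version.contains("wsl2")
}

/// Reads /proc/version; any read failure (for example on macOS or Windows,
/// where the file does not exist) is treated as "not WSL1".
fn is_wsl1() -> bool {
    fs::read_to_string("/proc/version")
        .map(|version| looks_like_wsl1(&version))
        .unwrap_or(false)
}

/// Chooses the journal mode up front instead of trying WAL and failing.
fn preferred_journal_mode() -> &'static str {
    if is_wsl1() {
        "DELETE"
    } else {
        "WAL"
    }
}
```

Keeping the string check separate from the file read makes the heuristic testable on any platform.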

5. Enhanced Error Messages

Replaced generic error messages with context-specific, platform-agnostic guidance:

  • Permission errors: Explains how to check file/directory permissions and ownership without platform-specific commands
  • Disk full: Suggests freeing space or moving workspace
  • Missing directories: Explains unexpected condition with troubleshooting steps
  • All errors: Include debug instructions (NX_NATIVE_LOGGING=trace) and reporting link

Error messages work consistently across Windows, macOS, and Linux without suggesting commands that may not be available on the user's platform.
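
As an illustration of the approach, an I/O error could be mapped to guidance along these lines. `create_io_error` is named in the helper list below, but its signature and all message wording here are assumptions, `is_disk_full` is hypothetical, and the reporting link is left out because its URL is not given in this description.

```rust
use std::io;

/// Detects "no space left" through the raw OS code:
/// ENOSPC is 28 on Linux and macOS, ERROR_DISK_FULL is 112 on Windows.
fn is_disk_full(err: &io::Error) -> bool {
    let code = if cfg!(windows) { 112 } else { 28 };
    err.raw_os_error() == Some(code)
}

/// Builds a message with a cause-specific hint and no platform-specific commands.
fn create_io_error(action: &str, err: &io::Error) -> String {
    let hint = if is_disk_full(err) {
        "The disk appears to be full. Free up space or move the workspace to a volume with more room."
    } else {
        match err.kind() {
            io::ErrorKind::PermissionDenied => {
                "Check that your user owns the database directory and has read and write permissions on it."
            }
            io::ErrorKind::NotFound => {
                "The database directory is missing, which is unexpected. Check that it was not deleted while Nx was running."
            }
            _ => "This error was not expected.",
        }
    };
    format!(
        "Unable to {}: {}\n{}\nFor more detail, rerun with NX_NATIVE_LOGGING=trace and include the output when reporting the issue.",
        action, err, hint
    )
}
```

For example, `create_io_error("open the database", &err)` with a permission error yields the ownership hint followed by the `NX_NATIVE_LOGGING=trace` instruction.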

6. Code Quality Improvements

Extracted helper functions to reduce duplication:

  • query_journal_mode() - Query database's current journal mode
  • set_busy_handler() - Configure connection-level busy handler
  • remove_wal_files() - Remove only WAL auxiliary files
  • remove_all_database_files() - Complete database cleanup
  • create_io_error() / create_db_error() - Generate helpful error messages

Problems Solved

Stale WAL Files After Version Upgrades

Leftover .db-wal and .db-shm files from previous versions no longer cause initialization to fail. These files are now cleaned up automatically during initialization.

Cache Loss on Recoverable Errors

When WAL mode issues occur on otherwise healthy databases, only auxiliary files are removed, preserving the main database and all cached build outputs. Full wipes only happen for genuine corruption or version mismatches.

WSL1 Compatibility

WSL1 environments now work without initialization failures by proactively detecting the environment and using DELETE mode instead of attempting WAL.

Performance Optimization

Databases in DELETE mode automatically attempt upgrading to WAL when the environment supports it (e.g., after moving workspace from network drive to local disk), providing better performance without manual intervention.

Poor Debugging Experience

Error messages now provide specific, actionable guidance based on the error type, helping users resolve issues without needing to ask for help.

Impact

  • ✅ Users no longer need manual nx reset for stale file issues
  • ✅ Cache and build outputs preserved during error recovery
  • ✅ WSL1 works out of the box
  • ✅ Automatic performance improvements when environment changes
  • ✅ Better error messages reduce support burden

Related Issue(s)

Fixes #28640
Fixes #30856
Fixes #32894

leosvelperez self-assigned this Oct 13, 2025
leosvelperez requested review from a team as code owners October 13, 2025 12:36

vercel bot commented Oct 13, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Updated (UTC) |
|---|---|---|---|
| nx-dev | Ready | Preview | Oct 14, 2025 0:31am |


netlify bot commented Oct 13, 2025

Deploy Preview for nx-docs ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 5a5a44a |
| 🔍 Latest deploy log | https://app.netlify.com/projects/nx-docs/deploys/68ee3e97adea0d0008880e02 |
| 😎 Deploy Preview | https://deploy-preview-33054--nx-docs.netlify.app |

Contributor

nx-cloud bot commented Oct 13, 2025

View your CI Pipeline Execution ↗ for commit 5a5a44a

| Command | Status | Duration | Result |
|---|---|---|---|
| nx affected --targets=lint,test,test-kt,build,e... | ✅ Succeeded | 7m 34s | View ↗ |
| nx run-many -t check-imports check-commit check... | ✅ Succeeded | 2m 34s | View ↗ |
| nx-cloud record -- nx-cloud conformance:check | ✅ Succeeded | 3s | View ↗ |
| nx-cloud record -- nx format:check | ✅ Succeeded | <1s | View ↗ |
| nx-cloud record -- nx sync:check | ✅ Succeeded | <1s | View ↗ |
| nx documentation | ✅ Succeeded | 36s | View ↗ |

☁️ Nx Cloud last updated this comment at 2025-10-14 14:21:07 UTC

trace!("Verifying WAL mode still works");

// Verify WAL mode works
match c.pragma_update(None, "journal_mode", "WAL") {
Member

I'm a bit wishy-washy on this part. We'll chat over a call and look at it.

Contributor

@nx-cloud nx-cloud bot left a comment

Nx Cloud has identified a possible root cause for your failed CI:

The test failure occurs during the Angular CLI project creation phase, specifically when Yarn attempts to install packages for a new Angular project. The error "Package install failed" indicates a problem with the package manager or package registry, not with the Nx database initialization code that was modified in this PR.

Analysis of the failure:

  • The failing command is @angular/cli@~20.3.0 new ng-proj... --package-manager=yarn
  • This is test setup code that runs before any Nx initialization
  • The PR only modifies SQLite database initialization in Rust code
  • No code path exists between database initialization and Angular CLI's package installation

The failure is classified as 'environment_state' because:

  1. External package manager failure (Yarn) during npm package installation
  2. No correlation between the PR's database initialization changes and Angular CLI/Yarn behavior
  3. The error occurs in test infrastructure setup, not in code being tested
  4. Likely causes: npm registry timeout, network issues, or Yarn/Corepack configuration problems in the CI environment

Similar failures on branch 33056 would help confirm this is a recurring environmental issue, but the tool returned no output, suggesting either no similar failures exist or the branch data is unavailable.

No code changes are warranted. This is an infrastructure/environment issue that needs investigation of the CI environment's package manager setup, network connectivity, or npm registry access.

A code change would likely not resolve this issue, so no action was taken.

View in Nx Cloud ↗


🎓 To learn more about Self Healing CI, please visit nx.dev

Labels

None yet

Projects

None yet

2 participants