Skip to content

Allow dynamic modification of io-threads num #2033

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: unstable
Choose a base branch
from

Conversation

ayush933
Copy link
Contributor

@ayush933 ayush933 commented Apr 30, 2025

Item from #761

This PR has the following changes

  1. Bug fix where calling pthread_join() from main thread for an IO thread would hang indefinitely. This is because IOThreadMain() doesn't have a cancellation point.So pthread_cancel() from main thread is not honored.
    Can be reproed by calling shutdownIOThread() from the main thread for any active thread with empty job queue.
    Fixed by adding cancellation point in IOThreadMain().
  2. Makes io-threads config runtime modifiable.

@ayush933
Copy link
Contributor Author

UTs pending atm.

Copy link

codecov bot commented Apr 30, 2025

Codecov Report

Attention: Patch coverage is 96.29630% with 1 line in your changes missing coverage. Please review.

Project coverage is 71.37%. Comparing base (af25a52) to head (3f54996).
Report is 11 commits behind head on unstable.

Files with missing lines Patch % Lines
src/io_threads.c 96.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2033      +/-   ##
============================================
+ Coverage     71.02%   71.37%   +0.34%     
============================================
  Files           122      122              
  Lines         66171    66050     -121     
============================================
+ Hits          46999    47142     +143     
+ Misses        19172    18908     -264     
Files with missing lines Coverage Δ
src/config.c 78.52% <ø> (+0.12%) ⬆️
src/networking.c 87.44% <100.00%> (+0.26%) ⬆️
src/server.h 100.00% <ø> (ø)
src/io_threads.c 34.70% <96.00%> (+27.89%) ⬆️

... and 17 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ayush933 ayush933 marked this pull request as ready for review April 30, 2025 20:15
Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Just minor things, looking forward to the tests.

@ayush933
Copy link
Contributor Author

ayush933 commented May 1, 2025

Hey @uriyage, just curious , can you please explain the safety concerns a bit more i.e. why we would want to drain all jobs and inactivate all threads instead of the min num required?

@ayush933 ayush933 force-pushed the dynamicThreads branch 2 times, most recently from 451df91 to cde1b5a Compare May 1, 2025 15:24
@uriyage
Copy link
Contributor

uriyage commented May 1, 2025

Hey @uriyage, just curious , can you please explain the safety concerns a bit more i.e. why we would want to drain all jobs and inactivate all threads instead of the min num required?

@ayush933 I don't recall the exact workflow, but I remember attempting something similar, and it affected the logic of dynamically adjusting the threads or the client cur_tid logic. I might be mistaken, but I think it will be cleaner and safer.

@ayush933 ayush933 force-pushed the dynamicThreads branch 2 times, most recently from 40b1d3f to c450698 Compare May 1, 2025 19:10
@ayush933 ayush933 requested review from madolson and uriyage May 1, 2025 19:13
@ayush933 ayush933 requested a review from uriyage May 5, 2025 14:33
@ayush933
Copy link
Contributor Author

ayush933 commented May 8, 2025

Hey @uriyage @madolson , anything else needed to move this PR forward?

Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits, but still would like @uriyage to take sign off as well, he knows more about this code.

@uriyage
Copy link
Contributor

uriyage commented May 11, 2025

Just some nits, but still would like @uriyage to take sign off as well, he knows more about this code.

LGTM. I left one small comment about the log verbosity

@ayush933 ayush933 force-pushed the dynamicThreads branch 2 times, most recently from 500689b to 5698821 Compare May 11, 2025 15:16
@zuiderkwast zuiderkwast added release-notes This issue should get a line item in the release notes to-be-merged Almost ready to merge labels May 16, 2025
Signed-off-by: Ayush Sharma <mrayushs933@gmail.com>
Copy link
Contributor

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually approve. I didn't review the code (just looked briefly).

Bug fix where calling pthread_join() from main thread for an IO thread would hang indefinitely

What is the severity of this bug? Which scenario triggers it, from a user's perspective? Do we need to backport the bugfix? (If we need that, may need to split out the bugfix to a separate commit that we can backport, since this PRs is a new feature + a bugfix. We should not backport the new feature.)

@ayush933
Copy link
Contributor Author

Before this feature, the only time we called shutdownIoThread() was on server shutdown.
Considering this wasn't noticed until now, I don't think its too severe and needs a backport but lets wait for @uriyage's input on this.

@uriyage
Copy link
Contributor

uriyage commented May 19, 2025

Before this feature, the only time we called shutdownIoThread() was on server shutdown. Considering this wasn't noticed until now, I don't think its too severe and needs a backport but lets wait for @uriyage's input on this.

Currently, it is only used by doFastMemoryTest, which is called from printCrashReport after a crash occurs, in this case the threads are suspended by the ThreadsManager_runOnThreads so not an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-notes This issue should get a line item in the release notes to-be-merged Almost ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants