-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Use poll to avoid busy loops for bootstrap socket-io #1759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Thanks Saif. I think I'd simplify / rework this a little bit to make it less intrusive in the main code:
So something like: Let me know what you think. I didn't test the code (it may be broken), I only wrote it to illustrate the idea. |
|
Hi @sjeaugey, thank you for the feedback. Let me iterate the pull request to address your comments. I'll name param as p.s. Pardon me for delayed response as I've been on a leave for a while. |
218f618 to
7b90715
Compare
|
Hi @sjeaugey, I have updated the PR as per your suggestion.
|
|
That patch looks pretty clean, plus the parameter to set the sleep time is nice. I don't think we'd enable it by default right away, but other than that, it seems like a good idea. |
Build and run nccl_allreduce_perf ensuring Initialization path is exercised for testing.
7b90715 to
3f5073b
Compare
|
Sounds good @sjeaugey .. Set the param |
|
Hi @sjeaugey - Wondering if you can help take a look at this patch and get it going. Happy to iterate to ensure it adheres to NCCL team's practices and expectations. |
|
Hi @saifhhasan , I'm on the NCCL team and this is on my list of tasks to take up soon. |
|
@saifhhasan, this has been merged into the v2.29 branch. Unfortunately, I did not get it approved in time for it to go through the v2.28 testing. |
Background
Bootstrap operations enable Communicator to initialize. It happens as part of startup of a Job when PG/Communicators are created.
Bootstrap offers APIs to perform I/O over Sockets (e.g. AllGathers, Barrier, Send/Recv). For large scale jobs the order of bootstrap operations would have higher skew (e.g. late joiners) as well as overall I/O (number of messages & data-size) is large.
In our large scale training, we observed that Bootstrap chokes CPU at 100% (one core per process) and lasts much longer as job scaled. This competes with the other CPU intensive resources especially during startup phase of the job.
We found this by tracing CPU for the duration of Job's initialization. And socket.cc being the highest contributor for the same.
Fix
For non-blocking Socket Read/Writes, the function
socketProgresswhich in-turn callssocketProgressOptwill return if there are no bytes to READ or WRITE. AndsocketWaitwill loop thesocketProgressagain causing the busy read/write loop.The fix is to use
pollforsocketWait, that'll wait until the socket becomes readable / writable to avoid busy loop, and also permit us to abort the operation within 1ms duration. This is similar to what the existing code do forsocketPollConnect.