-
Notifications
You must be signed in to change notification settings - Fork 12
Wait for socket to be ready before connecting on 115 #242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add isSocketHealthy() method that checks if a socket is still usable by peeking at the socket state - Detects when remote end has closed the connection - Detects other socket errors that would cause issues on reuse - Modify sendRawRequest() to validate socket health before attempting reuse - Automatically closes stale sockets and opens new connections - This should significantly reduce 'Error 11: Resource temporarily unavailable' errors
- Track socket creation time, last used time, and request count - Log socket age (seconds since creation) when reusing or detecting stale sockets - Log socket idle time (seconds since last use) for better debugging - Log number of requests made on each socket connection - Reset metrics when socket is closed (on failure, Connection: close, or stale detection) - Provides visibility into socket lifecycle and helps identify patterns in EAGAIN errors
This reverts commit 8b4d1c3.
- Set socket to non-blocking mode immediately after creation for proper PHP 8 Socket object behavior - Add proper socket_select() wait after EINPROGRESS to ensure socket is ready for writing - Increase connection timeout from 1s to 5s for remote datacenter connections - Add detailed timing logs to track connection establishment performance - Return socket to blocking mode after connection established for normal I/O operations This addresses intermittent 'Resource temporarily unavailable' errors that occur when connecting to remote Bedrock servers, particularly in PHP 8 environments where Socket objects have different timing behavior than legacy resources
src/Client.php
Outdated
| } | ||
|
|
||
| // Configure this socket and try to connect to it | ||
| // PHP 8 Socket objects need explicit non-blocking mode to prevent EINPROGRESS timeouts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment confuses me very much (like this problem as a whole) 😅
- Isn't our intention to use blocking sockets?
- The next confused engineer will read "PHP 8 Socket objects need explicit non-blocking mode to prevent EINPROGRESS timeouts" and think: so why are we handling EINPROGRESS below?
src/Client.php
Outdated
| $this->logger->info('Bedrock\Client - socket_connect returned error 115, continuing.'); | ||
| $this->logger->info('Bedrock\Client - socket_connect returned error 115, waiting for connection to complete.', [ | ||
| 'host' => $host, | ||
| 'connect_attempt_time_ms' => round($connectTime, 3), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update to lowerCamelCase
src/Client.php
Outdated
| $this->logger->info('Bedrock\Client - socket_connect returned error 115, waiting for connection to complete.', [ | ||
| 'host' => $host, | ||
| 'connect_attempt_time_ms' => round($connectTime, 3), | ||
| 'pid' => getmypid(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we logging pid here? 😕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was originally looking to see if this could be tied to php-fpm workers shutting down after 1000 requests, but that did not pan out. I will remove it.
src/Client.php
Outdated
| throw new ConnectionFailure("Socket had error after EINPROGRESS for $host:$port. Error: $socketErrorCode $socketError"); | ||
| } | ||
|
|
||
| $selectTime = (microtime(true) - $connectStart) * 1000; // Total time from connect to ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect selectTime to time socket_select, not socket_select & socket_connect
src/Client.php
Outdated
|
|
||
| $this->logger->info('Bedrock\Client - Socket ready for writing after EINPROGRESS.', [ | ||
| 'host' => $host, | ||
| 'total_connection_time_ms' => round($selectTime, 3), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here and below, lowerCameCase please
src/Client.php
Outdated
| 'host' => $host, | ||
| 'total_connection_time_ms' => round($selectTime, 3), | ||
| 'select_wait_time_ms' => round($selectTime - $connectTime, 3), | ||
| 'pid' => getmypid(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here
|
Updated |
| $except = []; | ||
| $selectStart = microtime(true); | ||
| $selectResult = socket_select($read, $write, $except, $this->connectionTimeout, $this->connectionTimeoutMicroseconds); | ||
| // Time for socket_select call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NAB: Missing empty line before comment
Explanation of Change
When a connection to bedrock takes a little bit longer, wait for it to be ready by setting the socket to non blocking mode on startup and then blocking before actually using it.
Related Issues
https://github.com/Expensify/Expensify/issues/544983
Deployment