Skip to content

Conversation

@kannanjgithub
Copy link
Contributor

Backoff timer expiry reset backoffs and update RLS picker, and RLS connectivity state change from TRANSIENT_FAILURE to READY reset all backoffs.

Fix test.

Unit test.

Save changes.

In-progress changes for control plane connectivity state monitoring.

Test for making backoff timer expiry not make a RLS rpc directly but update the parent LB policy with picker after deleting backoff cache entry.

Save in-progress changes.

Save changes.
@kannanjgithub kannanjgithub requested a review from ejona86 October 31, 2025 14:32
Remove unused class.
}
}
rlsChannel.notifyWhenStateChanged(currentState, () -> rlsServerConnectionStateChanged());
lastRlsServerConnectivityState = currentState;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unused.

Comment on lines +222 to +224
lastRlsServerConnectivityState = rlsChannel.getState(false);
rlsChannel.notifyWhenStateChanged(
lastRlsServerConnectivityState, () -> rlsServerConnectionStateChanged());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is assuming the current state, as the booleans aren't being updated. I think the safer thing to do is explicitly assume the current state, and the callback will run if it is different. So then you can initialize the variables assuming that initial state. IDLE would be a good guess for the initial state, and you can just pass that explicitly to notifyWhenStateChanged() here

private final RefCountedChildPolicyWrapperFactory refCountedChildPolicyWrapperFactory;
private final ChannelLogger logger;
private final ChildPolicyWrapper fallbackChildPolicyWrapper;
private ConnectivityState lastRlsServerConnectivityState;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be good to have a comment describing the threading behavior for these variables. (e.g., "Only accessed in rlsServerConnectionStateChanged() callback").

"[RLS Entry {0}] Transition to back off: status={1}, delayNanos={2}",
request, status, delayNanos);
BackoffCacheEntry entry = new BackoffCacheEntry(request, status, backoffPolicy);
BackoffCacheEntry entry = new BackoffCacheEntry(request, status);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we no longer do exponential backoff? I see your PR description says "Backoff timer expiry reset backoffs", but why would we want to do that? The "Failed Requests" section of the doc talks about the backoff_expiration_time_ being set for exponential backoff. We need to record the exponent somewhere while waiting for the time to expire, and it seems the BackoffCacheEntry is appropriate for that.

If you followed the code flow, I think you'd see that with the change here it'd cause the backoffPolicy argument to createBackoffEntry to always be null. refreshBackoffEntry() is the only place non-null was passed.

}
logger.log(ChannelLogLevel.DEBUG,
"[RLS Entry {0}] Calling RLS for transition to pending", entry.request);
linkedHashLruCache.invalidate(entry.request);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalidating will cause us to forget the exponential backoff state, so if the next attempt fails we'll start over at the beginning. We need to record the exponent somewhere until the request is re-attempted. Keeping it in the cache seems fair, although we'll need to check for expired BackoffCacheEntrys after calling linkedHashLruCache.read().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants