[Consensus] subscriber counter to atomically set node status #19313
… 0 only when there is no other subscription for the authority
      .set(1);
  // Failure can only be due to core shutdown.
- let _ = subscription_counter.increment();
+ let _ = subscription_counter.increment(peer);
Check the error and panic if it is not shutdown?
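A minimal sketch of what this suggestion could look like. The error type and helper name here are hypothetical, since the real error enum returned by `increment` / `decrement` is not shown in this thread; the point is only to tolerate the shutdown case and fail loudly on anything else.

```rust
/// Hypothetical error type standing in for whatever the counter calls return.
#[derive(Debug, PartialEq, Eq)]
pub enum CounterError {
    /// The core has shut down; the reviewer's comment treats this as benign.
    Shutdown,
    /// Any other failure, which would indicate a bug.
    Other(String),
}

/// Hypothetical helper: accept success or a shutdown error, panic otherwise.
pub fn expect_ok_or_shutdown(result: Result<(), CounterError>) {
    match result {
        Ok(()) | Err(CounterError::Shutdown) => {}
        Err(e) => panic!("unexpected subscription counter error: {:?}", e),
    }
}
```

With a helper like this, the call site would read `expect_ok_or_shutdown(subscription_counter.increment(peer));` instead of discarding the result with `let _ =`.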
      .set(0);
  // Failure can only be due to core shutdown.
- let _ = self.subscription_counter.decrement();
+ let _ = self.subscription_counter.decrement(self.peer);
Same, check the error and panic if it is not shutdown?
Description
Looking at the connectivity metrics for subscribed peers, I believe race conditions are possible when nodes connect and disconnect quickly. Because the metric is a gauge, the last writer "wins": a node can reconnect before its earlier connection has been dropped, and when that earlier connection is finally dropped it marks the peer as disconnected in the metrics even though a live connection still exists. This PR refactors that part a bit and marks the peer as disconnected only when there is no other pending connection.
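To illustrate the idea, here is a rough sketch of a per-peer counter that flips the status to 0 only when the last subscription goes away. This is not the PR's implementation: the struct layout, the `u32` peer id, the `status` accessor, and the use of a `Mutex` with a plain map in place of the real gauge are all assumptions made to keep the example self-contained.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical state: subscription counts plus a stand-in for the gauge
/// (1 = connected, 0 = disconnected), both keyed by peer id.
#[derive(Default)]
struct Inner {
    counts: HashMap<u32, usize>,
    status: HashMap<u32, i64>,
}

/// Sketch of a counter that updates the count and the status under one lock,
/// so a late drop of an old connection cannot overwrite a newer connect.
#[derive(Default)]
pub struct SubscriptionCounter {
    inner: Mutex<Inner>,
}

impl SubscriptionCounter {
    /// Registers one more subscription for `peer` and marks it connected.
    pub fn increment(&self, peer: u32) {
        let mut inner = self.inner.lock().unwrap();
        *inner.counts.entry(peer).or_insert(0) += 1;
        inner.status.insert(peer, 1);
    }

    /// Drops one subscription; marks `peer` disconnected only at zero.
    pub fn decrement(&self, peer: u32) {
        let mut inner = self.inner.lock().unwrap();
        let remaining = match inner.counts.get_mut(&peer) {
            Some(count) => {
                *count = count.saturating_sub(1);
                *count
            }
            // No known subscription for this peer: nothing to do.
            None => return,
        };
        if remaining == 0 {
            inner.counts.remove(&peer);
            inner.status.insert(peer, 0);
        }
    }

    /// Returns the last recorded status for `peer`, if any.
    pub fn status(&self, peer: u32) -> Option<i64> {
        self.inner.lock().unwrap().status.get(&peer).copied()
    }
}
```

In the reconnect scenario described above (connect, connect again, then the first connection drops), the single `decrement` leaves the count at 1, so the peer stays reported as connected until the second connection is dropped too.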
Test plan
CI
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.