-
Notifications
You must be signed in to change notification settings - Fork 20.9k
eth: add logic to drop peers randomly when saturated #31476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
If you want to track the sync status of the node, this code will have to live outside of package p2p. It'll likely need to be created from |
Dropping peers randomly with a slow pace to create some artificial churn. Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Better positioned in package eth to access relevant data about connection quality.
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
let's rename type to |
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
eth/dropper.go
Outdated
// Interval between peer drop events (uniform between min and max) | ||
peerDropIntervalMax = 7 * time.Minute | ||
// Avoid dropping peers for some time after connection | ||
doNotDropBefore = 2 * peerDropIntervalMax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is too high. At the current settings, it means each connection gets at least 14min of being peered before we consider it. I think it should be independent of the interval and a bit smaller, like 5min.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe 10min? Either way it's not intuitive for me why this should be connected to the interval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. The protection period is meant for peers to have time to start contribution, but at the moment we have no logic related to "start contribute", so the number is a bit arbitrary.
The logic in connecting the two intervals is from the other side of things: for not protecting a new node too long. What I wanted to avoid is protecting a node for too long, while the whole old peerset can be changed. Actually, for this it would be better to use peerDropIntervalMin ....
Let's make it independent for now, then we can consider again later. I would like this number to be bigger than the time a the dropped peer needs to find a replacement. 10min sounds safe to start with, then we can checked the effect when it is rolled out, and adjust.
// How close to max should we initiate the drop timer. O should be fine, | ||
// dropping when no more peers can be added. Larger numbers result in more | ||
// aggressive drop behavior. | ||
peerDropThreshold = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe best to remove this setting. I wouldn't want people to tinker with this, since a non-zero setting here can lead to weird behavior if peer finding efficiency is bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing this will not change the drop interval, which ultimately drives churn in the system.
Changing this is somewhat similar to changing maxpeercount, which we do allow people to change (even on command line).
The effect is different on the dialed and on the served pool.
- On the dialed pool, it leads slightly more aggressive dial after drop, because dial slots are adaptive.
- On the served pool, it leads to more free slots in the system, which is largely needed, but needs careful checks before we turn this up.
So I would prefer keeping it to simplify experimenting with it for us.
The protection period is meant for peers to have time to start contribution, but at the moment we have no logic related to "start contribute", so the number is a bit arbitrary. The logic in setting this based on the drop interval is from the other side of things: for not protecting a new node too long. What I wanted to avoid is protecting a node for too long, while the whole old peerset can be changed. Actually, for this it would be better to use peerDropIntervalMin .... Let's make it independent for now, then we can consider again later. I would like this number to be bigger than the time a the dropped peer needs to find a replacement. 10min sounds safe to start with, then we can checked the effect when it is rolled out, and adjust. Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
As of now, Geth disconnects peers only on protocol error or timeout, meaning once connection slots are filled, the peerset is largely fixed. As mentioned in ethereum#31321, Geth should occasionally disconnect peers to ensure some churn. What/when to disconnect could depend on: - the state of geth (e.g. sync or not) - current number of peers - peer level metrics This PR adds a very slow churn using a random drop. --------- Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com> Co-authored-by: Felix Lange <fjl@twurst.com>
As of now, Geth disconnects peers only on protocol error or timeout, meaning once connection slots are filled, the peerset is largely fixed. As mentioned in ethereum#31321, Geth should occasionally disconnect peers to ensure some churn. What/when to disconnect could depend on: - the state of geth (e.g. sync or not) - current number of peers - peer level metrics This PR adds a very slow churn using a random drop. --------- Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com> Co-authored-by: Felix Lange <fjl@twurst.com>
As of now, Geth disconnects peers only on protocol error or timeout, meaning once connection slots are filled, the peerset is largely fixed. As mentioned in ethereum#31321, Geth should occasionally disconnect peers to ensure some churn. What/when to disconnect could depend on: - the state of geth (e.g. sync or not) - current number of peers - peer level metrics This PR adds a very slow churn using a random drop. --------- Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com> Co-authored-by: Felix Lange <fjl@twurst.com>
As of now, Geth disconnects peers only on protocol error or timeout, meaning once connection slots are filled, the peerset is largely fixed. As mentioned in ethereum#31321, Geth should occasionally disconnect peers to ensure some churn. What/when to disconnect could depend on: - the state of geth (e.g. sync or not) - current number of peers - peer level metrics This PR adds a very slow churn using a random drop. --------- Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com> Co-authored-by: Felix Lange <fjl@twurst.com>
WIP: this is still wip for a few reasons:
As of now, Geth disconnects peers only on protocol error or timeout, meaning once connection slots are filled, the peerset is largely fixed.
As mentioned in #31321, Geth should occasionally disconnect peers to ensure some churn. What/when to disconnect could depend on:
This PR adds a very slow churn using a random drop.