Ensuring the bootstrap node is up. #4876
-
Hi all, I have an application with the following behaviour:
#[derive(NetworkBehaviour)]
#[behaviour(to_swarm = "NetworkBehaviourEvent")]
pub(crate) struct MyNetworkBehaviour {
    pub identify: identify::Behaviour,
    pub kademlia: kad::Behaviour<MemoryStore>,
    pub request_response: request_response::Behaviour<MessageExchangeCodec>,
}

impl MyNetworkBehaviour {
    pub fn new(
        keypair: &identity::Keypair,
        bootstrap_peers: Option<Vec<Multiaddr>>,
    ) -> Result<Self, MyNetworkBehaviourError> {
        let identify = identify::Behaviour::new(identify::Config::new(
            IDENTIFY_PROTO_NAME.to_string(),
            keypair.public(),
        ));
        let peer_id = keypair.public().to_peer_id();
        let mut kademlia = {
            let config = kad::Config::default()
                .set_protocol_names(vec![StreamProtocol::new(KADEMLIA_PROTO_NAME)]);
            let mut kademlia =
                kad::Behaviour::with_config(peer_id, MemoryStore::new(peer_id), config);
            kademlia.set_mode(Some(kad::Mode::Server));
            kademlia
        };
        if let Some(bootstrap_peers) = bootstrap_peers {
            // Add the addresses of the bootstrap nodes to our view of the DHT.
            for peer_address in &bootstrap_peers {
                let peer_id = Self::extract_peer_id_from_multiaddr(peer_address)?;
                kademlia.add_address(&peer_id, peer_address.clone());
            }
            // Add our own info to the DHT.
            kademlia
                .bootstrap()
                .map_err(|err| MyNetworkBehaviourError::BootstrapError(err.to_string()))?;
        }
        Ok(Self {
            identify,
            kademlia,
            request_response: request_response::Behaviour::new(
                [(MessageExchangeProtocol(), ProtocolSupport::Full)],
                Default::default(),
            ),
        })
    }
    ...
}
This setup works pretty well: nodes can connect through the bootstrap peers, learn about each other, and send direct messages. However, if a bootstrap node is not online yet, I get errors, as expected. Is there an idiomatic way to ensure that the bootstrap node is online, e.g. by repeatedly dialling it manually first, or by providing a configuration option somewhere that specifies a connection retry policy? Any general suggestions are also welcome, @thomaseizinger.
Replies: 2 comments
-
I am assuming you don't mean that literally, because there is no way we can "ensure" that another peer is online. We can observe whether they are or not and react accordingly.
Depending on the use case of your network, a more general piece of functionality could be an algorithm that aims to stay connected to a certain number of peers. If you fall below the threshold, the algorithm would attempt to establish new connections to a random set of peers.
Once you have connections to these peers, calling bootstrap regularly (also see #4838) would ensure that you learn about other peers in the network and build up a longer list of peers that you could again sample from if you want more connections.
For overall network health, you probably don't want every node to maintain a connection to the bootstrap nodes. Again, it depends a lot on what your network of nodes does.
Perhaps @jxs or @AgeManning can add some input here on how Lighthouse achieves its target peer count and what the strategies are there.
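The target-peer-count idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not something libp2p provides: `PeerTarget` and all of its methods are hypothetical names, peers are plain strings instead of `PeerId`s, and a tiny xorshift generator stands in for a real RNG so the example needs nothing beyond the standard library.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not a libp2p type): tracks connected peers and
/// proposes new dials whenever we fall below a target connection count.
struct PeerTarget {
    target: usize,
    connected: HashSet<String>,
    known: Vec<String>,
    rng_state: u64, // state of a small xorshift PRNG, for illustration only
}

impl PeerTarget {
    fn new(target: usize, known: Vec<String>, seed: u64) -> Self {
        // xorshift must not start from zero, so clamp the seed to at least 1.
        Self { target, connected: HashSet::new(), known, rng_state: seed.max(1) }
    }

    fn next_random(&mut self) -> u64 {
        let mut x = self.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.rng_state = x;
        x
    }

    /// Call when a connection to `peer` has been established.
    fn on_connected(&mut self, peer: &str) {
        self.connected.insert(peer.to_string());
    }

    /// Call when the last connection to `peer` has closed.
    fn on_disconnected(&mut self, peer: &str) {
        self.connected.remove(peer);
    }

    /// Returns a random selection of known, not-yet-connected peers,
    /// sized so that we would reach the target if every dial succeeded.
    fn peers_to_dial(&mut self) -> Vec<String> {
        let missing = self.target.saturating_sub(self.connected.len());
        let mut candidates: Vec<String> = self
            .known
            .iter()
            .filter(|p| !self.connected.contains(*p))
            .cloned()
            .collect();
        let n = candidates.len();
        let take = missing.min(n);
        // Partial Fisher-Yates shuffle: randomise only the first `take` slots.
        for i in 0..take {
            let j = i + (self.next_random() as usize) % (n - i);
            candidates.swap(i, j);
        }
        candidates.truncate(take);
        candidates
    }
}
```

In a real node, `on_connected` and `on_disconnected` would presumably be driven by the swarm's connection events, `peers_to_dial` would be polled on a timer, and `known` would be refreshed from the peers learned through the regular bootstrap calls.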
-
Thanks for the detailed explanation!
I ended up refactoring some of my code so that I can observe "connection refused" errors and retry the bootstrap if a bootstrap peer refused the connection.
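For readers wondering what such a retry could look like, here is a minimal sketch of an exponential backoff policy. It is not the code from this thread and not a libp2p API: `BootstrapRetry`, its fields, and its methods are hypothetical names, and the sketch only computes delays. The assumption is that the application calls `on_failure` when a dial to a bootstrap peer fails, waits for the returned delay, and then dials and bootstraps again.

```rust
use std::time::Duration;

/// Hypothetical retry policy (not part of libp2p): exponential backoff
/// with an upper bound on both the delay and the number of attempts.
struct BootstrapRetry {
    base: Duration,
    max: Duration,
    max_attempts: u32,
    attempts: u32,
}

impl BootstrapRetry {
    fn new(base: Duration, max: Duration, max_attempts: u32) -> Self {
        Self { base, max, max_attempts, attempts: 0 }
    }

    /// Call after a failed dial to a bootstrap peer.
    /// Returns how long to wait before retrying, or `None` to give up.
    fn on_failure(&mut self) -> Option<Duration> {
        if self.attempts >= self.max_attempts {
            return None;
        }
        // Double the delay on every failure, capped at `max`.
        let factor = 2u32.saturating_pow(self.attempts);
        let delay = self.base.saturating_mul(factor).min(self.max);
        self.attempts += 1;
        Some(delay)
    }

    /// Call once a bootstrap peer has been reached; resets the backoff.
    fn on_success(&mut self) {
        self.attempts = 0;
    }
}
```

With a base of 1 s, a cap of 5 s, and four attempts, the delays are 1 s, 2 s, 4 s, and 5 s, after which `on_failure` returns `None`. Giving up after a bounded number of attempts fits the point made earlier in the thread: a node can only observe that a peer is offline and react, not force it online.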