I'm experiencing connection issues when peers start simultaneously, and I'm unclear about the proper way to manage peer addresses across multiple behaviors.

Setup

I have a behavior combining

API Usage Question
// For mDNS-discovered peers, do I need to explicitly add the address to the swarm AND to Kademlia?
match event {
    SwarmEvent::Behaviour(MyBehaviourEvent::Mdns(mdns::Event::Discovered(list))) => {
        for (peer_id, multiaddr) in list {
            swarm.add_peer_address(peer_id, multiaddr.clone());
            swarm.behaviour_mut().kad.add_address(&peer_id, multiaddr); // Still needed?
        }
    }
    _ => {}
}

My Current Implementation

To avoid manual Kademlia address management, I implemented this in my wrapper:

fn on_swarm_event(&mut self, event: FromSwarm<'_>) {
    match event {
        // Mirror every newly reported peer address into Kademlia.
        FromSwarm::NewExternalAddrOfPeer(NewExternalAddrOfPeer { peer_id, addr }) => {
            self.kademlia.add_address(&peer_id, addr.clone());
        }
        // Remove each address whose dial attempt failed at the transport level.
        FromSwarm::DialFailure(DialFailure {
            peer_id: Some(peer_id),
            error: DialError::Transport(errors),
            ..
        }) => {
            for (addr, _) in errors {
                self.kademlia.remove_address(&peer_id, addr);
            }
        }
        _ => {}
    }
    // Forward the event to the inner Kademlia behaviour.
    self.kademlia.on_swarm_event(event)
}

This replicates the code I found in rust-libp2p/swarm/src/behaviour/peer_addresses.rs (lines 24 to 41 at commit 1aa4337).

The Problem

This works when peers start at different times, but when two peers start simultaneously, I consistently get:
This

Questions
Replies: 1 comment
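A simple workaround I implemented for now is to delay calling `self.kademlia.remove_address(&peer_id, addr);` until 5 seconds later. If a connection to that peer is established within those 5 seconds, the address is no longer removed.

pub struct Behaviour {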
    kademlia: kad::Behaviour<kad::store::MemoryStore>,
    // Addresses whose dial failed, keyed by (peer, address), with the time of the first failure.
    pending_peers: HashMap<(PeerId, Multiaddr), Instant>,
}

impl NetworkBehaviour for Behaviour {
    // ...
    fn on_swarm_event(&mut self, event: FromSwarm<'_>) {
        match event {
            FromSwarm::ConnectionEstablished(ConnectionEstablished {
                peer_id,
                failed_addresses,
                ..
            }) => {
                self.pending_peers.retain(|(pending_peer_id, addr), _| {
                    // Keep all entries for different peers
                    pending_peer_id != &peer_id
                        // OR keep same-peer entries that failed
                        || failed_addresses.iter().any(|failed_addr| failed_addr == addr)
                });
            }
            FromSwarm::NewExternalAddrOfPeer(NewExternalAddrOfPeer { peer_id, addr }) => {
                self.kademlia.add_address(&peer_id, addr.clone());
            }
            FromSwarm::DialFailure(DialFailure {
                peer_id: Some(peer_id),
                error: DialError::Transport(errors),
                ..
            }) => {
                // Do not remove the address yet; only record when it first failed.
                let now = Instant::now();
                for (addr, _) in errors {
                    self.pending_peers
                        .entry((peer_id, addr.clone()))
                        .or_insert(now);
                }
            }
            _ => {}
        }
        self.kademlia.on_swarm_event(event)
    }

    fn poll(
        &mut self,
        cx: &mut task::Context<'_>,
    ) -> task::Poll<ToSwarm<Self::ToSwarm, THandlerInEvent<Self>>> {
        // Remove addresses that failed more than 5 seconds ago and were not
        // cleared by a successful connection in the meantime.
        let failed_peers = self
            .pending_peers
            .extract_if(|_, failed_at| failed_at.elapsed() > Duration::from_secs(5));
        for ((peer_id, addr), _) in failed_peers {
            self.kademlia.remove_address(&peer_id, &addr);
        }
        // ...
    }
}

Please let me know whether this approach is sound or has issues. I believe the
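
The delayed-removal bookkeeping described above can be looked at in isolation from libp2p. The following is a minimal, std-only sketch under stated assumptions: `PendingRemovals`, `dial_failed`, `connection_established` and `take_expired` are hypothetical names, and `PeerId` and `Addr` are plain stand-ins for libp2p's `PeerId` and `Multiaddr`, not the real types. It takes `now` as a parameter so the timing can be checked without sleeping, and it uses `retain` rather than `HashMap::extract_if` so it also builds on toolchains where that method is not available.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical stand-ins for libp2p's PeerId and Multiaddr (assumption, std only).
type PeerId = u64;
type Addr = String;

/// Remembers addresses whose dial failed and releases them for removal
/// only once a grace period has passed without a successful connection.
pub struct PendingRemovals {
    grace: Duration,
    pending: HashMap<(PeerId, Addr), Instant>,
}

impl PendingRemovals {
    pub fn new(grace: Duration) -> Self {
        Self { grace, pending: HashMap::new() }
    }

    /// Record a failed dial. An existing entry keeps its earlier timestamp.
    pub fn dial_failed(&mut self, peer: PeerId, addr: Addr, now: Instant) {
        self.pending.entry((peer, addr)).or_insert(now);
    }

    /// A connection to `peer` succeeded: forget its pending removals, except
    /// for addresses that failed as part of this very connection attempt.
    pub fn connection_established(&mut self, peer: PeerId, failed: &[Addr]) {
        self.pending
            .retain(|(p, a), _| *p != peer || failed.contains(a));
    }

    /// Remove and return every entry that has been pending longer than the grace period.
    pub fn take_expired(&mut self, now: Instant) -> Vec<(PeerId, Addr)> {
        let grace = self.grace;
        let mut expired = Vec::new();
        self.pending.retain(|key, failed_at| {
            if now.saturating_duration_since(*failed_at) > grace {
                expired.push(key.clone());
                false // drop the entry from the map
            } else {
                true // still inside the grace period
            }
        });
        expired
    }
}
```

In the wrapper behaviour, the entries returned by `take_expired` would be the ones passed to `self.kademlia.remove_address`. Whether a 5-second grace period is the right value is a tuning assumption of the workaround, not something this sketch decides.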