Context
Currently, the Reprovide operation is triggered by Kubo for each Provider Record. Kubo periodically (every 22h) republishes Provider Records using the go-libp2p-kad-dht Provide method.
Line 373 in b95bba8
func (dht *IpfsDHT) Provide(ctx context.Context, key cid.Cid, brdcst bool) (err error) {
The DHT Provide method performs a lookup request to find the 20 closest peers to the CID, opens a connection to each of these peers, and allocates the Provider Record to them.
The problem
This means that for every Provider Record a node advertises, 1 lookup request must be performed every 22h and 20 connections must be opened. This may be fine for small providers, but it is terrible for large Content Providers. The Reprovide operation is certainly the reason most large Content Providers don't use the DHT, and why IPFS is forced to keep the infamous Bitswap broadcast. Improving the Reprovide operation would allow large Content Providers to advertise their content to the DHT. Once most of the content is published on the DHT, Bitswap broadcast can be significantly reduced. This is expected to significantly cut the cost of hosting content on IPFS, because peers in the network will no longer get spammed with requests for CIDs they don't host.
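To make the scaling concrete, here is a minimal sketch of the per-cycle cost described above. The function name `reprovideCost` is hypothetical, and the connection count is an upper bound that assumes no connection is reused between records.

```go
package main

import "fmt"

// replication is the number of closest peers that receive each
// Provider Record (20, as stated in the text above).
const replication = 20

// reprovideCost returns the number of lookups and the maximum number of
// connections needed for one reprovide cycle when every Provider Record
// is handled independently. Hypothetical helper, for illustration only.
func reprovideCost(numRecords int) (lookups, connections int) {
	lookups = numRecords                   // one lookup per record
	connections = numRecords * replication // up to 20 dials per record
	return lookups, connections
}

func main() {
	for _, n := range []int{100, 100_000} {
		l, c := reprovideCost(n)
		fmt.Printf("%d records: %d lookups, up to %d connections per 22h cycle\n", n, l, c)
	}
}
```

Both quantities grow linearly with the number of records, which is why the cost that is acceptable for a small provider becomes prohibitive for a large one.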
Solution overview
By the pigeonhole principle, if a Content Provider is providing content for x
CIDs, with
Without going into too much detail, all Provider Records are grouped by XOR proximity in the keyspace. All Provider Records in a group are allocated to the same set of DHT Servers. Periodically, the Content Provider sweeps the keyspace from left to right and reprovides the Provider Records corresponding to the visited keyspace region.
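The grouping and sweep can be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: keys are plain strings rather than CIDs, SHA-256 stands in for the DHT's key derivation, regions are fixed-length bit prefixes (keys sharing a prefix are close in XOR distance), and `prefixBits`, `regionOf`, `groupByRegion` and `sweep` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// prefixBits is an assumed region size: keys whose hashes share their
// first prefixBits bits fall into the same keyspace region.
const prefixBits = 4

// regionOf returns the first prefixBits bits of the SHA-256 hash of key.
func regionOf(key string) uint8 {
	h := sha256.Sum256([]byte(key))
	return h[0] >> (8 - prefixBits)
}

// groupByRegion buckets keys by keyspace region.
func groupByRegion(keys []string) map[uint8][]string {
	groups := make(map[uint8][]string)
	for _, k := range keys {
		r := regionOf(k)
		groups[r] = append(groups[r], k)
	}
	return groups
}

// sweep visits the non-empty regions in ascending prefix order (left to
// right in the keyspace) and calls reprovide once per region. It returns
// the number of regions visited.
func sweep(groups map[uint8][]string, reprovide func(region uint8, keys []string)) int {
	regions := make([]uint8, 0, len(groups))
	for r := range groups {
		regions = append(regions, r)
	}
	sort.Slice(regions, func(i, j int) bool { return regions[i] < regions[j] })
	for _, r := range regions {
		reprovide(r, groups[r])
	}
	return len(regions)
}

func main() {
	keys := make([]string, 1000)
	for i := range keys {
		keys[i] = fmt.Sprintf("example-cid-%d", i)
	}
	groups := groupByRegion(keys)
	visited := sweep(groups, func(region uint8, ks []string) {
		fmt.Printf("region %04b: reprovide %d records in one batch\n", region, len(ks))
	})
	fmt.Printf("%d records covered by %d region visits\n", len(keys), visited)
}
```

In a real sweep the `reprovide` callback would perform a single lookup for the region and send every record in the group to the peers found, so the number of lookups depends on the number of regions rather than the number of records.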
For a Content Provider providing 100K CIDs and 25K DHT Servers, the expected improvement is ~80x.
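One way to arrive at a figure of this order, offered as an assumption rather than the author's stated derivation: if each swept region covers about 20 DHT Servers (the replication factor) and costs one lookup, the keyspace splits into roughly 25K / 20 = 1250 regions, versus 100K lookups when each record is handled on its own. The helper name `expectedSpeedup` is hypothetical.

```go
package main

import "fmt"

// expectedSpeedup estimates how many times fewer lookups a region-based
// sweep needs compared with one lookup per Provider Record.
// Assumption (not confirmed by the source): one lookup per region, and
// each region spans about `replication` DHT servers.
func expectedSpeedup(numCIDs, numServers, replication int) float64 {
	regions := float64(numServers) / float64(replication)
	speedup := float64(numCIDs) / regions
	if speedup < 1 {
		// With fewer records than regions, batching cannot do worse
		// than one lookup per record.
		return 1
	}
	return speedup
}

func main() {
	fmt.Printf("~%.0fx\n", expectedSpeedup(100_000, 25_000, 20))
}
```

Under these assumptions the estimate for 100K CIDs and 25K DHT Servers is 80, which matches the figure quoted above.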
More details can be found on the WIP Notion document.
How to implement it
Responsibility for the Reprovide operation should be transferred from Kubo to the DHT implementation. This is generally desirable because different Content Routers may have different reprovide logic, which Kubo is unaware of or cannot optimize for.
go-libp2p-kad-dht (and other Content Routers) should expose StartProviding(CID) and StopProviding(CID) methods instead of the Provide(CID) method. Kubo then only needs to tell the DHT which CIDs should be provided or not.
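A rough sketch of what such an API could look like. Only the method names StartProviding and StopProviding come from the proposal; the interface name `ProvideManager`, the signatures, the use of plain strings in place of cid.Cid, and the in-memory `memProvider` type are all assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// ProvideManager is a hypothetical interface a Content Router could
// expose. Keys are strings here; a real API would likely take a cid.Cid.
type ProvideManager interface {
	StartProviding(key string) error
	StopProviding(key string) error
}

// memProvider is an illustrative in-memory implementation that only
// tracks which keys should be reprovided. Scheduling the actual
// reprovides would be the router's own concern.
type memProvider struct {
	mu   sync.Mutex
	keys map[string]struct{}
}

func newMemProvider() *memProvider {
	return &memProvider{keys: make(map[string]struct{})}
}

// StartProviding registers key so that later reprovide cycles include it.
func (p *memProvider) StartProviding(key string) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.keys[key] = struct{}{}
	return nil
}

// StopProviding removes key from the set of records to reprovide.
func (p *memProvider) StopProviding(key string) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.keys, key)
	return nil
}

// Tracked returns the currently provided keys in sorted order.
func (p *memProvider) Tracked() []string {
	p.mu.Lock()
	defer p.mu.Unlock()
	out := make([]string, 0, len(p.keys))
	for k := range p.keys {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	var pm ProvideManager = newMemProvider()
	_ = pm.StartProviding("cid-a")
	_ = pm.StartProviding("cid-b")
	_ = pm.StopProviding("cid-a")
	fmt.Println(pm.(*memProvider).Tracked())
}
```

With an API of this shape, the caller declares intent once per CID and the router decides when and how to republish, which leaves room for batching strategies such as a keyspace sweep.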
A lot of refactoring needs to happen around go-libp2p-routing-helpers.