feat: Reprovide Sweep #1082

Draft · wants to merge 126 commits into master

Conversation

guillaumemichel (Contributor) commented May 6, 2025

Note

This PR may be replaced by

Summary

Problem

Reproviding many keys to the DHT one by one is inefficient, because it requires a GetClosestPeers (or GCP) request for every key.

Current state

Currently, reprovides are managed in boxo/provider. Every ReprovideInterval (22h in the Amino DHT), all keys matching the reprovide strategy are reprovided at once. The process differs slightly depending on whether the accelerated DHT client is enabled.

Default DHT client

All the keys are reprovided sequentially, using the go-libp2p-kad-dht Provide() method. This operation consists of finding the k closest peers to the given key and then asking each of them to store the associated provider record.
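
For illustration, here is a minimal sketch (not the boxo/provider code) of what this sequential loop amounts to with the default client, assuming a *dht.IpfsDHT instance and simplified error handling:

```go
package sketch

import (
	"context"
	"log"

	"github.com/ipfs/go-cid"
	dht "github.com/libp2p/go-libp2p-kad-dht"
)

// reprovideAll reprovides keys one by one. Each Provide() call performs a
// full GetClosestPeers lookup (~20-30 connections) before sending the
// provider record to the k closest peers, which is what makes this slow.
func reprovideAll(ctx context.Context, d *dht.IpfsDHT, keys []cid.Cid) {
	for _, c := range keys {
		if err := d.Provide(ctx, c, true); err != nil {
			log.Printf("reprovide %s failed: %v", c, err)
		}
	}
}
```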

The process is expensive because it requires a GCP for each key (opening approx. 20-30 connections). Timeouts due to unreachable peers make the process very slow, resulting in a mean provide time of ~10s (source: probelab.io, 2025-06-13).

(figure: dht-publish-performance-overall — overall DHT publish performance, probelab.io)

At ~10 seconds per provide, a node using this process can reprovide fewer than 8,000 keys over the 22h reprovide interval using a single thread (22 h × 3,600 s/h ÷ 10 s ≈ 7,920 keys).

Accelerated DHT client (fullrt)

The accelerated DHT client periodically (every 1h) crawls the DHT swarm and caches the addresses of all discovered peers. This allows it to skip the GCP during the provide request, since it already knows the k closest peers and their associated multiaddrs.

Hence, the accelerated DHT client can provide many more keys during the reprovide interval than the default DHT client. However, crawling the DHT swarm is an expensive operation (networking, memory), and since all keys are reprovided at once, the node experiences a burst period until all keys have been reprovided.

Ideally, nodes wouldn't have to crawl the swarm to reprovide content, and the reprovide operation would be smoothed over time to avoid a burst during which the libp2p node is unable to perform other work.

Pooling Reprovides

If there are more keys to reprovide than the number of nodes in the DHT swarm divided by the replication factor (k), then by the pigeonhole principle at least two keys will be provided to the exact same set of peers. Grouping the keys allocated to the same peers means the number of GCPs needed is lower than the number of keys to reprovide.

For the Amino DHT, which contains ~10,000 DHT servers and has a replication factor of 20, pooling reprovides becomes efficient starting from about 500 keys (10,000 / 20).

Reprovide Sweep

The current process of reproviding all keys at once is problematic because it creates a burst. In order to smooth the reprovide process, we can sweep the keyspace from left to right, covering all peers over time. This consists of exploring keyspace regions, each corresponding to a set of peers that are close to each other under the Kademlia XOR distance metric.

⚠️ The Kademlia keyspace is NOT linear

A keyspace region is explored using a few GCP requests (typically 2-4) to discover all the peers it contains. A keyspace region is identified by a Kademlia identifier prefix: the Kademlia identifiers of all peers within the region start with the region's prefix.

Once a region is fully explored, all the keys matching the keyspace region's prefix can be allocated to this set of peers. No additional GCP is needed.
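
As an illustration, here is a minimal sketch (not this PR's code) of grouping keys by region prefix. It assumes the Kademlia identifier of a key is its SHA-256 hash, as in the IPFS DHT, and the 4-bit prefix length is an arbitrary example value:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// regionPrefix returns the first prefixLen bits of the Kademlia identifier of
// key (its SHA-256 hash), encoded as a bit string such as "0101".
func regionPrefix(key []byte, prefixLen int) string {
	id := sha256.Sum256(key)
	prefix := make([]byte, 0, prefixLen)
	for i := 0; i < prefixLen; i++ {
		bit := (id[i/8] >> (7 - uint(i%8))) & 1
		prefix = append(prefix, '0'+bit)
	}
	return string(prefix)
}

func main() {
	// Keys sharing the same prefix fall in the same keyspace region and can
	// be allocated to the same set of peers after a single region exploration.
	for _, k := range []string{"key-a", "key-b", "key-c"} {
		fmt.Printf("%s -> region %s\n", k, regionPrefix([]byte(k), 4))
	}
}
```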

Implementation

This PR contains an implementation of the Reprovide Sweep strategy. The SweepingReprovider basically does the following:

  • Expose Provide() and ProvideMany() methods. All cids passed through these methods are provided to the DHT as expected.
  • All cids that are given through the above methods are stored in a trie. The reprovider implementation keeps track of all cids it is responsible for reproviding.
  • The reprovider schedules when reprovides should happen for each keyspace region (for which there is at least 1 cid). Region reprovides are spread evenly over the reprovide interval (see the sketch after this list).
  • Once the time to reprovide a region has come, the reprovider explores the region and allocates the provider records corresponding to the cids belonging to this region to the appropriate peers.
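
A minimal sketch of how region reprovides can be spread evenly, assuming a simple mapping rather than this PR's actual scheduler: each region's prefix, read as a binary fraction of the keyspace, maps to an offset within the reprovide interval, so sweeping the keyspace in order also sweeps through time.

```go
package main

import (
	"fmt"
	"time"
)

const reprovideInterval = 22 * time.Hour

// reprovideOffset maps a region prefix such as "0101" to the offset within
// the reprovide interval at which that region gets reprovided: the prefix is
// read as a binary fraction in [0, 1) and scaled to the interval.
func reprovideOffset(prefix string) time.Duration {
	frac := 0.0
	for i, b := range prefix {
		if b == '1' {
			frac += 1.0 / float64(int(1)<<uint(i+1))
		}
	}
	return time.Duration(frac * float64(reprovideInterval))
}

func main() {
	// Four regions of prefix length 2 end up evenly spaced over the 22h interval.
	for _, p := range []string{"00", "01", "10", "11"} {
		fmt.Printf("region %s -> t0 + %v\n", p, reprovideOffset(p))
	}
}
```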

Features

  • Concurrency limit
    • Ability to configure the number of workers, both for (i) the initial provide operation and (ii) regular reprovides
    • Limit the number of connections that a worker can open
  • Parallel reprovide
    • If a reprovide isn't complete when the next one is due, the next one can start right away, as long as there are available workers
  • Error handling
    • If a cid or a complete region couldn't be provided, the operation will be retried later until it succeeds
  • Connectivity checker
    • The reprovider will check connectivity on provide failure, and won't try to provide as long as the node is offline.
    • When the node comes back online, the activity resumes by (re)providing the cids/regions that should have been provided during the downtime.
  • Dynamic prefix length estimation
    • When starting up, the reprovider doesn't know how many peers a region contains. It hence makes a few GCP requests to estimate the initial prefix length for exploring regions (see the sketch after this list).
  • Reset reprovided cids
    • Offer a ResetReprovideSet method to replace the cids that must be reprovided.
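
A minimal sketch of the idea behind the initial prefix length estimate, not this PR's estimator: a region should hold roughly one replication set of k peers, so given a network size estimate N (itself obtained from a few GCP requests), the prefix length is about log2(N / k). The helper name and parameters below are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// initialPrefixLen is a hypothetical helper: it returns the largest prefix
// length l such that a region of that prefix still holds at least k peers,
// i.e. networkSize / 2^l >= k. networkSize would itself be estimated from a
// few GetClosestPeers requests.
func initialPrefixLen(networkSize, k int) int {
	if networkSize <= k {
		return 0
	}
	return int(math.Floor(math.Log2(float64(networkSize) / float64(k))))
}

func main() {
	// ~10,000 Amino DHT servers with k=20 -> prefix length 8,
	// i.e. 256 regions of roughly 39 peers each.
	fmt.Println(initialPrefixLen(10000, 20))
}
```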

Missing features

  • Store keys to reprovide in Datastore instead of memory.
    • Currently a trie.Trie in memory containing all cids to be reprovided
    • Ideally move the trie to Datastore
    • Keys can be grouped by region/prefix if it helps
      • Anyway they will be loaded by region
      • Not sure if adding just 1 key to a group is easy
  • (optional) Persist when a region is reprovided to the datastore (region prefix, timestamp, e.g. prefix -> timestamp); see the sketch after this list.
    • Allows resuming reprovides after a crash/shutdown, starting by catching up on the regions that should have been provided during the downtime.
      • For this it may be useful to save the last reprovided region (e.g. lastProvided -> [prefix, timestamp])
    • Only store the last time a region was reprovided: every time the region is reprovided we can overwrite the older timestamp
    • Storing a timestamp for each individual provide would help kubo users know the last time a cid was provided (e.g. cid -> timestamp). These can expire or be garbage collected after reprovideInterval.
  • (optional) Persist provide and reprovide queues to datastore
    • Don't lose pending cids on restart
  • Refactor pending cids queue
    • Mix failed cids with cids that were just added using Provide()
    • Allows grouping close cids together to provide more efficiently
    • We may lose prioritization (e.g. calling Provide(cidA) before Provide(cidB) doesn't mean that cidA will be provided before cidB)
  • The Dual DHT (used by Kubo) currently has 1 SweepingReprovider for each DHT (LAN and WAN)
    • Allow the SweepingReprovider to (re)provide content to multiple DHT swarms with a single scheduler and cids store (trie)
    • It means that pending regions/cids have to be kept distinct for each swarm, since a provide could succeed in one swarm but fail in another
    • It will probably require multiple ConnectivityCheckers, one for each DHT swarm.
    • Sharing a single scheduler isn't useful since the schedule depends on the network size; hence each network should have its own schedule.
    • The only thing that can be shared between the 2 ReprovideSweepers is the set of cids that need to be reprovided (datastore).
  • If we decide to change the routing/provide interfaces in kubo, get rid of the boxo/provider.System implementation in go-libp2p-kad-dht/dual/reprovider.go
  • (optional) Provide status provider: ProvideStatus interface #1110
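
A minimal sketch, assuming the ipfs go-datastore interface, of the prefix -> timestamp persistence suggested above; the key layout and encoding are hypothetical:

```go
package sketch

import (
	"context"
	"encoding/binary"
	"time"

	ds "github.com/ipfs/go-datastore"
)

// recordRegionReprovide stores the last reprovide time of a region under a
// key derived from the region prefix, overwriting any previous timestamp.
// The "/reprovider/regions/" key layout is a hypothetical example.
func recordRegionReprovide(ctx context.Context, store ds.Datastore, prefix string, t time.Time) error {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, uint64(t.Unix()))
	return store.Put(ctx, ds.NewKey("/reprovider/regions/"+prefix), buf)
}
```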

TODO

  • Complete implementation with missing mandatory features
  • Implementation review: Reprovide Sweep review #1095
  • (optional) Increase unit & integration test coverage
  • (optional) Increase amino DHT test coverage
  • Benchmark performance vs default and accelerated DHT clients
  • (optional) High level documentation in go-libp2p-kad-dht/reprovider/README.md
  • Integration in kubo: feat: DHT Reprovide Sweep (ipfs/kubo#10834)
    • This one is going to be long and painful 😢

Admin

Depends on:

Need new release of:

Closes #824

Part of ipshipyard/roadmaps#6, ipshipyard/roadmaps#7, ipshipyard/roadmaps#8
