Context
Real-world testing makes it increasingly clear that someguy could benefit from various optimisations, e.g. internal caching to improve performance.
Experience running Someguy as a public good has given us insight into some known problems, trade-offs, and potential improvements.
The primary use case for someguy is short-lived, resource-constrained IPFS clients, e.g. browsers.
This issue aims to capture what we know and the different aspects of performance, and to link to the relevant sub-issues.
Metrics
In order to improve any aspect, we need to have better observability so we can confirm theories about why requests are slow.
- feat: add tracing, request timeouts and move instrumentation to boxo #87: Adds tracing so we can see where time is spent handling requests
- feat(routing/http/server): expose prometheus metrics boxo#718: Improves the metrics to be endpoint specific so we have more granularity.
nd-json vs. json
Someguy supports both json (default) and nd-json content types. For almost all use cases, nd-json performs better because results, e.g. providers, are streamed to the user as they are discovered through DHT traversal. This is also the default in the helia delegated routing client.
The only disadvantage of nd-json is that results cannot be cached by the Cloudflare CDN.
JSON and internal timeouts
The main problem observed with the default JSON responses is that someguy has timeouts neither for incoming HTTP requests nor for DHT/IPNI routing requests. As a result, clients hang until a timeout fires in one of the intermediate layers (nginx or Cloudflare for the public endpoint).
It's probably sensible to add an upper bound on routing requests:
- Fewer hanging in-flight requests (which are an attack vector)
- Faster JSON responses
- Waiting longer when finding providers has diminishing returns anyway: you may find more, but at the risk of the client giving up before it gets a response.
HTTP Caching
Since someguy exposes an HTTP API, HTTP caching provides a robust and composable mechanism to improve query performance. Someguy sets the cache-control headers as defined in the spec, which are respected by nginx and Cloudflare.
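As a sketch of how a handler can make its responses cacheable by intermediaries, the example below sets a `Cache-Control` header on a JSON response. The TTL values, the shorter TTL for empty results, the `/providers?cid=` route, and the `cacheControl` and `providersHandler` names are assumptions for illustration only. They are not taken from someguy or from the spec text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Illustrative TTLs in seconds (assumed values, not the spec's).
const (
	maxAgeWithResults = 300
	maxAgeEmpty       = 15
)

// cacheControl builds the header value; empty results get a shorter TTL so
// that newly announced providers become visible sooner.
func cacheControl(hasResults bool) string {
	maxAge := maxAgeEmpty
	if hasResults {
		maxAge = maxAgeWithResults
	}
	return fmt.Sprintf("public, max-age=%d", maxAge)
}

// providersHandler is a hypothetical handler wrapping a lookup function.
func providersHandler(lookup func(cid string) []string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		providers := lookup(r.URL.Query().Get("cid"))
		w.Header().Set("Content-Type", "application/json")
		w.Header().Set("Cache-Control", cacheControl(len(providers) > 0))
		if err := json.NewEncoder(w).Encode(providers); err != nil {
			// Headers are already sent; nothing more to do than stop.
			return
		}
	}
}

func main() {
	h := providersHandler(func(cid string) []string { return []string{"peerA"} })

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/providers?cid=example", nil)
	h.ServeHTTP(rec, req)

	fmt.Println(rec.Header().Get("Cache-Control")) // prints "public, max-age=300"
}
```

Because the policy lives in a standard header, any cache in front of the service, such as nginx or a CDN, can apply it without knowing anything about routing.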
Active peer probing and internal caching
The peer store in go-libp2p caches peer records for 15 minutes.
Provider routing requests often return peers without any multiaddrs. This forces clients to make a separate peer routing request.
#53 explores the idea of adding active peer probing.