Enable async capabilities for HTTP requests #8391

@francoisfreitag

Is your feature request related to a problem? Please describe.
Following up on #6629: the linkcheck command currently uses threads to check the status of links in the documentation concurrently. Threads are not the most efficient way to do this: once every thread is blocked waiting for a response, the work queue stops being consumed.

Describe the solution you'd like

Using an event loop allows the URL verifier to yield control to another coroutine until it gets a response. That means a single thread can send multiple requests concurrently and process the responses as they arrive. It also makes rate limiting easier to handle, because a coroutine can be scheduled to run at a later time.
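To illustrate the idea, here is a minimal sketch using only the standard library's asyncio. It is not the linkcheck implementation: `fake_check` and `check_later` are hypothetical names, the URLs are placeholders, and `asyncio.sleep` stands in for the network wait of a real HTTP request.

```python
import asyncio
import threading


async def fake_check(url, delay, seen_threads):
    """Hypothetical stand-in for an HTTP request; sleeping simulates network wait."""
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(delay)  # yields control to the event loop while "waiting"
    return url, "working"


async def check_later(url, retry_after, seen_threads):
    """Rate-limited URL: wait before checking, without blocking other coroutines."""
    await asyncio.sleep(retry_after)
    return await fake_check(url, 0.05, seen_threads)


async def check_all():
    seen_threads = set()
    tasks = [
        asyncio.create_task(fake_check("https://example.org/slow", 0.3, seen_threads)),
        asyncio.create_task(fake_check("https://example.org/fast", 0.1, seen_threads)),
        asyncio.create_task(check_later("https://example.org/limited", 0.15, seen_threads)),
    ]
    finished = []
    # Results are handled in completion order, not submission order.
    for next_done in asyncio.as_completed(tasks):
        url, _status = await next_done
        finished.append(url)
    return finished, seen_threads


finished, seen_threads = asyncio.run(check_all())
print(finished)
print(len(seen_threads), "thread(s) used")
```

All three checks are in flight at once on a single thread, and the rate-limited URL is simply deferred instead of occupying a worker.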

The first step toward using asynchronous concurrency in the link checker is to replace uses of the requests library with an async-compatible HTTP library. The aiohttp library has an API quite similar to that of requests, is well established, and is under active development, so it seems like a good choice.

Describe alternatives you've considered

I tried handling rate limits with a PriorityQueue, as described in #6629 (comment).

TODO

  • Increase test coverage of the existing code
  • Use a wrapper that makes async code synchronous for sync use cases (get event loop, queue the request, run event loop until complete).
  • Adapt existing code that expects a requests.Response to use an aiohttp.ClientResponse. Both look pretty similar.
  • Make a compatibility wrapper for arguments where a requests object was expected and aiohttp expects a different input. Consider REQUESTS_CA_BUNDLE, tls_cacerts, auth_info for the linkcheck_auth setting.
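The synchronous wrapper from the second item could look roughly like the sketch below. It uses only the standard library; `run_sync`, `check_url_async`, and `check_url` are hypothetical names, and the coroutine body is a placeholder for what would eventually be an aiohttp request.

```python
import asyncio


def run_sync(coro):
    """Run a coroutine to completion from synchronous code.

    Creates a dedicated event loop, runs the coroutine until it completes,
    then closes the loop.
    """
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


async def check_url_async(url):
    """Hypothetical async check; a real one would await an HTTP request here."""
    await asyncio.sleep(0)
    return {"url": url, "status": 200}


def check_url(url):
    """Synchronous entry point for callers that are not async-aware."""
    return run_sync(check_url_async(url))


result = check_url("https://example.org/")
print(result)
```

One limitation of this approach: `run_until_complete` raises a RuntimeError when called from a thread that already has a running event loop, so the wrapper is only suitable for purely synchronous call sites.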
