Description
Is your feature request related to a problem? Please describe.
From #6629, the `linkcheck` command currently uses threads to check the status of links in the documentation concurrently. Threads are not the most efficient way to do this: once all threads are busy waiting, the work queue stops being consumed.
Describe the solution you'd like
Using an event loop allows the URL verifier to yield control to another coroutine until it gets a response. That means a single thread can send multiple requests concurrently and process the responses as they arrive. It also makes rate limiting easier to handle, because a coroutine can be scheduled to run at a later time.
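A minimal sketch of the idea, using only `asyncio` from the standard library. The `fake_head` coroutine is a hypothetical stand-in for a real HTTP request (it is not part of Sphinx or any HTTP library), and the URLs and delays are made up for illustration. One link answers 429 once; its coroutine sleeps and retries while the others keep running on the same thread.

```python
import asyncio


async def fake_head(url, throttled):
    """Hypothetical stand-in for an HTTP HEAD request.

    The first request to a URL listed in `throttled` answers 429,
    every other request answers 200.
    """
    await asyncio.sleep(0.01)  # simulated network latency; yields to the loop
    if url in throttled:
        throttled.discard(url)
        return 429
    return 200


async def check_link(url, throttled, retry_after=0.02):
    status = await fake_head(url, throttled)
    if status == 429:
        # Rate limited: suspend only this coroutine, then retry once.
        await asyncio.sleep(retry_after)
        status = await fake_head(url, throttled)
    return url, status


async def check_all(urls, throttled):
    # All checks run concurrently on one thread; gather keeps input order.
    return await asyncio.gather(*(check_link(u, throttled) for u in urls))


def run_demo():
    urls = [f"https://example.org/page{i}" for i in range(5)]
    return asyncio.run(check_all(urls, {"https://example.org/page2"}))
```

Calling `run_demo()` returns one `(url, status)` pair per link. In a real implementation the retry delay would presumably come from the server's response rather than a fixed value.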
The first step toward using asynchronous concurrency in `linkchecker` is to replace uses of the `requests` library with an async-compatible HTTP library. The `aiohttp` library has an API fairly similar to that of `requests`, is well established, and is under active development, so it seems like a good choice.
Describe alternatives you've considered
Tried handling rate limits with a PriorityQueue as described in #6629 (comment).
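For reference, a rough sketch of what a priority-queue approach can look like. This is an illustration under assumptions, not the code from the linked comment: the `CheckRequest` tuple, the `drain` function, and the `check` callback (which returns `None` on success or a retry delay in seconds when rate limited) are all hypothetical names. It uses a single worker to stay short.

```python
import queue
import time
from typing import NamedTuple


class CheckRequest(NamedTuple):
    """Hypothetical work item, ordered by the earliest allowed check time."""
    next_check: float
    url: str


def drain(work, check, now=time.time, sleep=time.sleep):
    """Process every request in `work`, re-queuing rate-limited URLs.

    `check(url)` is assumed to return None on success, or the number of
    seconds to wait before retrying when the server rate-limits us.
    """
    checked = []
    while not work.empty():
        request = work.get()
        wait = request.next_check - now()
        if wait > 0:
            # The worker blocks here; with threads, this is the idle
            # waiting that an event loop would avoid.
            sleep(wait)
        retry_after = check(request.url)
        if retry_after is None:
            checked.append(request.url)
        else:
            work.put(CheckRequest(now() + retry_after, request.url))
    return checked
```

Because the queue is ordered by `next_check`, a rate-limited URL moves behind everything that can be checked immediately, but a worker that reaches it early still has to sleep.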
TODO
- Increase test coverage of the existing code
- Use a wrapper that makes async code synchronous for sync use cases (get event loop, queue the request, run event loop until complete).
- Adapt existing code that expects a `requests.Response` to use an `aiohttp.ClientResponse`. Both look pretty similar.
- Make a compatibility wrapper for arguments where a requests object was expected and `aiohttp` expects a different input. Consider `REQUESTS_CA_BUNDLE`, `tls_cacerts`, `auth_info` for the `linkcheck_auth` setting.
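The synchronous wrapper item in the list above could look roughly like the following sketch. The names `check_link_async` and `check_link_sync` are hypothetical, and the coroutine body is a placeholder rather than a real HTTP call.

```python
import asyncio


async def check_link_async(url):
    """Placeholder coroutine; a real one would await an HTTP request."""
    await asyncio.sleep(0)
    return url, 200


def check_link_sync(url):
    """Run the async check to completion for callers that are not async.

    Creates a private event loop, runs the coroutine until it completes,
    and closes the loop. Must not be called from inside a running loop.
    """
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(check_link_async(url))
    finally:
        loop.close()
```

For one-off calls, `asyncio.run(check_link_async(url))` does much the same thing; keeping a loop around would matter mainly if many synchronous calls need to share state such as a connection pool.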