Description
`galah` already implements rate-limiting (using `purrr::rate_backoff()`) for some queries, notably repeated calls to check download status in `collect.data_request()` when `wait = TRUE`. `atlas_counts()` also limits the number of facets, which implicitly limits the number of queries. What neither of these does is limit the speed of repeated queries in a loop.
The best option for solving this problem is to support large or complex downloads in a single call, and to document how to achieve that, so that users don't need to write loops at all. But it would also be useful to build in rate-limiting that can detect and slow repeated calls made using `galah` (e.g. via `for` loops, `purrr::map()` or `lapply()`).
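In the meantime, users can rate-limit their own loops with tools `purrr` already provides. A minimal sketch (the taxon names are arbitrary examples, and the exact `galah` pipeline shown is illustrative, not prescriptive):

```r
library(purrr)
library(galah)

# Wrap the query in purrr::slowly() so repeated calls pause between
# requests. rate_delay(1) inserts a fixed one-second pause; swapping in
# rate_backoff() would instead grow the pause after failures.
count_slowly <- slowly(
  function(taxon) {
    galah_call() |>
      galah_identify(taxon) |>
      atlas_counts()
  },
  rate = rate_delay(pause = 1)
)

# Each call now waits at least one second after the previous one:
# results <- map(c("Litoria", "Crinia"), count_slowly)
```

This keeps the loop itself unchanged; only the wrapped function enforces the delay.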
One solution would be to add some limited query caching to `galah_config()`. If this recorded the timestamp and calling function for the last ~10 queries, we could add a delay once the rate exceeds a threshold, say >1 query per second. Exactly what constitutes "too high", and whether it varies between e.g. downloads and counts, is ambiguous at this stage. The mechanism would also need to be adaptive, in the same way as `rate_backoff()`, so that attempts to get around it by e.g. parallelisation would be detected and slowed.
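The timestamp-log idea could look something like the following sketch. All names here (`.query_log`, `throttle_query()`, the `max_rate` and `window` parameters) are hypothetical, invented for illustration; nothing like this currently exists in `galah`:

```r
# Package-internal environment holding timestamps of recent queries.
.query_log <- new.env(parent = emptyenv())
.query_log$times <- numeric(0)

# Called at the start of each query: records the current time, keeps the
# last `window` timestamps, and sleeps if the observed rate over that
# window exceeds `max_rate` queries per second.
throttle_query <- function(max_rate = 1, window = 10) {
  now <- as.numeric(Sys.time())
  times <- tail(c(.query_log$times, now), window)
  .query_log$times <- times
  if (length(times) >= 2) {
    elapsed <- now - times[1]
    observed_rate <- (length(times) - 1) / elapsed
    if (is.finite(observed_rate) && observed_rate > max_rate) {
      # Sleep just long enough that the window's rate falls to max_rate.
      Sys.sleep((length(times) - 1) / max_rate - elapsed)
    }
  }
  invisible(NULL)
}
```

Because the delay is computed from observed timestamps rather than a fixed pause, parallel workers hitting the same log would collectively be slowed, which is the adaptive behaviour described above.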