
Improve sameDomainDelaySecs implementation #3148

@janbuchar

Description

  • The current implementation (added in feat: add support for sameDomainDelay #2003) essentially holds requests to recently accessed domains in memory and re-enqueues them after the configured delay
  • ⚠️ this may result in many unnecessary request queue writes
  • ⚠️ there may be unexpected results when this is used in conjunction with request locking
    • but resolving this for multiple request queue consumers would be quite an endeavor
  • we cannot simply reuse maxRequestsPerMinute - we don't know a request's URL (and thus its domain) before receiving it from the queue
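To make the trade-off concrete, here is a minimal, self-contained sketch of the approach described above: an in-memory map of domain → earliest allowed next access. The class and method names (`SameDomainDelayTracker`, `remainingDelayMs`) are illustrative only and are not Crawlee's actual internals; the point is that whenever a request arrives "too soon", the caller has to hold it back and re-enqueue it, which is where the extra queue writes come from.

```typescript
type Request = { url: string };

class SameDomainDelayTracker {
    // domain -> timestamp (ms) at which the domain may be accessed again
    private nextAllowedAt = new Map<string, number>();

    constructor(private delaySecs: number) {}

    /**
     * Returns 0 if the request may run now (and records the next allowed
     * access time for its domain), otherwise the remaining delay in ms.
     */
    remainingDelayMs(request: Request, now = Date.now()): number {
        const domain = new URL(request.url).hostname;
        const allowedAt = this.nextAllowedAt.get(domain) ?? 0;
        if (now >= allowedAt) {
            // Domain is free: reserve it for the configured delay and proceed.
            this.nextAllowedAt.set(domain, now + this.delaySecs * 1000);
            return 0;
        }
        // Too soon: the caller must delay and re-enqueue the request,
        // which in the current implementation costs a request queue write.
        return allowedAt - now;
    }
}
```

Note that because the tracker is purely in-memory, a second consumer of the same request queue has no visibility into these reservations, which is the source of the interaction problem with request locking mentioned above.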

Regardless of the resolution, the "final" version should be ported over to crawlee-python.

Labels: t-tooling