Hello Laravel Community,
When looking to rerun a batch of failed jobs that I have stored using the DynamoDB Failed Job driver, I realized I could not rerun all jobs from a certain queue. Upon further investigation, this appears to be caused by querying DynamoDB like a relational DB: the code assumes that query() returns all records and does not account for DynamoDB's technical restrictions.
PROBLEM:
Within the all() method: DynamoDB returns at most 1 MB of results per ->query() call. Since the query selects 'ALL_ATTRIBUTES', which includes the payloads, each ->query() call ends up returning very few records. As there is no pagination and the results are ordered by uuid, this function receives 1 MB of effectively random records.
The ids() function takes the results of ->all() and then filters them. As a result, it only returns failed jobs from a queue that happen to be within the 1 MB of data returned by all().
We can see this in an example that searches for a certain set of jobs among 'all' failures on a certain queue: the query returns very few results. Checking DynamoDB directly, there are many more than 50 results, not the small number returned by the artisan commands.
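To illustrate the missing pagination, here is a minimal sketch of how a caller can follow `LastEvaluatedKey` across pages. The helper name `queryAllPages`, the `$maxItems` cap, and the fake query callable are hypothetical and only stand in for `$this->dynamo->query()`; this is not the driver's actual code.

```php
<?php

/**
 * Follow LastEvaluatedKey until the query is exhausted or $maxItems is reached.
 *
 * $query is any callable that accepts Query parameters and returns a result
 * with 'Items' and, when more pages exist, 'LastEvaluatedKey'.
 * (Hypothetical helper, not part of Laravel or the AWS SDK.)
 */
function queryAllPages(callable $query, array $params, ?int $maxItems = null): array
{
    $items = [];

    do {
        $result = $query($params);

        foreach ($result['Items'] ?? [] as $item) {
            $items[] = $item;

            // Optional safety cap so a huge table cannot exhaust memory.
            if ($maxItems !== null && count($items) >= $maxItems) {
                return $items;
            }
        }

        // DynamoDB omits LastEvaluatedKey once the last page has been read.
        $lastKey = $result['LastEvaluatedKey'] ?? null;

        if ($lastKey !== null) {
            $params['ExclusiveStartKey'] = $lastKey;
        }
    } while ($lastKey !== null);

    return $items;
}

// Fake data and a fake query that serves two items per call, imitating the
// way a real Query call stops at the 1 MB page boundary.
$rows = [];
foreach (['a', 'b', 'c', 'd', 'e'] as $id) {
    $rows[] = ['uuid' => ['S' => $id]];
}

$fakeQuery = function (array $params) use ($rows): array {
    $start = isset($params['ExclusiveStartKey'])
        ? (int) $params['ExclusiveStartKey']['offset']['N']
        : 0;

    $result = ['Items' => array_slice($rows, $start, 2)];

    if ($start + 2 < count($rows)) {
        $result['LastEvaluatedKey'] = ['offset' => ['N' => (string) ($start + 2)]];
    }

    return $result;
};

$firstPageOnly = $fakeQuery([])['Items'];           // what a single call sees
$everything = queryAllPages($fakeQuery, []);         // what pagination sees
$capped = queryAllPages($fakeQuery, [], 3);          // pagination with a limit

echo count($firstPageOnly), ' vs ', count($everything), PHP_EOL; // prints "2 vs 5"
```

With the real client, the callable would presumably be something like `fn ($p) => $this->dynamo->query($p)`; the cap is there because a full walk of a large table can be expensive.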
Proposition:
1. Create a custom method for ids() to use that queries DynamoDB properly. This should result in 1 MB of relevant results being returned.
2. Improve upon item 1 by requesting only the id attribute, so that a significantly larger number of results is returned.
3. It would be nice for all() to return the latest results first. With ordered UUIDs available, this could be achieved by switching from random UUIDs to ordered UUIDs.
4. It would be nice for ->all() to be consistent and paginate through the entire DynamoDB table so that it actually returns all records. This carries some risk for the end user, as those tables can be large, so a limit may be a good idea here.
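As a rough sketch of the first two points, the Query parameters for a queue-aware ids() might look like the following. The function name `failedJobIdQuery` is hypothetical, and the stored attribute names `queue` and `uuid` are assumptions about the table layout. One caveat, as far as I understand DynamoDB: a FilterExpression is applied after a page has been read, and the 1 MB cap is measured on the data read, so a page can still contain few or no matches and the caller would still need to follow `LastEvaluatedKey`.

```php
<?php

/**
 * Build Query parameters that return only job IDs, optionally for one queue.
 * Hypothetical helper; 'queue' and 'uuid' are assumed attribute names.
 */
function failedJobIdQuery(string $table, string $application, ?string $queue = null): array
{
    $params = [
        'TableName' => $table,
        'KeyConditionExpression' => 'application = :application',
        // Ask for the ID attribute only, instead of ALL_ATTRIBUTES.
        'ProjectionExpression' => '#uuid',
        // Name placeholders avoid collisions with DynamoDB reserved words.
        'ExpressionAttributeNames' => ['#uuid' => 'uuid'],
        'ExpressionAttributeValues' => [
            ':application' => ['S' => $application],
        ],
        'ScanIndexForward' => false,
    ];

    if ($queue !== null) {
        // Let DynamoDB drop non-matching jobs instead of filtering in PHP.
        $params['FilterExpression'] = '#queue = :queue';
        $params['ExpressionAttributeNames']['#queue'] = 'queue';
        $params['ExpressionAttributeValues'][':queue'] = ['S' => $queue];
    }

    return $params;
}

// Example values only; the table and application names are made up.
$allQueues = failedJobIdQuery('failed_jobs', 'my-app');
$oneQueue = failedJobIdQuery('failed_jobs', 'my-app', 'emails');

print_r($oneQueue);
```

Avoiding the read-side limit entirely would probably need a different key design (for example an index keyed on the queue), which is beyond this sketch.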
I would be happy to implement these updates, but have not contributed before and would like some guidance.
Hope to hear from the community soon.
    /**
     * Get the IDs of all of the failed jobs.
     *
     * @param  string|null  $queue
     * @return array
     */
    public function ids($queue = null)
    {
        return (new Collection($this->all()))
            ->when(! is_null($queue), fn ($collect) => $collect->where('queue', $queue))
            ->pluck('id')
            ->all();
    }

    /**
     * Get a list of all of the failed jobs.
     *
     * @return array
     */
    public function all()
    {
        $results = $this->dynamo->query([
            'TableName' => $this->table,
            'Select' => 'ALL_ATTRIBUTES',
            'KeyConditionExpression' => 'application = :application',
            'ExpressionAttributeValues' => [
                ':application' => ['S' => $this->applicationName],
            ],
            'ScanIndexForward' => false,
        ]);