ResultIterator.last_evaluated_key appears to be populated only after iteration / is None before iteration #1266

Open
jaeyoung0509 opened this issue Apr 22, 2025 · 2 comments


jaeyoung0509 commented Apr 22, 2025

Environment

  • PynamoDB: 6.0.2
  • boto3: 1.34.138
  • botocore: 1.34.144
  • Python: 3.9

Description

  • I encountered unexpected behavior regarding the last_evaluated_key attribute of the ResultIterator returned by Model.scan() when implementing manual pagination.
  • It seems that the last_evaluated_key attribute is populated with the value from the DynamoDB response only after the iterator has been consumed (e.g. by calling list() on it or iterating through it in a for loop).
  • Accessing results.last_evaluated_key immediately after the scan() call returns, but before consuming the results iterator, consistently yielded None in my tests, even when the underlying DynamoDB response should have contained a non-null LastEvaluatedKey (indicating more pages are available). This prevents manual pagination logic that checks the key before processing the items.

Example

  1. Code order where pagination fails:
results = MyModel.scan(filter_condition=filter_condition, limit=limit)
last_evaluated_key = results.last_evaluated_key  # read before the iterator is consumed
all_items.extend(list(results))
while last_evaluated_key:  # <-- evaluates to None here, so the loop never runs
    scan_result = MyModel.scan(
        last_evaluated_key=last_evaluated_key,
        filter_condition=filter_condition,
        limit=limit,
    )
    last_evaluated_key = scan_result.last_evaluated_key
    all_items.extend(list(scan_result))
  2. Code order where pagination works:
results = MyModel.scan(filter_condition=filter_condition, limit=limit)
all_items.extend(list(results))  # consume the iterator first
last_evaluated_key = results.last_evaluated_key
while last_evaluated_key:
    scan_result = MyModel.scan(
        last_evaluated_key=last_evaluated_key,
        filter_condition=filter_condition,
        limit=limit,
    )
    all_items.extend(list(scan_result))
    last_evaluated_key = scan_result.last_evaluated_key
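
The working order can be wrapped in a small generator so the "consume first, then read the key" rule lives in one place. This is only a sketch: `iter_all_pages` is a hypothetical helper name, not part of PynamoDB, and it assumes the callable passed as `scan` accepts a `last_evaluated_key` keyword and returns an iterable exposing a `last_evaluated_key` attribute, as `Model.scan()` does in the snippets above.

```python
from typing import Any, Callable, Iterator


def iter_all_pages(scan: Callable[..., Any], **scan_kwargs: Any) -> Iterator[Any]:
    """Yield items from repeated scan calls until no continuation key remains.

    Hypothetical helper: `scan` is expected to behave like `MyModel.scan`.
    """
    last_evaluated_key = None
    while True:
        results = scan(last_evaluated_key=last_evaluated_key, **scan_kwargs)
        # Consume the whole result first; the key is only reliable afterwards.
        yield from results
        last_evaluated_key = results.last_evaluated_key
        if not last_evaluated_key:
            break
```

Under those assumptions, usage would look like `all_items = list(iter_all_pages(MyModel.scan, filter_condition=filter_condition, limit=limit))`.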
Contributor

ikonst commented Apr 23, 2025

I think it's how it was planned to work, according to this comment:

# Not started iterating yet: return `exclusive_start_key` if set, otherwise expect None; or,
# Entire page has been consumed: last_evaluated_key is whatever DynamoDB returned
# It may correspond to the current item, or it may correspond to an item evaluated but not returned.
return self.page_iter.last_evaluated_key

This could've been better documented, but I also think it's hard to believe someone should rely on scan not making the API call until you start iterating, so perhaps we can change this behavior.
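
To make the lazy behavior concrete, here is a toy stand-in in plain Python. `LazyPages` is a hypothetical class, not PynamoDB code. It only mimics the idea from the comment above: nothing is fetched until iteration starts, so the key visible beforehand is just the start key that was passed in (or None). It uses a list index as the "key" purely for illustration.

```python
from typing import Iterator, List, Optional


class LazyPages:
    """Toy model of a lazily evaluated result iterator (hypothetical, not PynamoDB)."""

    def __init__(self, items: List[int], page_size: int,
                 exclusive_start_key: Optional[int] = None) -> None:
        self._items = items
        self._page_size = page_size
        # Not started iterating yet: expose the start key if set, otherwise None.
        self.last_evaluated_key: Optional[int] = exclusive_start_key
        self.requests_made = 0

    def __iter__(self) -> Iterator[int]:
        start = self.last_evaluated_key or 0
        # The simulated "API call" happens only once iteration begins.
        self.requests_made += 1
        end = min(start + self._page_size, len(self._items))
        # A key is reported only when more items remain after this page.
        self.last_evaluated_key = end if end < len(self._items) else None
        yield from self._items[start:end]


results = LazyPages(list(range(5)), page_size=2)
key_before = results.last_evaluated_key   # no request has been made yet
page = list(results)                      # iteration triggers the request
key_after = results.last_evaluated_key    # now reflects the fetched page
```

In this toy model `key_before` is None and `key_after` is set, which matches the ordering difference between the failing and working snippets in the report.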

@jaeyoung0509
Author

@ikonst
Thanks for the quick response and clarification! I understand the reasoning behind the lazy loading now, based on the comment you linked.
I agree with your assessment that it's "hard to believe someone should rely on scan not making the API call until you start iterating".
